commit f4688482b4898c0b342d6ae59839dc27fbf856c6
tree a88a853a105a72397f3d6684a35c33a5da3536a8 /README.md
parent 8c7768e8d321efec558e12bff9b89b2de615d541
author Gustaf Rydholm <gustaf.rydholm@gmail.com> 2021-05-13 23:02:42 +0200
committer Gustaf Rydholm <gustaf.rydholm@gmail.com> 2021-05-13 23:02:42 +0200

Remove bloat packages
Diffstat (limited to 'README.md')
-rw-r--r-- | README.md | 10 |
1 files changed, 5 insertions, 5 deletions
@@ -11,24 +11,24 @@ TBC
 
 Extract text from the iam dataset:
 ```
-poetry run python extract-iam-text --use_words --save_text train.txt --save_tokens letters.txt
+python extract-iam-text --use_words --save_text train.txt --save_tokens letters.txt
 ```
 
 Create word pieces from the extracted training text:
 ```
-poetry run python make-wordpieces --output_prefix iamdb_1kwp --text_file train.txt --num_pieces 100
+python make-wordpieces --output_prefix iamdb_1kwp --text_file train.txt --num_pieces 100
 ```
 
 Optionally, build a transition graph for word pieces:
 ```
-poetry run python build-transitions --tokens iamdb_1kwp_tokens_1000.txt --lexicon iamdb_1kwp_lex_1000.txt --blank optional --self_loops --save_path 1kwp_prune_0_10_optblank.bin --prune 0 10
+python build-transitions --tokens iamdb_1kwp_tokens_1000.txt --lexicon iamdb_1kwp_lex_1000.txt --blank optional --self_loops --save_path 1kwp_prune_0_10_optblank.bin --prune 0 10
 ```
 (TODO: Not working atm, needed for GTN loss function)
 
 ## Todo
 - [ ] Reimplement transformer from scratch
-- [ ] Implement Nyström attention (for efficient attention)
-- [ ] Dino
+- [x] Implement Nyström attention (for efficient attention)
+- [ ] Implement Dino
 - [ ] Efficient-net b0 + transformer decoder
 - [ ] Test encoder pre-training ViT (CvT?) with Dino, then train decoder in a separate step
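
The three command changes in this hunk all do the same thing: they drop the leading `poetry run` from a README command and leave the rest untouched. The rewrite can be sketched as below. The helper name `strip_poetry_prefix` is hypothetical and not part of the repository, and the sketch only handles simple command strings.

```python
import shlex


def strip_poetry_prefix(command: str) -> str:
    """Hypothetical helper (not from the repository).

    Drops a leading "poetry run" from a command line and returns
    any other command unchanged.
    """
    argv = shlex.split(command)
    if argv[:2] == ["poetry", "run"]:
        argv = argv[2:]
    # shlex.join (Python 3.8+) re-quotes arguments only where needed.
    return shlex.join(argv)


old = (
    "poetry run python extract-iam-text --use_words "
    "--save_text train.txt --save_tokens letters.txt"
)
new = strip_poetry_prefix(old)
print(new)
```

Running the rewritten commands this way presumably requires the project's dependencies to be installed in the active Python environment, since Poetry no longer selects the environment for you.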