- DataSet's __init__ takes a function as an argument, rather than a class object (see the sketch after this list)
- Preprocessor is about to be removed; don't use it anymore.
- Remove cross_validate from Trainer, because it is rarely used and weird
- Loader.load is expected to be a static method (a sketch follows this section)
- Delete some code in other_modules.py
- Add more tests
- Delete extra sample data
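A minimal sketch of the first point above; the class body and the `build_fn` name are hypothetical, not the actual fastNLP API:

```python
class DataSet(list):
    """Hypothetical sketch: __init__ receives a function, not a class object."""
    def __init__(self, examples=(), build_fn=None):
        # build_fn turns one raw example into an instance; previously a
        # class object was passed in and instantiated internally.
        build_fn = build_fn or (lambda e: e)
        super().__init__(build_fn(e) for e in examples)

# the caller supplies the conversion function directly
ds = DataSet([("I love NLP", "pos")],
             build_fn=lambda pair: {"text": pair[0], "label": pair[1]})
```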
2. name changes:
aggregation ----> aggregator
interaction ----> interactor
action.py ----> sampler.py
BasePreprocess ----> Preprocessor
BaseTester ----> Tester
BaseTrainer ----> Trainer
3. add more code comments
4. fix bugs in predictor's data_forward
5. in sampler.py, remove Batchifier and fix some code, but it is not yet tested
6. remove unused code in other_modules.py & utils.py
7. update fastnlp.py with new config file names and code comments
8. add data examples in data_for_tests/
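As a sketch of the Loader.load convention noted above (the file format and method body are made up for illustration):

```python
class Loader:
    """Hypothetical sketch: load is a @staticmethod, so no instance is needed."""
    @staticmethod
    def load(path):
        with open(path, encoding="utf-8") as f:
            return f.read().splitlines()

# callable directly on the class, e.g. Loader.load("data_for_tests/sample.txt")
```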
- apply DataSet in Predictor; remove sub-predictors; add a "task" argument to specify which task to predict, as Trainer/Tester do.
- remove Action class
- add helper function for DataSet, to create DataSet easily
- more code comments
- clean up unnecessary code
- add unit tests for Batch, Predictor, Preprocessor, Trainer, Tester
- update LabelField's to_tensor method to support int & str single labels (see the sketch after this list)
- update preprocessor's convert_to_dataset method to support single label inputs
- introduce "task" in Trainer/Tester's data_forward, Tester's evaluate and metrics methods
- in cnn_text_classification.py, rename forward's argument
- in sequence_modeling.py, rename forward's argument
- minor adjustments in test code
- text_classify.py works
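A hypothetical sketch of the LabelField.to_tensor behavior described above (the vocab lookup is an assumption):

```python
import torch

class LabelField:
    """Hypothetical sketch: to_tensor handles both int and str single labels."""
    def __init__(self, label, vocab=None):
        self.label = label
        self.vocab = vocab or {}          # assumed str-label -> int-index map

    def to_tensor(self):
        if isinstance(self.label, int):   # already an index
            idx = self.label
        elif isinstance(self.label, str): # look the string up in the vocab
            idx = self.vocab[self.label]
        else:
            raise TypeError("expected int or str label, got %r" % type(self.label))
        return torch.LongTensor([idx])

assert LabelField(3).to_tensor().item() == 3
assert LabelField("pos", {"pos": 1}).to_tensor().item() == 1
```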
- add DataSet, Instance, Field to represent data at different levels
- encapsulate the batching method in a Batch class (see the sketch after this list)
- modify samplers in action.py to fit Batch
- preprocessor.run returns a DataSet instead of a list
- Use Batch in Trainer/Tester
- add required_arg "task" in Trainer/Tester
- remove SeqLabelTrainer/SeqLabelTester dependencies successfully. They are now empty classes, kept only for deprecation.
- modify the SeqLabeling model: add another argument to forward, in order to compute the mask inside the model (see the sketch after this list)
- test/model/seq_labeling.py works
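A toy sketch of what encapsulating batching in a Batch class means (simplified: no padding or tensor conversion):

```python
class Batch:
    """Hypothetical sketch: iterate over a dataset in fixed-size batches."""
    def __init__(self, dataset, batch_size):
        self.dataset = dataset
        self.batch_size = batch_size

    def __iter__(self):
        for start in range(0, len(self.dataset), self.batch_size):
            yield self.dataset[start:start + self.batch_size]

for batch in Batch(list(range(10)), batch_size=4):
    print(batch)   # [0..3], [4..7], [8, 9]
```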
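And a sketch of computing the mask inside the model from a sequence-length argument (the helper name and shapes are assumptions):

```python
import torch

def make_mask(seq_len, max_len):
    """Hypothetical helper: 1 for real tokens, 0 for padding positions."""
    positions = torch.arange(max_len).unsqueeze(0)      # shape (1, max_len)
    return (positions < seq_len.unsqueeze(1)).float()   # broadcast to (batch, max_len)

print(make_mask(torch.LongTensor([3, 5]), max_len=5))
# tensor([[1., 1., 1., 0., 0.],
#         [1., 1., 1., 1., 1.]])
```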
[add] PeopleDailyCorpusLoader, to parse the People's Daily corpus (a parsing sketch follows below)
[update] add a CWS + POS_tag interface to FastNLP; see the example in test_fastNLP.py
[update] bring README.md and readme_example.py up to date with the latest version.
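A rough sketch of such a loader, assuming the common 'word/POS' per-token format of the People's Daily corpus (the actual parser may differ):

```python
class PeopleDailyCorpusLoader:
    """Hypothetical sketch: split each token into (word, POS tag)."""
    @staticmethod
    def load(path):
        examples = []
        with open(path, encoding="utf-8") as f:
            for line in f:
                # tokens look like '希望/n'; rsplit keeps '/' inside words intact
                pairs = [t.rsplit("/", 1) for t in line.split() if "/" in t]
                if pairs:
                    words, tags = zip(*pairs)
                    examples.append((list(words), list(tags)))
        return examples
```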
1. Tester has a parameter "print_every_step" to control printing; print_every_step == 0 means no printing (see the sketch after this list).
2. Tester's evaluate returns (a list of) floats, rather than torch.cuda tensors
3. Trainer also has a "print_every_step" parameter, with the same usage.
4. In training, validation steps are not shown.
5. Updates to code comments.
6. fastnlp.py is ready for CWS. test_fastNLP.py works.
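A sketch of the print_every_step behavior and the float-returning metrics, with invented names for illustration:

```python
import torch

def report(step, loss, print_every_step):
    # print_every_step == 0 means no printing at all
    if print_every_step > 0 and step % print_every_step == 0:
        print("step %d: loss=%.4f" % (step, loss))

for step in range(1, 31):
    report(step, loss=1.0 / step, print_every_step=10)

# evaluate now returns plain Python floats, not (cuda) tensors:
metrics = [float(torch.tensor(0.87))]   # a plain list of floats
```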
- specify the name of the config file and the name of the corresponding section where model init params are stored (see the sketch below)
- fastnlp.py needs load_pickle to get dictionary size and the number of labels
- other minor adjustments
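For illustration, using the stdlib configparser (the real config format and the section name here are assumptions):

```python
from configparser import ConfigParser

cfg = ConfigParser()
cfg.read_string("""
[text_class_model]
vocab_size = 10000
num_classes = 5
""")
# pick the section named for the model; its entries are the init params
params = dict(cfg["text_class_model"])   # {'vocab_size': '10000', 'num_classes': '5'}
```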
- add Loss, Optimizer
- change Trainer & Tester initialization interface: two styles of definition provided
- handle Optimizer construction and loss function definition in a hard-coded manner
- add argparse to the task-specific scripts (seq_labeling.py & text_classify.py); see the sketch after this list
- seq_labeling.py & text_classify.py work
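A sketch of the kind of argparse setup added to those scripts (the flag names are made up):

```python
import argparse

parser = argparse.ArgumentParser(description="sequence labeling demo")
parser.add_argument("--mode", choices=["train", "test", "infer"], required=True)
parser.add_argument("--config", default="data_for_tests/config",
                    help="path to the config file")
args = parser.parse_args()   # e.g. python seq_labeling.py --mode train
```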
- move preprocess.py from loader/ to core/
- changes to the interface of preprocess:
  1. add a run method, to run the main processing
  2. add a cross-validation split (see the sketch after this list)
  3. add a return value
  4. merge subclasses
- Trainer supports cross validation
- add data as arguments in Trainer.train & Tester.test
- add readme_example.py, to run the example program shown in README.md
- other corresponding changes
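A sketch of what the cross-validation split might look like (the fold logic is a generic illustration, not the library's code):

```python
def cv_split(data, n_folds):
    """Hypothetical k-fold split: yield (train, dev) pairs."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        dev = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, dev

for train, dev in cv_split(list(range(10)), n_folds=5):
    print(len(train), len(dev))   # 8 2, five times
```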
- see fastNLP/saver/logger.py to learn how to create and use a logger
- a log file named "train_test.log" will be created in the same directory as the main file where the program starts
- this file records all important events that happen in Trainer & Tester's methods
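For reference, a stdlib-logging sketch of the setup such a logger.py might contain (the handler details are assumptions):

```python
import logging

logger = logging.getLogger("fastNLP")
handler = logging.FileHandler("train_test.log")   # lands next to the entry script
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s: %(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("training started")   # the kind of event Trainer/Tester would record
```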
- rename Inference to Predictor
- rename Trainer.prepare_input to Trainer.load_train_data, which loads data_train.pkl only
- add a __contains__ method to the config Section class (see the sketch below)
- more code comments
- more elegant make_batch & data_iterator: samplers return batch samples instead of batch indices (see the sketch below)
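A sketch of the __contains__ addition (the Section internals are assumed):

```python
class Section:
    """Hypothetical sketch: a config Section that supports `in` checks."""
    def __init__(self, **entries):
        self.__dict__.update(entries)

    def __contains__(self, item):
        # allows `"key" in section` before accessing section.key
        return item in self.__dict__

sec = Section(vocab_size=100, num_classes=5)
assert "vocab_size" in sec and "lr" not in sec
```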
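And a sketch of a sampler that yields batches of samples rather than indices (the class name and shuffle policy are assumptions):

```python
import random

class RandomSampler:
    """Hypothetical sketch: yield lists of samples, not lists of indices."""
    def __call__(self, dataset, batch_size):
        order = list(range(len(dataset)))
        random.shuffle(order)
        for start in range(0, len(order), batch_size):
            yield [dataset[i] for i in order[start:start + batch_size]]

batches = list(RandomSampler()(list("abcdefgh"), batch_size=3))   # 3 batches: 3+3+2 samples
```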