This is a PyTorch implementation of the paper "Convolutional Neural Networks for Sentence Classification" (Kim, 2014).
STEP 1
Install the required packages, such as gensim (any other missing packages can be installed the same way):
pip install gensim
STEP 2
Download the MR dataset and the word2vec resources.
The word2vec file is larger than 1.5 GB, so it is not included in this repository. After downloading it, remember to update the path in the function def word_embeddings(path = './GoogleNews-vectors-negative300.bin/'):
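As a rough illustration of the loading step, the sketch below checks that the word2vec file exists before handing it to gensim. The helper name load_word2vec and its default path are assumptions made for this example; they are not the repository's word_embeddings function, which may work differently.

```python
import os


def load_word2vec(path="./GoogleNews-vectors-negative300.bin"):
    """Hypothetical helper (not from this repository): load binary word2vec vectors."""
    # Fail early with a clear message if the large download is missing.
    if not os.path.isfile(path):
        raise FileNotFoundError(
            "word2vec file not found at %r; download it and update the path" % path
        )
    # Imported here so the path check works even before gensim is installed.
    from gensim.models import KeyedVectors

    # binary=True because the GoogleNews vectors are distributed in binary format.
    return KeyedVectors.load_word2vec_format(path, binary=True)
```

If the file is in place, load_word2vec() returns a gensim KeyedVectors object whose vectors can be looked up by word.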
STEP 3
Train the model:
python train.py
Training progress is printed to the screen, for example:
Epoch [1/20], Iter [100/192] Loss: 0.7008
Test Accuracy: 71.869159 %
Epoch [2/20], Iter [100/192] Loss: 0.5957
Test Accuracy: 75.700935 %
Epoch [3/20], Iter [100/192] Loss: 0.4934
Test Accuracy: 78.130841 %
......
Epoch [20/20], Iter [100/192] Loss: 0.0364
Test Accuracy: 81.495327 %
Best Accuracy: 82.616822 %
Best Model: models/cnn.pkl
Based on the paper and my own experiments, I used the following settings:
Epochs | Kernel Size | Dropout | Learning Rate | Batch Size |
---|---|---|---|---|
20 | (h,300,100) | 0.5 | 0.0001 | 50 |
h = [3,4,5]
If the accuracy does not improve, the learning rate is multiplied by 0.8.
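The decay rule can be sketched in plain Python as follows. This is only an illustration: the helper name update_learning_rate is hypothetical, and it assumes that "not improved" means "not better than the best accuracy so far", which the text above does not specify. The accuracy values in the loop are made up for the example.

```python
def update_learning_rate(lr, accuracy, best_accuracy, factor=0.8):
    """Hypothetical helper (not from train.py): return (new_lr, new_best_accuracy)."""
    if accuracy > best_accuracy:
        # Accuracy improved: keep the learning rate and record the new best.
        return lr, accuracy
    # No improvement: shrink the learning rate by the given factor.
    return lr * factor, best_accuracy


# Start from the learning rate in the settings table above.
lr, best = 0.0001, 0.0
for acc in [71.9, 75.7, 75.2, 78.1]:  # illustrative accuracies, not real results
    lr, best = update_learning_rate(lr, acc, best)
```

In a PyTorch training loop, the new value would typically be written back to each entry of optimizer.param_groups under the "lr" key.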
I have only tried one dataset: MR. (The paper uses six others: SST-1, SST-2, Subj, TREC, CR, and MPQA.)
The paper describes four models: CNN-rand, CNN-static, CNN-non-static, and CNN-multichannel.
I implemented CNN-non-static: a model with pre-trained vectors from word2vec.
All word vectors, including the randomly initialized ones for unknown words, are fine-tuned for each task.
(This model has nearly the best performance of the four and is the most difficult to implement.)
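To make the CNN-non-static idea concrete, here is a minimal PyTorch sketch under stated assumptions. The class name TextCNN and its arguments are hypothetical and are not taken from this repository's model.py. It reads the kernel size (h,300,100) as window height h, embedding width 300, and 100 filters per window height.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextCNN(nn.Module):
    """Hypothetical CNN-non-static sketch; not the repository's model.py."""

    def __init__(self, pretrained, num_classes=2, kernel_heights=(3, 4, 5),
                 num_filters=100, dropout=0.5):
        super().__init__()
        # freeze=False keeps the pre-trained vectors trainable ("non-static").
        self.embedding = nn.Embedding.from_pretrained(pretrained, freeze=False)
        embed_dim = pretrained.size(1)
        # One convolution per window height h, spanning the full embedding width.
        self.convs = nn.ModuleList(
            [nn.Conv2d(1, num_filters, (h, embed_dim)) for h in kernel_heights]
        )
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(num_filters * len(kernel_heights), num_classes)

    def forward(self, token_ids):
        # (batch, seq_len) -> (batch, 1, seq_len, embed_dim)
        x = self.embedding(token_ids).unsqueeze(1)
        # Each feature map has shape (batch, num_filters, seq_len - h + 1).
        feats = [F.relu(conv(x)).squeeze(3) for conv in self.convs]
        # Max-over-time pooling keeps one value per filter.
        pooled = [torch.max(f, dim=2)[0] for f in feats]
        # Concatenate, apply dropout, and project to class scores.
        return self.fc(self.dropout(torch.cat(pooled, dim=1)))
```

For example, TextCNN(torch.randn(50, 300)) applied to a batch of token ids of shape (4, 10) yields class scores of shape (4, 2). The random matrix here merely stands in for real word2vec weights.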
Dataset | Number of Classes | Best Result | Result in Kim's Paper |
---|---|---|---|
MR | 2 | 82.617% (CNN-non-static) | 81.5% (CNN-non-static) |