Statistics for the data automatically downloaded by the following Chinese NER Pipes:
MsraNERPipe | # of sents | # of tokens |
---|---|---|
train | 41747 | 1954374 |
dev | 4617 | 215505 |
test | 4365 | 172601 |
total | 50729 | 2342480 |
The statistics reported here are consistent with those reported in https://arxiv.org/pdf/1805.02023.pdf |
WeiboNERPipe | # of sents | # of tokens |
---|---|---|
train | 1350 | 73778 |
dev | 270 | 14509 |
test | 270 | 14842 |
total | 1890 | 103129 |
The statistics reported here are consistent with https://www.cs.cmu.edu/~ark/EMNLP-2015/proceedings/EMNLP/pdf/EMNLP064.pdf |
PeopleDailyPipe | # of sents | # of tokens |
---|---|---|
train | 50658 | 2169879 |
dev | 4631 | 172601 |
test | 68 | 2270 |
total | 55357 | 2344750 |
The data used here is consistent with the data in https://arxiv.org/pdf/1906.08101.pdf |
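
The `total` rows can be cross-checked by summing the per-split counts. The following is a minimal sketch in plain Python that does so; the names `SPLITS` and `totals` are illustrative and are not part of fastNLP, and the numbers are copied from the train/dev/test rows of the tables above.

```python
# Per-split (sentences, tokens) counts copied from the tables above.
# SPLITS and totals() are illustrative names, not part of any library.
SPLITS = {
    "MsraNERPipe": {
        "train": (41747, 1954374),
        "dev": (4617, 215505),
        "test": (4365, 172601),
    },
    "WeiboNERPipe": {
        "train": (1350, 73778),
        "dev": (270, 14509),
        "test": (270, 14842),
    },
    "PeopleDailyPipe": {
        "train": (50658, 2169879),
        "dev": (4631, 172601),
        "test": (68, 2270),
    },
}


def totals(splits):
    """Sum sentence and token counts over all splits of one dataset."""
    n_sents = sum(sents for sents, _ in splits.values())
    n_tokens = sum(tokens for _, tokens in splits.values())
    return n_sents, n_tokens


if __name__ == "__main__":
    for name, splits in SPLITS.items():
        n_sents, n_tokens = totals(splits)
        print(f"{name}: {n_sents} sents, {n_tokens} tokens")
```

Comparing the printed sums against each `total` row is a quick way to catch a copy error in either the split rows or the totals.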
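A lightweight natural language processing (NLP) toolkit that aims to reduce the engineering code in user projects, such as data-processing loops, training loops, and multi-GPU execution.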