diff --git a/tutorials/fastnlp_tutorial_0.ipynb b/tutorials/fastnlp_tutorial_0.ipynb index 26675ecf..4368652a 100644 --- a/tutorials/fastnlp_tutorial_0.ipynb +++ b/tutorials/fastnlp_tutorial_0.ipynb @@ -86,9 +86,11 @@ "\n", "  具体`driver`与`Trainer`以及`Evaluator`之间的关系请参考`fastNLP 0.8`的框架设计\n", "\n", - "注:在同一脚本中,`Trainer`和`Evaluator`使用的`driver`应当保持一致\n", + "注:建议**在同一脚本中**,**所有的`Trainer`和`Evaluator`使用的`driver`保持一致**\n", "\n", - "  一个不能违背的原则在于:**不要将多卡的`driver`前使用单卡的`driver`**(???),这样使用可能会带来很多意想不到的错误" + "  尽量避免先使用单卡的`driver`、之后又使用多卡的`driver`:因为当脚本执行至\n", + "\n", + "  多卡`driver`处时,会重启一个进程重新执行之前的所有内容,可能造成意想不到的麻烦" ] }, { @@ -167,7 +169,7 @@ "\n", "注:在`fastNLP 0.8`中,**`Trainer`要求模型通过`train_step`来返回一个字典**,**满足如`{\"loss\": loss}`的形式**\n", "\n", - "  此外,这里也可以通过传入`Trainer`的参数`output_mapping`来实现高度化的定制,具体请见这一note(???)\n", + "  此外,这里也可以通过传入`Trainer`的参数`output_mapping`来实现输出的转换,详见(trainer的详细讲解,待补充)\n", "\n", "同样,在`fastNLP 0.8`中,**函数`evaluate_step`是`Evaluator`中参数`evaluate_fn`的默认值**\n", "\n", @@ -177,7 +179,7 @@ "\n", "  从模块角度,该字典的键值和`metric`中的`update`函数的签名一致,这样的机制在传参时被称为“**参数匹配**”\n", "\n", - "" + "" ] }, { @@ -216,8 +218,14 @@ "\n", " def __getitem__(self, item):\n", " return {\"x\": self.x[item], \"y\": self.y[item]}\n", - "```\n", - "***\n", + "```" ] }, { "cell_type": "markdown", "id": "f5f1a6aa", "metadata": {}, "source": [ "对于后者,首先要明确,在`Trainer`和`Evaluator`中,`metrics`的计算分为`update`和`get_metric`两步\n", "\n", "    **`update`函数**,**针对一个`batch`的预测结果**,计算其累计的评价指标\n", "\n", @@ -230,7 +238,9 @@ "\n", "  在此基础上,**`fastNLP 0.8`要求`evaluate_dataloader`生成的每个`batch`传递给对应的`metric`**\n", "\n", - "    **以`{\"pred\": y_pred, \"target\": y_true}`的形式**,对应其`update`函数的函数签名" + "    **以`{\"pred\": y_pred, \"target\": y_true}`的形式**,对应其`update`函数的函数签名\n", + "\n", + "" ] }, { @@ -639,11 +649,11 @@ { "data": { "text/html": [ - "
{'acc#acc': 0.29}\n",
+       "
{'acc#acc': 0.39}\n",
        "
\n" ], "text/plain": [ - "\u001b[1m{\u001b[0m\u001b[32m'acc#acc'\u001b[0m: \u001b[1;36m0.29\u001b[0m\u001b[1m}\u001b[0m\n" + "\u001b[1m{\u001b[0m\u001b[32m'acc#acc'\u001b[0m: \u001b[1;36m0.39\u001b[0m\u001b[1m}\u001b[0m\n" ] }, "metadata": {}, @@ -652,7 +662,7 @@ { "data": { "text/plain": [ - "{'acc#acc': 0.29}" + "{'acc#acc': 0.39}" ] }, "execution_count": 9, @@ -710,7 +720,9 @@ "source": [ "通过使用`Trainer`类的`run`函数,进行训练\n", "\n", - "  还可以通过参数`num_eval_sanity_batch`决定每次训练前运行多少个`evaluate_batch`进行评测,默认为2" + "  还可以通过参数`num_eval_sanity_batch`决定每次训练前运行多少个`evaluate_batch`进行评测,默认为2\n", + "\n", + "  之所以“先评测后训练”,是为了保证训练很长时间的数据,不会在评测阶段出问题,故作此试探性评测" ] }, { @@ -773,6 +785,14 @@ "source": [ "trainer.run()" ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c4e9c619", + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { diff --git a/tutorials/fastnlp_tutorial_1.ipynb b/tutorials/fastnlp_tutorial_1.ipynb index 93e7a664..c378b54a 100644 --- a/tutorials/fastnlp_tutorial_1.ipynb +++ b/tutorials/fastnlp_tutorial_1.ipynb @@ -153,7 +153,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "2438703969992 2438374526920\n", + "1608199516936 1607874531400\n", "+-----+------------------------+------------------------+-----+\n", "| idx | sentence | words | num |\n", "+-----+------------------------+------------------------+-----+\n", @@ -183,7 +183,7 @@ "id": "aa277674", "metadata": {}, "source": [ - "  注二:在`fastNLP 0.8`中,**对`dataset`使用等号**,**其效果是传引用**,**而不是赋值**(???)\n", + "  注二:**对对象使用等号一般表示传引用**,所以对`dataset`使用等号,是传引用而不是赋值\n", "\n", "    如下所示,**`dropped`和`dataset`具有相同`id`**,**对`dropped`执行删除操作`dataset`同时会被修改**" ] @@ -198,7 +198,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "2438374526920 2438374526920\n", + "1607874531400 1607874531400\n", "+-----+------------------------+------------------------+-----+\n", "| idx | sentence | words | num |\n", "+-----+------------------------+------------------------+-----+\n", @@ -296,9 +296,9 @@ "\n", 
"在`dataset`模块中,`apply`、`apply_field`、`apply_more`和`apply_field_more`函数可以进行简单的数据预处理\n", "\n", - "  **`apply`和`apply_more`针对整条实例**,**`apply_field`和`apply_field_more`仅针对实例的部分字段**\n", + "  **`apply`和`apply_more`输入整条实例**,**`apply_field`和`apply_field_more`仅输入实例的部分字段**\n", "\n", - "  **`apply`和`apply_field`仅针对单个字段**,**`apply_more`和`apply_field_more`则可以针对多个字段**\n", + "  **`apply`和`apply_field`仅输出单个字段**,**`apply_more`和`apply_field_more`则是输出多个字段**\n", "\n", "  **`apply`和`apply_field`返回的是个列表**,**`apply_more`和`apply_field_more`返回的是个字典**\n", "\n", @@ -311,14 +311,14 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": null, "id": "72a0b5f9", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "", + "model_id": "8532c5609a394c19b60315663a6f0f4a", "version_major": 2, "version_minor": 0 }, @@ -328,42 +328,6 @@ }, "metadata": {}, "output_type": "display_data" - }, - { - "data": { - "text/html": [ - "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n",
-       "
\n" - ], - "text/plain": [ - "\n" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "+-----+------------------------------+------------------------------+\n", - "| idx | sentence | words |\n", - "+-----+------------------------------+------------------------------+\n", - "| 0 | This is an apple . | ['This', 'is', 'an', 'app... |\n", - "| 1 | I like apples . | ['I', 'like', 'apples', '... |\n", - "| 2 | Apples are good for our h... | ['Apples', 'are', 'good',... |\n", - "+-----+------------------------------+------------------------------+\n" - ] } ], "source": [ @@ -384,57 +348,10 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": null, "id": "b1a8631f", "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n",
-       "
\n" - ], - "text/plain": [ - "\n" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "+-----+------------------------------+------------------------------+\n", - "| idx | sentence | words |\n", - "+-----+------------------------------+------------------------------+\n", - "| 0 | This is an apple . | ['This', 'is', 'an', 'app... |\n", - "| 1 | I like apples . | ['I', 'like', 'apples', '... |\n", - "| 2 | Apples are good for our h... | ['Apples', 'are', 'good',... |\n", - "+-----+------------------------------+------------------------------+\n" - ] - } - ], + "outputs": [], "source": [ "dataset = DataSet(data)\n", "\n", @@ -459,57 +376,10 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": null, "id": "057c1d2c", "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n",
-       "
\n" - ], - "text/plain": [ - "\n" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "+-----+------------------------------+------------------------------+\n", - "| idx | sentence | words |\n", - "+-----+------------------------------+------------------------------+\n", - "| 0 | This is an apple . | ['This', 'is', 'an', 'app... |\n", - "| 1 | I like apples . | ['I', 'like', 'apples', '... |\n", - "| 2 | Apples are good for our h... | ['Apples', 'are', 'good',... |\n", - "+-----+------------------------------+------------------------------+\n" - ] - } - ], + "outputs": [], "source": [ "dataset = DataSet(data)\n", "dataset.apply_field(lambda sent:sent.split(), field_name='sentence', new_field_name='words')\n", @@ -528,57 +398,10 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": null, "id": "51e2f02c", "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n",
-       "
\n" - ], - "text/plain": [ - "\n" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "+-----+------------------------+------------------------+-----+\n", - "| idx | sentence | words | num |\n", - "+-----+------------------------+------------------------+-----+\n", - "| 0 | This is an apple . | ['This', 'is', 'an'... | 5 |\n", - "| 1 | I like apples . | ['I', 'like', 'appl... | 4 |\n", - "| 2 | Apples are good for... | ['Apples', 'are', '... | 7 |\n", - "+-----+------------------------+------------------------+-----+\n" - ] - } - ], + "outputs": [], "source": [ "dataset = DataSet(data)\n", "dataset.apply_more(lambda ins:{'words': ins['sentence'].split(), 'num': len(ins['sentence'].split())})\n", @@ -597,57 +420,10 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": null, "id": "db4295d5", "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n",
-       "
\n" - ], - "text/plain": [ - "\n" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "+-----+------------------------+------------------------+-----+\n", - "| idx | sentence | words | num |\n", - "+-----+------------------------+------------------------+-----+\n", - "| 0 | This is an apple . | ['This', 'is', 'an'... | 5 |\n", - "| 1 | I like apples . | ['I', 'like', 'appl... | 4 |\n", - "| 2 | Apples are good for... | ['Apples', 'are', '... | 7 |\n", - "+-----+------------------------+------------------------+-----+\n" - ] - } - ], + "outputs": [], "source": [ "dataset = DataSet(data)\n", "dataset.apply_field_more(lambda sent:{'words': sent.split(), 'num': len(sent.split())}, \n", @@ -669,7 +445,7 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": null, "id": "012f537c", "metadata": {}, "outputs": [], @@ -700,20 +476,10 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": null, "id": "a4c1c10d", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "dict_items([('sentence', 'This is an apple .'), ('words', ['This', 'is', 'an', 'apple', '.']), ('num', 5)])\n", - "dict_keys(['sentence', 'words', 'num'])\n", - "dict_values(['This is an apple .', ['This', 'is', 'an', 'apple', '.'], 5])\n" - ] - } - ], + "outputs": [], "source": [ "ins = Instance(sentence=\"This is an apple .\", words=['This', 'is', 'an', 'apple', '.'], num=5)\n", "\n", @@ -732,22 +498,10 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": null, "id": "55376402", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "+--------------------+------------------------+-----+-----+\n", - "| sentence | words | num | idx |\n", - "+--------------------+------------------------+-----+-----+\n", - "| This is an apple . | ['This', 'is', 'an'... 
| 5 | 0 |\n", - "+--------------------+------------------------+-----+-----+\n" - ] - } - ], + "outputs": [], "source": [ "ins.add_field(field_name='idx', field=0)\n", "print(ins)" @@ -767,44 +521,20 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": null, "id": "fe15f4c1", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{'sentence': ,\n", - " 'words': ,\n", - " 'num': }" - ] - }, - "execution_count": 15, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "dataset.get_all_fields()" ] }, { "cell_type": "code", - "execution_count": 16, + "execution_count": null, "id": "5433815c", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "['num', 'sentence', 'words']" - ] - }, - "execution_count": 16, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "dataset.get_field_names()" ] @@ -823,29 +553,10 @@ }, { "cell_type": "code", - "execution_count": 17, + "execution_count": null, "id": "25ce5488", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "3 False\n", - "6 True\n", - "+------------------------------+------------------------------+--------+\n", - "| sentence | words | length |\n", - "+------------------------------+------------------------------+--------+\n", - "| This is an apple . | ['This', 'is', 'an', 'app... | 5 |\n", - "| I like apples . | ['I', 'like', 'apples', '... | 4 |\n", - "| Apples are good for our h... | ['Apples', 'are', 'good',... | 7 |\n", - "| This is an apple . | ['This', 'is', 'an', 'app... | 5 |\n", - "| I like apples . | ['I', 'like', 'apples', '... | 4 |\n", - "| Apples are good for our h... | ['Apples', 'are', 'good',... 
| 7 |\n", - "+------------------------------+------------------------------+--------+\n" - ] - } - ], + "outputs": [], "source": [ "print(len(dataset), dataset.has_field('length')) \n", "if 'num' in dataset:\n", @@ -877,21 +588,10 @@ }, { "cell_type": "code", - "execution_count": 18, + "execution_count": null, "id": "3515e096", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Vocabulary([]...)\n", - "{'': 0, '': 1}\n", - " 0\n", - " 1\n" - ] - } - ], + "outputs": [], "source": [ "from fastNLP.core.vocabulary import Vocabulary\n", "\n", @@ -914,20 +614,10 @@ }, { "cell_type": "code", - "execution_count": 19, + "execution_count": null, "id": "88c7472a", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "5 Counter({'生活': 1, '就像': 1, '海洋': 1})\n", - "6 Counter({'生活': 1, '就像': 1, '海洋': 1, '只有': 1})\n", - "6 {'': 0, '': 1, '生活': 2, '就像': 3, '海洋': 4, '只有': 5}\n" - ] - } - ], + "outputs": [], "source": [ "vocab.add_word_lst(['生活', '就像', '海洋'])\n", "print(len(vocab), vocab.word_count)\n", @@ -950,21 +640,10 @@ }, { "cell_type": "code", - "execution_count": 20, + "execution_count": null, "id": "3447acde", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0\n", - " 1\n", - "生活 2\n", - "彼岸 1 False\n" - ] - } - ], + "outputs": [], "source": [ "print(vocab.to_word(0), vocab.to_index(''))\n", "print(vocab.to_word(1), vocab.to_index(''))\n", @@ -986,21 +665,10 @@ }, { "cell_type": "code", - "execution_count": 21, + "execution_count": null, "id": "490b101c", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "生活 2\n", - "彼岸 12 True\n", - "13 Counter({'人': 4, '生活': 2, '就像': 2, '海洋': 2, '只有': 2, '意志': 1, '坚强的': 1, '才': 1, '能': 1, '到达': 1, '彼岸': 1})\n", - "13 {'': 0, '': 1, '生活': 2, '就像': 3, '海洋': 4, '只有': 5, '人': 6, '意志': 7, '坚强的': 8, '才': 9, '能': 10, '到达': 11, '彼岸': 12}\n" - ] - } - ], + "outputs": 
[], "source": [ "vocab.add_word_lst(['生活', '就像', '海洋', '只有', '意志', '坚强的', '人', '人', '人', '人', '才', '能', '到达', '彼岸'])\n", "print(vocab.to_word(2), vocab.to_index('生活'))\n", @@ -1023,19 +691,10 @@ }, { "cell_type": "code", - "execution_count": 22, + "execution_count": null, "id": "a99ff909", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "{'positive': 0, 'negative': 1}\n", - "ValueError: word `neutral` not in vocabulary\n" - ] - } - ], + "outputs": [], "source": [ "vocab = Vocabulary(unknown=None, padding=None)\n", "\n", @@ -1058,19 +717,10 @@ }, { "cell_type": "code", - "execution_count": 23, + "execution_count": null, "id": "432f74c1", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "{'': 0, 'positive': 1, 'negative': 2}\n", - "0 \n" - ] - } - ], + "outputs": [], "source": [ "vocab = Vocabulary(unknown='', padding=None)\n", "\n", @@ -1096,92 +746,10 @@ }, { "cell_type": "code", - "execution_count": 24, + "execution_count": null, "id": "3dbd985d", "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
SentenceIdSentenceSentiment
01A series of escapades demonstrating the adage ...negative
12This quiet , introspective and entertaining in...positive
23Even fans of Ismail Merchant 's work , I suspe...negative
34A positively thrilling combination of ethnogra...neutral
45A comedy-drama of nearly epic proportions root...positive
56The Importance of Being Earnest , so thick wit...neutral
\n", - "
" - ], - "text/plain": [ - " SentenceId Sentence Sentiment\n", - "0 1 A series of escapades demonstrating the adage ... negative\n", - "1 2 This quiet , introspective and entertaining in... positive\n", - "2 3 Even fans of Ismail Merchant 's work , I suspe... negative\n", - "3 4 A positively thrilling combination of ethnogra... neutral\n", - "4 5 A comedy-drama of nearly epic proportions root... positive\n", - "5 6 The Importance of Being Earnest , so thick wit... neutral" - ] - }, - "execution_count": 24, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "import pandas as pd\n", "\n", @@ -1199,60 +767,10 @@ }, { "cell_type": "code", - "execution_count": 25, + "execution_count": null, "id": "4f634586", "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n",
-       "
\n" - ], - "text/plain": [ - "\n" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "+------------+------------------------------+-----------+\n", - "| SentenceId | Sentence | Sentiment |\n", - "+------------+------------------------------+-----------+\n", - "| 1 | ['a', 'series', 'of', 'es... | negative |\n", - "| 2 | ['this', 'quiet', ',', 'i... | positive |\n", - "| 3 | ['even', 'fans', 'of', 'i... | negative |\n", - "| 4 | ['a', 'positively', 'thri... | neutral |\n", - "| 5 | ['a', 'comedy-drama', 'of... | positive |\n", - "| 6 | ['the', 'importance', 'of... | neutral |\n", - "+------------+------------------------------+-----------+\n" - ] - } - ], + "outputs": [], "source": [ "from fastNLP.core.dataset import DataSet\n", "\n", @@ -1273,7 +791,7 @@ }, { "cell_type": "code", - "execution_count": 26, + "execution_count": null, "id": "46722efc", "metadata": {}, "outputs": [], @@ -1297,55 +815,10 @@ }, { "cell_type": "code", - "execution_count": 27, + "execution_count": null, "id": "a2de615b", "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n",
-       "
\n" - ], - "text/plain": [ - "\n" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Counter({'a': 9, 'of': 9, ',': 7, 'the': 6, '.': 5, 'is': 3, 'and': 3, 'good': 2, 'for': 2, 'which': 2, 'this': 2, \"'s\": 2, 'series': 1, 'escapades': 1, 'demonstrating': 1, 'adage': 1, 'that': 1, 'what': 1, 'goose': 1, 'also': 1, 'gander': 1, 'some': 1, 'occasionally': 1, 'amuses': 1, 'but': 1, 'none': 1, 'amounts': 1, 'to': 1, 'much': 1, 'story': 1, 'quiet': 1, 'introspective': 1, 'entertaining': 1, 'independent': 1, 'worth': 1, 'seeking': 1, 'even': 1, 'fans': 1, 'ismail': 1, 'merchant': 1, 'work': 1, 'i': 1, 'suspect': 1, 'would': 1, 'have': 1, 'hard': 1, 'time': 1, 'sitting': 1, 'through': 1, 'one': 1, 'positively': 1, 'thrilling': 1, 'combination': 1, 'ethnography': 1, 'all': 1, 'intrigue': 1, 'betrayal': 1, 'deceit': 1, 'murder': 1, 'shakespearean': 1, 'tragedy': 1, 'or': 1, 'juicy': 1, 'soap': 1, 'opera': 1, 'comedy-drama': 1, 'nearly': 1, 'epic': 1, 'proportions': 1, 'rooted': 1, 'in': 1, 'sincere': 1, 'performance': 1, 'by': 1, 'title': 1, 'character': 1, 'undergoing': 1, 'midlife': 1, 'crisis': 1, 'importance': 1, 'being': 1, 'earnest': 1, 'so': 1, 'thick': 1, 'with': 1, 'wit': 1, 'it': 1, 'plays': 1, 'like': 1, 'reading': 1, 'from': 1, 'bartlett': 1, 'familiar': 1, 'quotations': 1}) \n", - "\n", - "{'': 0, '': 1, 'a': 2, 'of': 3, ',': 4, 'the': 5, '.': 6, 'is': 7, 'and': 8, 'good': 9, 'for': 10, 'which': 11, 'this': 12, \"'s\": 13, 'series': 14, 'escapades': 15, 'demonstrating': 16, 'adage': 17, 'that': 18, 'what': 19, 'goose': 20, 'also': 21, 'gander': 22, 'some': 23, 'occasionally': 24, 'amuses': 25, 'but': 26, 'none': 27, 'amounts': 28, 'to': 29, 'much': 30, 'story': 31, 'quiet': 32, 'introspective': 33, 'entertaining': 34, 'independent': 35, 'worth': 36, 'seeking': 37, 'even': 38, 'fans': 39, 'ismail': 40, 'merchant': 41, 'work': 42, 'i': 43, 'suspect': 44, 'would': 45, 'have': 46, 
'hard': 47, 'time': 48, 'sitting': 49, 'through': 50, 'one': 51, 'positively': 52, 'thrilling': 53, 'combination': 54, 'ethnography': 55, 'all': 56, 'intrigue': 57, 'betrayal': 58, 'deceit': 59, 'murder': 60, 'shakespearean': 61, 'tragedy': 62, 'or': 63, 'juicy': 64, 'soap': 65, 'opera': 66, 'comedy-drama': 67, 'nearly': 68, 'epic': 69, 'proportions': 70, 'rooted': 71, 'in': 72, 'sincere': 73, 'performance': 74, 'by': 75, 'title': 76, 'character': 77, 'undergoing': 78, 'midlife': 79, 'crisis': 80, 'importance': 81, 'being': 82, 'earnest': 83, 'so': 84, 'thick': 85, 'with': 86, 'wit': 87, 'it': 88, 'plays': 89, 'like': 90, 'reading': 91, 'from': 92, 'bartlett': 93, 'familiar': 94, 'quotations': 95} \n", - "\n", - "Vocabulary(['a', 'series', 'of', 'escapades', 'demonstrating']...)\n" - ] - } - ], + "outputs": [], "source": [ "from fastNLP.core.vocabulary import Vocabulary\n", "\n", @@ -1368,60 +841,10 @@ }, { "cell_type": "code", - "execution_count": 28, + "execution_count": null, "id": "2f9a04b2", "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n",
-       "
\n" - ], - "text/plain": [ - "\n" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "+------------+------------------------------+-----------+\n", - "| SentenceId | Sentence | Sentiment |\n", - "+------------+------------------------------+-----------+\n", - "| 1 | [2, 14, 3, 15, 16, 5, 17,... | negative |\n", - "| 2 | [12, 32, 4, 33, 8, 34, 35... | positive |\n", - "| 3 | [38, 39, 3, 40, 41, 13, 4... | negative |\n", - "| 4 | [2, 52, 53, 54, 3, 55, 8,... | neutral |\n", - "| 5 | [2, 67, 3, 68, 69, 70, 71... | positive |\n", - "| 6 | [5, 81, 3, 82, 83, 4, 84,... | neutral |\n", - "+------------+------------------------------+-----------+\n" - ] - } - ], + "outputs": [], "source": [ "vocab.index_dataset(dataset, field_name='Sentence')\n", "print(dataset)" @@ -1437,67 +860,10 @@ }, { "cell_type": "code", - "execution_count": 29, + "execution_count": null, "id": "5f5eed18", "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "{'negative': 0, 'positive': 1, 'neutral': 2}\n"
-     ]
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n",
-       "
\n" - ], - "text/plain": [ - "\n" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "+------------+------------------------------+-----------+\n", - "| SentenceId | Sentence | Sentiment |\n", - "+------------+------------------------------+-----------+\n", - "| 1 | [2, 14, 3, 15, 16, 5, 17,... | 0 |\n", - "| 2 | [12, 32, 4, 33, 8, 34, 35... | 1 |\n", - "| 3 | [38, 39, 3, 40, 41, 13, 4... | 0 |\n", - "| 4 | [2, 52, 53, 54, 3, 55, 8,... | 2 |\n", - "| 5 | [2, 67, 3, 68, 69, 70, 71... | 1 |\n", - "| 6 | [5, 81, 3, 82, 83, 4, 84,... | 2 |\n", - "+------------+------------------------------+-----------+\n" - ] - } - ], + "outputs": [], "source": [ "target_vocab = Vocabulary(padding=None, unknown=None)\n", "\n", diff --git a/tutorials/figures/T0-fig-parameter-matching.png b/tutorials/figures/T0-fig-parameter-matching.png new file mode 100644 index 00000000..410256ae Binary files /dev/null and b/tutorials/figures/T0-fig-parameter-matching.png differ diff --git a/tutorials/figures/T0-fig-trainer-and-evaluator.png b/tutorials/figures/T0-fig-trainer-and-evaluator.png index 6e95650d..38222ee8 100644 Binary files a/tutorials/figures/T0-fig-trainer-and-evaluator.png and b/tutorials/figures/T0-fig-trainer-and-evaluator.png differ diff --git a/tutorials/figures/T0-fig-training-structure.png b/tutorials/figures/T0-fig-training-structure.png new file mode 100644 index 00000000..6569f3d4 Binary files /dev/null and b/tutorials/figures/T0-fig-training-structure.png differ