update example-12 lxr 220531

3 years ago · e55ca8a1d1
--- a/tutorials/fastnlp_tutorial_2.ipynb
+++ b/tutorials/fastnlp_tutorial_2.ipynb
@@ -867,7 +867,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
   "version": "3.7.13"
  },
  "pycharm": {
   "stem_cell": {
--- a/tutorials/fastnlp_tutorial_3.ipynb
+++ b/tutorials/fastnlp_tutorial_3.ipynb
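@@ -29,13 +29,7 @@
    "\n",
    "### 1.1 Responsibilities of the dataloader\n",
    "\n",
    "In `fastNLP 0.8`, several other modules sit ahead of the data-loading module `DataLoader`; they handle, for example,\n",
    "\n",
    "&emsp; zero-padding text data for alignment, i.e. the **`collator` module**, and tokenizing it, i.e. the **`tokenizer` module**\n",
    "\n",
    "&emsp; This section introduces the `collator` and related modules in `fastNLP`; the `tokenizer` is covered in detail in the next section\n",
    "\n",
    "In `fastNLP 0.8`, **the `collator` module zero-pads text sequences for alignment**, via"
    "In `fastNLP 0.8`, ahead of the data-loading module `DataLoader`"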
   ]
  },
  {
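@@ -45,13 +39,7 @@
   "source": [
    "### 1.2 Basic usage of the dataloader\n",
    "\n",
    "In `fastNLP 0.8`, several other modules sit ahead of the data-loading module `DataLoader`; they handle, for example,\n",
    "\n",
    "&emsp; zero-padding text data for alignment, i.e. the **`collator` module**, and tokenizing it, i.e. the **`tokenizer` module**\n",
    "\n",
    "&emsp; This section introduces the `collator` and related modules in `fastNLP`; the `tokenizer` is covered in detail in the next section\n",
    "\n",
    "In `fastNLP 0.8`, **the `collator` module zero-pads text sequences for alignment**, via"
    "In `fastNLP 0.8`, ahead of the data-loading module `DataLoader`,"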
   ]
  },
  {
--- a/tutorials/fastnlp_tutorial_4.ipynb
+++ b/tutorials/fastnlp_tutorial_4.ipynb
@@ -5,21 +5,21 @@
   "id": "fdd7ff16",
   "metadata": {},
   "source": [
    "# T4. Building a model and the concept of a driver\n",
    "# T4. A deeper look at trainer and evaluator (part 1)\n",
    "\n",
    "&emsp; 1 &ensp; Using pretrained models in fastNLP\n",
    "&emsp; 1 &ensp; Building a model with fastNLP and pytorch\n",
    " \n",
    "&emsp; &emsp; 1.1 &ensp; \n",
    "\n",
    "&emsp; &emsp; 1.2 &ensp; \n",
    "\n",
    "&emsp; 2 &ensp; Building a model with Pytorch in fastNLP\n",
    "&emsp; 2 &ensp; driver and device in fastNLP\n",
    "\n",
    "&emsp; &emsp; 2.1 &ensp; \n",
    "\n",
    "&emsp; &emsp; 2.2 &ensp; \n",
    "\n",
    "&emsp; 3 &ensp; The driver in fastNLP\n",
    "&emsp; 3 &ensp; More on the trainer in fastNLP\n",
    "\n",
    "&emsp; &emsp; 3.1 &ensp; \n",
    "\n",
@@ -51,7 +51,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
   "version": "3.7.13"
  }
 },
 "nbformat": 4,
--- a/tutorials/fastnlp_tutorial_5.ipynb
+++ b/tutorials/fastnlp_tutorial_5.ipynb
@@ -0,0 +1,59 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "fdd7ff16",
   "metadata": {},
   "source": [
    "# T5. Combining fastNLP with paddle or jittor\n",
    "\n",
    "&emsp; 1 &ensp; Training a model with fastNLP and paddle\n",
    " \n",
    "&emsp; &emsp; 1.1 &ensp; \n",
    "\n",
    "&emsp; &emsp; 1.2 &ensp; \n",
    "\n",
    "&emsp; 2 &ensp; Training a model with fastNLP and jittor\n",
    "\n",
    "&emsp; &emsp; 2.1 &ensp; \n",
    "\n",
    "&emsp; &emsp; 2.2 &ensp; \n",
    "\n",
    "&emsp; 3 &ensp; Converting between paddle and pytorch with fastNLP\n",
    "\n",
    "&emsp; &emsp; 3.1 &ensp; \n",
    "\n",
    "&emsp; &emsp; 3.2 &ensp; "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "08752c5a",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/tutorials/fastnlp_tutorial_6.ipynb
+++ b/tutorials/fastnlp_tutorial_6.ipynb
@@ -0,0 +1,59 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "fdd7ff16",
   "metadata": {},
   "source": [
    "# T6. A deeper look at trainer and evaluator (part 2)\n",
    "\n",
    "&emsp; 1 &ensp; Predefined models in fastNLP: models\n",
    " \n",
    "&emsp; &emsp; 1.1 &ensp; \n",
    "\n",
    "&emsp; &emsp; 1.2 &ensp; \n",
    "\n",
    "&emsp; 2 &ensp; Predefined models in fastNLP: modules\n",
    " \n",
    "&emsp; &emsp; 2.1 &ensp; \n",
    "\n",
    "&emsp; &emsp; 2.2 &ensp; \n",
    "\n",
    "&emsp; 3 &ensp; More metric types in fastNLP\n",
    "\n",
    "&emsp; &emsp; 3.1 &ensp; \n",
    "\n",
    "&emsp; &emsp; 3.2 &ensp; "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "08752c5a",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/tutorials/fastnlp_tutorial_7.ipynb
+++ b/tutorials/fastnlp_tutorial_7.ipynb
@@ -0,0 +1,59 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "fdd7ff16",
   "metadata": {},
   "source": [
    "# T7. Customizing the training process with callbacks\n",
    "\n",
    "&emsp; 1 &ensp; \n",
    " \n",
    "&emsp; &emsp; 1.1 &ensp; \n",
    "\n",
    "&emsp; &emsp; 1.2 &ensp; \n",
    "\n",
    "&emsp; 2 &ensp; \n",
    "\n",
    "&emsp; &emsp; 2.1 &ensp; \n",
    "\n",
    "&emsp; &emsp; 2.2 &ensp; \n",
    "\n",
    "&emsp; 3 &ensp; \n",
    "\n",
    "&emsp; &emsp; 3.1 &ensp; \n",
    "\n",
    "&emsp; &emsp; 3.2 &ensp; "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "08752c5a",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/tutorials/fastnlp_tutorial_8.ipynb
+++ b/tutorials/fastnlp_tutorial_8.ipynb
@@ -0,0 +1,59 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "fdd7ff16",
   "metadata": {},
   "source": [
    "# T8. File-reading modules in fastNLP\n",
    "\n",
    "&emsp; 1 &ensp; The EmbedLoader module in fastNLP\n",
    " \n",
    "&emsp; &emsp; 1.1 &ensp; \n",
    "\n",
    "&emsp; &emsp; 1.2 &ensp; \n",
    "\n",
    "&emsp; 2 &ensp; The Loader module in fastNLP\n",
    "\n",
    "&emsp; &emsp; 2.1 &ensp; \n",
    "\n",
    "&emsp; &emsp; 2.2 &ensp; \n",
    "\n",
    "&emsp; 3 &ensp; The Pipe module in fastNLP\n",
    "\n",
    "&emsp; &emsp; 3.1 &ensp; \n",
    "\n",
    "&emsp; &emsp; 3.2 &ensp; "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "08752c5a",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/tutorials/fastnlp_tutorial_e1.ipynb
+++ b/tutorials/fastnlp_tutorial_e1.ipynb
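@@ -6,7 +6,7 @@
   "source": [
    "&emsp; Starting with this notebook, we begin the **`example` series of the `fastNLP v0.8 tutorial`**; in each\n",
    "\n",
    "&emsp; `tutorial` that follows, we introduce applications of `fastNLP v0.8` to some natural language processing tasks"
    "&emsp; `tutorial` that follows, we introduce worked examples of `fastNLP v0.8` on natural language processing tasks"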
   ]
  },
  {
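@@ -82,9 +82,9 @@
    "\n",
    "&emsp; It contains 9 datasets, all in English, covering several natural language understanding (`NLU`) tasks, including\n",
    "\n",
    "&emsp; &emsp; **`CoLA`**, a text classification task: predict whether a single sentence is grammatical; **`SST2`**, a text classification task: predict the binary sentiment of a single sentence\n",
    "&emsp; &emsp; **`CoLA`**, a text classification task: predict whether a single sentence is grammatical; **`SST-2`**, a text classification task: predict the binary sentiment of a single sentence\n",
    "\n",
    "&emsp; &emsp; **`MRPC`**, a sentence-pair classification task: predict whether a pair is semantically equivalent; **`STSB`**, a similarity scoring task: predict sentence-pair semantic similarity as a regression\n",
    "&emsp; &emsp; **`MRPC`**, a sentence-pair classification task: predict whether a pair is semantically equivalent; **`STS-B`**, a similarity scoring task: predict sentence-pair semantic similarity as a regression\n",
    "\n",
    "&emsp; &emsp; **`QQP`**, a sentence-pair classification task: predict whether a question pair is semantically equivalent; **`MNLI`**, a textual inference task: predict entailment/contradiction/neutral for a sentence pair\n",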
    "\n",
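@@ -216,15 +216,15 @@
    "\n",
    "&emsp; that is, the smaller, uncased **version of `bert-base` obtained by knowledge distillation**; structurally,\n",
    "\n",
    "&emsp; the model contains 1 embedding layer and 6 self-attention layers; see the end of this notebook for details, and the [DistilBert paper](https://arxiv.org/pdf/1910.01108.pdf) for more\n",
    "&emsp; it contains **1 embedding layer** and **6 self-attention layers**, with **`66M` parameters**; see the end of this notebook for details, and the [DistilBert paper](https://arxiv.org/pdf/1910.01108.pdf) for more\n",
    "\n",
    "First, import the `AutoTokenizer` module from the `transformers` library and initialize it with the `from_pretrained` function\n",
    "First, import the **`AutoTokenizer` module** from the `transformers` library and **initialize it with the `from_pretrained` function**\n",
    "\n",
    "&emsp; Here `use_fast` indicates whether to use the fast version of the `tokenizer`; try tokenizing the sample data to check what was loaded\n",
    "\n",
    "&emsp; Note that, of the two keys returned after processing, `'input_ids'` is the sequence of token ids for the original text\n",
    "&emsp; Note that, of the two keys returned after processing, **`'input_ids'`** is the sequence of token ids for the original text\n",
    "\n",
    "&emsp; &emsp; and `'attention_mask'` is the mask used in self-attention (positions marked `0` correspond to `padding`)"
    "&emsp; &emsp; and **`'attention_mask'`** is the mask used in self-attention (positions marked `0` correspond to `padding`)"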
   ]
  },
  {
--- a/tutorials/fastnlp_tutorial_e2.ipynb
+++ b/tutorials/fastnlp_tutorial_e2.ipynb
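@@ -25,31 +25,53 @@
    "\n",
    "&emsp; &emsp; we first briefly introduce research on prompt-learning models and the advantages of combining them with `fastNLP v0.8`\n",
    "\n",
    "**`prompt`**, a **cue or prompter**, first appeared in **`PET`**,\n",
    "**`prompt`**, a **cue phrase**, first appeared in the **`PET` model** from the paper [Exploiting Cloze Questions for Few Shot TC and NLI](https://arxiv.org/pdf/2001.07676.pdf)\n",
    "\n",
    "&emsp; \n",
    "&emsp; &emsp; short for **`Pattern-Exploiting Training`**; although the paper never uses the term `prompt`, it is still regarded as the founding work on it\n",
    "\n",
    "**`prompt-based tuning`**, **fine-tuning based on prompts**, description\n",
    "&emsp; Its rough idea: for a text classification task, suppose the input text is, later called the `prompt`, later called the `verbalizer`,\n",
    "\n",
    "&emsp; **`prompt-based model`**, **a model based on prompts**\n",
    "&emsp; Its main contribution is,\n",
    "\n",
    "**`prompt-based model`**, **a model based on prompts**, examples\n",
    "<img src=\"./figures/E2-fig-pet-model.png\" width=\"36%\" height=\"36%\" align=\"center\"></img>\n",
    "\n",
    "&emsp; Case 1: **`P-Tuning v1`**\n",
    "**`prompt-based tuning`**, **fine-tuning based on prompts**,\n",
    "\n",
    "&emsp; Case 2: **`PromptTuning`**\n",
    "&emsp; xxxx; for more, see the [prompt survey](https://arxiv.org/pdf/2107.13586.pdf)\n",
    "\n",
    "&emsp; Case 3: **`PrefixTuning`**\n",
    "&emsp; &emsp; Below are some classic `prompt-based tuning` cases, giving a brief outline of how `prompt-based tuning` developed\n",
    "\n",
    "&emsp; Case 4: **`SoftPrompt`**\n",
    "&emsp; Case 1: **`P-Tuning v1`**; for details see the [P-Tuning-v1 paper](https://arxiv.org/pdf/2103.10385.pdf)\n",
    "\n",
    "Advantages of implementing a `prompt-based model` with `fastNLP v0.8`\n",
    "&emsp; &emsp; Its main contribution is,\n",
    "\n",
    "&emsp; \n",
    "&emsp; &emsp; Its method roughly consists of,\n",
    "\n",
    "&emsp; This example again uses the `SST2` dataset from `tutorial-E1`, with `bert-base-uncased` as the base model\n",
    "&emsp; Case 2: **`PromptTuning`**; for details see the [PromptTuning paper](https://arxiv.org/pdf/2104.08691.pdf)\n",
    "\n",
    "&emsp; &emsp; The implementation below aims to solve the `SST2` binary classification task by concatenating a continuous `prompt` with the `model`"
    "&emsp; &emsp; Its main contribution is,\n",
    "\n",
    "&emsp; &emsp; Its method roughly consists of,\n",
    "\n",
    "&emsp; Case 3: **`PrefixTuning`**; for details see the [PrefixTuning paper](https://arxiv.org/pdf/2101.00190.pdf)\n",
    "\n",
    "&emsp; &emsp; Its main contribution is,\n",
    "\n",
    "&emsp; &emsp; Its method roughly consists of,\n",
    "\n",
    "As the introduction above shows, `prompt-based tuning` is only a way of fine-tuning a model and is independent of the pretrained `backbone`\n",
    "\n",
    "&emsp; At present, the mainstream way to load pretrained models is the `transformers` module, while the framework used for fine-tuning\n",
    "\n",
    "&emsp; &emsp; may be `pytorch`, `paddle`, `jittor`, and so on, and these frameworks are not compatible with one another\n",
    "\n",
    "&emsp; Therefore, **implementing `prompt-based tuning` with `fastNLP v0.8`** offers **a good bridge between frameworks such as `paddle`**\n",
    "\n",
    "&emsp; &emsp; **and the `transformers` module** (the `transformers` module is implemented on top of `pytorch`)\n",
    "\n",
    "This example again uses the `SST2` dataset and the `distilbert-base-uncased` model from `tutorial-E1` (for ease of comparison)\n",
    "\n",
    "&emsp; and the `pytorch` framework, solving the `SST2` binary classification task by concatenating a continuous `prompt` with the `model`"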
   ]
  },
  {
@@ -98,7 +120,7 @@
    "print(transformers.__version__)\n",
    "\n",
    "task = 'sst2'\n",
    "model_checkpoint = 'bert-base-uncased'"
    "model_checkpoint = 'distilbert-base-uncased'  # 'bert-base-uncased'"
   ]
  },
  {
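@@ -111,20 +133,32 @@
    "\n",
    "&emsp; &emsp; Below we first outline the principle of the `P-Tuning v2` paper, which leads into the code practice with `fastNLP v0.8`\n",
    "\n",
    "`P-Tuning v2` comes from the paper [Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks](https://arxiv.org/pdf/2110.07602.pdf)\n",
    "**`P-Tuning v2`** comes from the paper [Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks](https://arxiv.org/pdf/2110.07602.pdf)\n",
    "\n",
    "&emsp; Its main contribution is that, **building on deep prompt learning such as `PrefixTuning`**, **it improves performance on `NLU` tasks such as classification and tagging**\n",
    "\n",
    "&emsp; &emsp; and lets medium-sized models, mainly **models with `100M-1B` parameters**, **achieve results comparable to full-parameter fine-tuning**\n",
    "\n",
    "&emsp; Its structure is shown in the figure: **a prefix sequence is added before the classification token `[CLS]` of the input sequence** (**the embeddings for its indices are continuous vectors to be trained**)\n",
    "\n",
    "&emsp; &emsp; **prompting the model, under the new task,** to **produce output suited to the fine-tuning task** at the position corresponding to `[CLS]`, and thereby adapt to that task\n",
    "\n",
    "<img src=\"./figures/E2-fig-p-tuning-v2-model.png\" width=\"60%\" height=\"60%\" align=\"center\"></img>\n",
    "\n",
    "&emsp; Its main contribution is that, building on deep prompt learning such as `PrefixTuning`, it improves performance on `NLU` tasks such as classification and tagging\n",
    "This example uses the `bert-base-uncased` model as the `backbone` for `P-Tuning v2`, setting `requires_grad=False`\n",
    "\n",
    "&emsp; &emsp; and lets medium-sized models, mainly models with `100M-1B` parameters, achieve results comparable to full-parameter fine-tuning\n",
    "&emsp; &emsp; to freeze its parameters so they are not trained, and **defines `prefix_tokens` of length `pre_seq_len` as the prompt prefix sequence of the input**\n",
    "\n",
    "&emsp; Its structure is shown in the figure,\n",
    "&emsp; **A `prefix_encoder` based on `nn.Embedding` embeds the prompt prefix**; the result is obtained through the `get_prompt` function and then\n",
    "\n",
    "<img src=\"./figures/E2-fig-p-tuning-v2.png\" width=\"60%\" height=\"60%\" align=\"center\"></img>"
    "&emsp; &emsp; concatenated in front of every sample in the batch to obtain `inputs_embeds`, while the self-attention mask `attention_mask` is updated accordingly\n",
    "\n",
    "&emsp; `inputs_embeds`, `attention_mask`, and `labels` are fed into the `backbone`, **whose output includes `loss` and `logits`**"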
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -178,24 +212,24 @@
   "source": [
    "Next, initialize the model instance by specifying the number of classes, and initialize the optimizer with `torch.optim.AdamW`\n",
    "\n",
    "&emsp; According to the `P-Tuning v2` paper: *Generally, simple classification tasks prefer shorter prompts (less than 20)*\n",
    "&emsp; According to the `P-Tuning v2` paper: *`Generally, simple classification tasks prefer shorter prompts (less than 20)`*\n",
    "\n",
    "&emsp; Here the `pre_seq_len` parameter is set to `20` and the learning rate is adjusted accordingly; everything else matches `tutorial-E1`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.bias', 'cls.seq_relationship.bias']\n",
      "- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
      "- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n",
      "Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']\n",
      "Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertForSequenceClassification: ['vocab_layer_norm.bias', 'vocab_layer_norm.weight', 'vocab_projector.weight', 'vocab_transform.bias', 'vocab_transform.weight', 'vocab_projector.bias']\n",
      "- This IS expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
      "- This IS NOT expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n",
      "Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['pre_classifier.weight', 'classifier.weight', 'pre_classifier.bias', 'classifier.bias']\n",
      "You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n"
     ]
    }
@@ -225,7 +259,7 @@
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "execution_count": 4,
   "metadata": {
    "scrolled": false
   },
@@ -240,7 +274,7 @@
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "b72eeebd34354a88a99b2e07ec9a86df",
       "model_id": "21cbd92c3397497d84dc10f017ec96f4",
       "version_major": 2,
       "version_minor": 0
      },
@@ -262,30 +296,17 @@
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Loading cached processed dataset at /remote-home/xrliu/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad/cache-18ec0e709f05e61e.arrow\n",
      "Loading cached processed dataset at /remote-home/xrliu/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad/cache-e2f02ee7442ad73e.arrow\n"
      "Loading cached processed dataset at /remote-home/xrliu/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad/cache-294e481a713c5754.arrow\n",
      "Loading cached processed dataset at /remote-home/xrliu/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad/cache-ed9d9258aaf0fb54.arrow\n",
      "Loading cached processed dataset at /remote-home/xrliu/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad/cache-f44c5576b89f9e6b.arrow\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "d15505d825b34f649b719f1ff0d56114",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/2 [00:00<?, ?ba/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
@@ -308,7 +329,7 @@
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -353,7 +374,7 @@
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -369,31 +390,20 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "&emsp;"
    "Finally, define the training module `trainer` using the `dataloader_train` and `dataloader_valid` built earlier"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      "To disable this warning, you can either:\n",
      "\t- Avoid using `tokenizers` before the fork if possible\n",
      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n"
     ]
    }
   ],
   "outputs": [],
   "source": [
    "trainer = Trainer(\n",
    "    model=model,\n",
    "    driver='torch',\n",
    "    device=[0, 1],\n",
    "    n_epochs=20,\n",
    "    device=1,  # [0, 1],\n",
    "    n_epochs=10,\n",
    "    optimizers=optimizers,\n",
    "    train_dataloader=dataloader_train,\n",
    "    evaluate_dataloaders=dataloader_valid,\n",
@@ -405,14 +415,559 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "&emsp;"
    "&emsp; Train the model with the `trainer.run` method; as before, each evaluation uses only `10` `batch`es from the validation set"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #7fbfbf; text-decoration-color: #7fbfbf\">[22:53:00] </span><span style=\"color: #000080; text-decoration-color: #000080\">INFO    </span> Running evaluator sanity check for <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span> batches.              <a href=\"file://../fastNLP/core/controllers/trainer.py\"><span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">trainer.py</span></a><span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">:</span><a href=\"file://../fastNLP/core/controllers/trainer.py#592\"><span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">592</span></a>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[2;36m[22:53:00]\u001b[0m\u001b[2;36m \u001b[0m\u001b[34mINFO    \u001b[0m Running evaluator sanity check for \u001b[1;36m2\u001b[0m batches.              \u001b]8;id=406635;file://../fastNLP/core/controllers/trainer.py\u001b\\\u001b[2mtrainer.py\u001b[0m\u001b]8;;\u001b\\\u001b[2m:\u001b[0m\u001b]8;id=951504;file://../fastNLP/core/controllers/trainer.py#592\u001b\\\u001b[2m592\u001b[0m\u001b]8;;\u001b\\\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">----------------------------- Eval. results on Epoch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>, Batch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span> -----------------------------\n",
       "</pre>\n"
      ],
      "text/plain": [
       "----------------------------- Eval. results on Epoch:\u001b[1;36m1\u001b[0m, Batch:\u001b[1;36m0\u001b[0m -----------------------------\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">{</span>\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"acc#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.540625</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"total#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">320.0</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"correct#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">173.0</span>\n",
       "<span style=\"font-weight: bold\">}</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1m{\u001b[0m\n",
       "  \u001b[1;34m\"acc#acc\"\u001b[0m: \u001b[1;36m0.540625\u001b[0m,\n",
       "  \u001b[1;34m\"total#acc\"\u001b[0m: \u001b[1;36m320.0\u001b[0m,\n",
       "  \u001b[1;34m\"correct#acc\"\u001b[0m: \u001b[1;36m173.0\u001b[0m\n",
       "\u001b[1m}\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">----------------------------- Eval. results on Epoch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>, Batch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span> -----------------------------\n",
       "</pre>\n"
      ],
      "text/plain": [
       "----------------------------- Eval. results on Epoch:\u001b[1;36m2\u001b[0m, Batch:\u001b[1;36m0\u001b[0m -----------------------------\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">{</span>\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"acc#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.5</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"total#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">320.0</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"correct#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">160.0</span>\n",
       "<span style=\"font-weight: bold\">}</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1m{\u001b[0m\n",
       "  \u001b[1;34m\"acc#acc\"\u001b[0m: \u001b[1;36m0.5\u001b[0m,\n",
       "  \u001b[1;34m\"total#acc\"\u001b[0m: \u001b[1;36m320.0\u001b[0m,\n",
       "  \u001b[1;34m\"correct#acc\"\u001b[0m: \u001b[1;36m160.0\u001b[0m\n",
       "\u001b[1m}\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">----------------------------- Eval. results on Epoch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>, Batch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span> -----------------------------\n",
       "</pre>\n"
      ],
      "text/plain": [
       "----------------------------- Eval. results on Epoch:\u001b[1;36m3\u001b[0m, Batch:\u001b[1;36m0\u001b[0m -----------------------------\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">{</span>\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"acc#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.509375</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"total#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">320.0</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"correct#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">163.0</span>\n",
       "<span style=\"font-weight: bold\">}</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1m{\u001b[0m\n",
       "  \u001b[1;34m\"acc#acc\"\u001b[0m: \u001b[1;36m0.509375\u001b[0m,\n",
       "  \u001b[1;34m\"total#acc\"\u001b[0m: \u001b[1;36m320.0\u001b[0m,\n",
       "  \u001b[1;34m\"correct#acc\"\u001b[0m: \u001b[1;36m163.0\u001b[0m\n",
       "\u001b[1m}\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">----------------------------- Eval. results on Epoch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>, Batch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span> -----------------------------\n",
       "</pre>\n"
      ],
      "text/plain": [
       "----------------------------- Eval. results on Epoch:\u001b[1;36m4\u001b[0m, Batch:\u001b[1;36m0\u001b[0m -----------------------------\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">{</span>\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"acc#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.634375</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"total#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">320.0</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"correct#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">203.0</span>\n",
       "<span style=\"font-weight: bold\">}</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1m{\u001b[0m\n",
       "  \u001b[1;34m\"acc#acc\"\u001b[0m: \u001b[1;36m0.634375\u001b[0m,\n",
       "  \u001b[1;34m\"total#acc\"\u001b[0m: \u001b[1;36m320.0\u001b[0m,\n",
       "  \u001b[1;34m\"correct#acc\"\u001b[0m: \u001b[1;36m203.0\u001b[0m\n",
       "\u001b[1m}\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">----------------------------- Eval. results on Epoch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, Batch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span> -----------------------------\n",
       "</pre>\n"
      ],
      "text/plain": [
       "----------------------------- Eval. results on Epoch:\u001b[1;36m5\u001b[0m, Batch:\u001b[1;36m0\u001b[0m -----------------------------\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">{</span>\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"acc#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.6125</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"total#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">320.0</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"correct#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">196.0</span>\n",
       "<span style=\"font-weight: bold\">}</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1m{\u001b[0m\n",
       "  \u001b[1;34m\"acc#acc\"\u001b[0m: \u001b[1;36m0.6125\u001b[0m,\n",
       "  \u001b[1;34m\"total#acc\"\u001b[0m: \u001b[1;36m320.0\u001b[0m,\n",
       "  \u001b[1;34m\"correct#acc\"\u001b[0m: \u001b[1;36m196.0\u001b[0m\n",
       "\u001b[1m}\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">----------------------------- Eval. results on Epoch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">6</span>, Batch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span> -----------------------------\n",
       "</pre>\n"
      ],
      "text/plain": [
       "----------------------------- Eval. results on Epoch:\u001b[1;36m6\u001b[0m, Batch:\u001b[1;36m0\u001b[0m -----------------------------\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">{</span>\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"acc#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.675</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"total#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">320.0</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"correct#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">216.0</span>\n",
       "<span style=\"font-weight: bold\">}</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1m{\u001b[0m\n",
       "  \u001b[1;34m\"acc#acc\"\u001b[0m: \u001b[1;36m0.675\u001b[0m,\n",
       "  \u001b[1;34m\"total#acc\"\u001b[0m: \u001b[1;36m320.0\u001b[0m,\n",
       "  \u001b[1;34m\"correct#acc\"\u001b[0m: \u001b[1;36m216.0\u001b[0m\n",
       "\u001b[1m}\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">----------------------------- Eval. results on Epoch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">7</span>, Batch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span> -----------------------------\n",
       "</pre>\n"
      ],
      "text/plain": [
       "----------------------------- Eval. results on Epoch:\u001b[1;36m7\u001b[0m, Batch:\u001b[1;36m0\u001b[0m -----------------------------\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">{</span>\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"acc#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.64375</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"total#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">320.0</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"correct#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">206.0</span>\n",
       "<span style=\"font-weight: bold\">}</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1m{\u001b[0m\n",
       "  \u001b[1;34m\"acc#acc\"\u001b[0m: \u001b[1;36m0.64375\u001b[0m,\n",
       "  \u001b[1;34m\"total#acc\"\u001b[0m: \u001b[1;36m320.0\u001b[0m,\n",
       "  \u001b[1;34m\"correct#acc\"\u001b[0m: \u001b[1;36m206.0\u001b[0m\n",
       "\u001b[1m}\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">----------------------------- Eval. results on Epoch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>, Batch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span> -----------------------------\n",
       "</pre>\n"
      ],
      "text/plain": [
       "----------------------------- Eval. results on Epoch:\u001b[1;36m8\u001b[0m, Batch:\u001b[1;36m0\u001b[0m -----------------------------\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">{</span>\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"acc#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.665625</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"total#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">320.0</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"correct#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">213.0</span>\n",
       "<span style=\"font-weight: bold\">}</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1m{\u001b[0m\n",
       "  \u001b[1;34m\"acc#acc\"\u001b[0m: \u001b[1;36m0.665625\u001b[0m,\n",
       "  \u001b[1;34m\"total#acc\"\u001b[0m: \u001b[1;36m320.0\u001b[0m,\n",
       "  \u001b[1;34m\"correct#acc\"\u001b[0m: \u001b[1;36m213.0\u001b[0m\n",
       "\u001b[1m}\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">----------------------------- Eval. results on Epoch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">9</span>, Batch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span> -----------------------------\n",
       "</pre>\n"
      ],
      "text/plain": [
       "----------------------------- Eval. results on Epoch:\u001b[1;36m9\u001b[0m, Batch:\u001b[1;36m0\u001b[0m -----------------------------\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">{</span>\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"acc#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.659375</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"total#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">320.0</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"correct#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">211.0</span>\n",
       "<span style=\"font-weight: bold\">}</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1m{\u001b[0m\n",
       "  \u001b[1;34m\"acc#acc\"\u001b[0m: \u001b[1;36m0.659375\u001b[0m,\n",
       "  \u001b[1;34m\"total#acc\"\u001b[0m: \u001b[1;36m320.0\u001b[0m,\n",
       "  \u001b[1;34m\"correct#acc\"\u001b[0m: \u001b[1;36m211.0\u001b[0m\n",
       "\u001b[1m}\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">---------------------------- Eval. results on Epoch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>, Batch:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span> -----------------------------\n",
       "</pre>\n"
      ],
      "text/plain": [
       "---------------------------- Eval. results on Epoch:\u001b[1;36m10\u001b[0m, Batch:\u001b[1;36m0\u001b[0m -----------------------------\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">{</span>\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"acc#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.696875</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"total#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">320.0</span>,\n",
       "  <span style=\"color: #000080; text-decoration-color: #000080; font-weight: bold\">\"correct#acc\"</span>: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">223.0</span>\n",
       "<span style=\"font-weight: bold\">}</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1m{\u001b[0m\n",
       "  \u001b[1;34m\"acc#acc\"\u001b[0m: \u001b[1;36m0.696875\u001b[0m,\n",
       "  \u001b[1;34m\"total#acc\"\u001b[0m: \u001b[1;36m320.0\u001b[0m,\n",
       "  \u001b[1;34m\"correct#acc\"\u001b[0m: \u001b[1;36m223.0\u001b[0m\n",
       "\u001b[1m}\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "trainer.run(num_eval_batch_per_dl=10)"
   ]
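@@ -421,14 +976,55 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "&emsp;"
    "As the results show, it performs far worse than `fine-tuning`. Although `P-Tuning v2` can adapt to models\n",
    "\n",
    "&emsp; with `100M-1B` parameters, **`distilbert-base` has only `66M` parameters**, which falls below that lower bound\n",
    "\n",
    "In addition, **`fastNLP v0.8` does not support multi-GPU training in `jupyter`**, so the author's computer/server cannot train\n",
    "\n",
    "&emsp; suitably sized models, such as the `110M` `bert-base` model or the `340M` `bert-large` model"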
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "{'acc#acc': 0.737385, 'total#acc': 872.0, 'correct#acc': 643.0}"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "trainer.evaluator.run()"
   ]
--- a/tutorials/figures/E2-fig-pet-model.png
+++ b/tutorials/figures/E2-fig-pet-model.png