|
|
|
@@ -2,37 +2,69 @@ |
|
|
|
"cells": [ |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "headed-output", |
|
|
|
"id": "military-possible", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
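|
"# PyTorch BERT Migration Case\n", |
|
|
|
"To convert a PyTorch model into a MindSpore script + weights, first export the PyTorch model to an ONNX model, then use the MindConverter CLI tool to migrate the script + weights.\n", |
|
|
|
"`Linux` `Ascend` `GPU` `CPU` `Model Migration` `Beginner` `Intermediate` `Advanced`\n", |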
|
|
|
"\n", |
|
|
|
"[](https://gitee.com/mindspore/docs/blob/master/docs/migration_guide/source_zh_cn/torch_bert_migration_case_of_mindconverter.ipynb)" |
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "modular-arbitration", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
|
"## Overview" |
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "stupid-british", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
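|
"To convert a PyTorch model into a MindSpore script and weights, first export the PyTorch model to an ONNX model, then use the MindConverter CLI tool to migrate the script and weights.\n", |
|
|
|
"HuggingFace Transformers is a mainstream third-party natural language processing library for PyTorch. This case uses BertForMaskedLM from Transformers as an example to demonstrate the migration process." |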
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "sustained-touch", |
|
|
|
"id": "impossible-nebraska", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
|
"## Environment Preparation\n", |
|
|
|
"\n", |
|
|
|
"This case requires the following third-party Python libraries:\n", |
|
|
|
"```bash\n", |
|
|
|
"pip install torch==1.5.1\n", |
|
|
|
"pip install transformers==4.2.2\n", |
|
|
|
"pip install mindspore==1.2.0\n", |
|
|
|
"pip install mindinsight==1.2.0\n", |
|
|
|
"```" |
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "revolutionary-bench", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
|
"## 1. ONNX Model Export\n", |
|
|
|
"## ONNX Model Export\n", |
|
|
|
"\n", |
|
|
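|
"First instantiate BertForMaskedLM from HuggingFace together with the corresponding tokenizer (the first use needs to download the model weights, vocabulary, model configuration, and other data).\n", |
|
|
|
"First instantiate BertForMaskedLM from HuggingFace together with the corresponding tokenizer (on first use, the model weights, vocabulary, model configuration, and other data need to be downloaded).\n", |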
|
|
|
"\n", |
|
|
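|
"This document does not cover HuggingFace usage in detail. For more information, see the [HuggingFace documentation](https://huggingface.co/transformers/model_doc).\n", |
|
|
|
"This document does not cover HuggingFace usage in detail. For more information, see the [HuggingFace documentation](https://huggingface.co/transformers/model_doc/bert.html).\n", |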
|
|
|
"\n", |
|
|
|
"The model predicts the masked words in a sentence." |
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "code", |
|
|
|
"execution_count": 35, |
|
|
|
"id": "interpreted-trunk", |
|
|
|
"execution_count": 1, |
|
|
|
"id": "heated-millennium", |
|
|
|
"metadata": {}, |
|
|
|
"outputs": [], |
|
|
|
"source": [ |
|
|
|
"import numpy as np\n", |
|
|
|
"import torch\n", |
|
|
|
"from transformers.models.bert import BertForMaskedLM, BertTokenizer\n", |
|
|
|
"\n", |
|
|
|
"tokenizer = BertTokenizer.from_pretrained(\"bert-base-uncased\")\n", |
|
|
|
@@ -41,7 +73,7 @@ |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "bronze-authentication", |
|
|
|
"id": "bacterial-picking", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
|
"We run inference with this model to generate several test cases for verifying the correctness of the model migration.\n", |
|
|
|
@@ -53,8 +85,8 @@ |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "code", |
|
|
|
"execution_count": 37, |
|
|
|
"id": "legendary-seven", |
|
|
|
"execution_count": 2, |
|
|
|
"id": "hawaiian-borough", |
|
|
|
"metadata": {}, |
|
|
|
"outputs": [ |
|
|
|
{ |
|
|
|
@@ -72,6 +104,9 @@ |
|
|
|
} |
|
|
|
], |
|
|
|
"source": [ |
|
|
|
"import numpy as np\n", |
|
|
|
"import torch\n", |
|
|
|
"\n", |
|
|
|
"text = \"china is a poworful country, its capital is [MASK].\"\n", |
|
|
|
"tokenized_sentence = tokenizer(text)\n", |
|
|
|
"\n", |
|
|
|
@@ -100,7 +135,7 @@ |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "opponent-validity", |
|
|
|
"id": "atomic-rebel", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
|
"HuggingFace provides a tool for exporting ONNX models. A pretrained HuggingFace model can be exported to an ONNX model as follows:" |
|
|
|
@@ -108,41 +143,57 @@ |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "code", |
|
|
|
"execution_count": null, |
|
|
|
"id": "ethical-radiation", |
|
|
|
"execution_count": 3, |
|
|
|
"id": "corresponding-vampire", |
|
|
|
"metadata": {}, |
|
|
|
"outputs": [], |
|
|
|
"outputs": [ |
|
|
|
{ |
|
|
|
"name": "stdout", |
|
|
|
"output_type": "stream", |
|
|
|
"text": [ |
|
|
|
"Creating folder exported_bert_base_uncased\n", |
|
|
|
"Using framework PyTorch: 1.5.1+cu101\n", |
|
|
|
"Found input input_ids with shape: {0: 'batch', 1: 'sequence'}\n", |
|
|
|
"Found input token_type_ids with shape: {0: 'batch', 1: 'sequence'}\n", |
|
|
|
"Found input attention_mask with shape: {0: 'batch', 1: 'sequence'}\n", |
|
|
|
"Found output output_0 with shape: {0: 'batch', 1: 'sequence'}\n", |
|
|
|
"Ensuring inputs are in correct order\n", |
|
|
|
"position_ids is not present in the generated input list.\n", |
|
|
|
"Generated inputs order: ['input_ids', 'attention_mask', 'token_type_ids']\n" |
|
|
|
] |
|
|
|
} |
|
|
|
], |
|
|
|
"source": [ |
|
|
|
"from pathlib import Path\n", |
|
|
|
"from transformers.convert_graph_to_onnx import convert\n", |
|
|
|
"\n", |
|
|
|
"\n", |
|
|
|
"# Exported onnx model path.\n", |
|
|
|
"saved_onnx_path = \"./exported_bert_base_uncased/bert_base_uncased.onnx\"\n", |
|
|
|
"convert(\"pt\", model, Path(saved_onnx_path), 11, tokenizer)" |
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "southeast-response", |
|
|
|
"id": "adverse-outline", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
|
"The printed information shows that the exported ONNX model has three input nodes, `input_ids`, `token_type_ids`, and `attention_mask`, together with their input axes,\n", |
|
|
|
"and one output node, `output_0`.\n", |
|
|
|
"\n", |
|
|
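|
"The ONNX model is now exported. Next, verify the accuracy of the exported ONNX model." |
|
|
|
"The ONNX model is now exported. Next, verify the accuracy of the exported ONNX model (when the export runs on an ARM machine, users may need to compile and install PyTorch and the Transformers library themselves)." |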
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "historic-business", |
|
|
|
"id": "paperback-playback", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
|
"## 2. ONNX Model Verification\n" |
|
|
|
"## ONNX Model Verification\n" |
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "naval-virgin", |
|
|
|
"id": "mysterious-courage", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
|
"We again use the sentence from the PyTorch inference, `china is a poworful country, its capital is [MASK].`, as input and check whether the ONNX model behaves as expected." |
|
|
|
@@ -150,8 +201,8 @@ |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "code", |
|
|
|
"execution_count": 39, |
|
|
|
"id": "satisfactory-embassy", |
|
|
|
"execution_count": 4, |
|
|
|
"id": "suitable-channels", |
|
|
|
"metadata": {}, |
|
|
|
"outputs": [ |
|
|
|
{ |
|
|
|
@@ -185,23 +236,23 @@ |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "legal-consensus", |
|
|
|
"id": "essential-pharmacology", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
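|
"As shown, the exported ONNX model behaves exactly like the original PyTorch model. Next, MindConverter can be used to migrate the script + weights!" |
|
|
|
"As shown, the exported ONNX model behaves exactly like the original PyTorch model. Next, MindConverter can be used to migrate the script and weights!" |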
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "adverse-coverage", |
|
|
|
"id": "realistic-singapore", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
|
"## 3. Migrating the Model Script + Weights with MindConverter" |
|
|
|
"## Migrating the Model Script and Weights with MindConverter" |
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "vanilla-nature", |
|
|
|
"id": "invisible-tracker", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
|
"To convert a model, MindConverter requires the model path (`--model_file`), the input nodes (`--input_nodes`), the input node shapes (`--shape`), and the output nodes (`--output_nodes`).\n", |
|
|
|
@@ -211,31 +262,39 @@ |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "code", |
|
|
|
"execution_count": null, |
|
|
|
"id": "metallic-wright", |
|
|
|
"execution_count": 5, |
|
|
|
"id": "processed-spanish", |
|
|
|
"metadata": {}, |
|
|
|
"outputs": [], |
|
|
|
"outputs": [ |
|
|
|
{ |
|
|
|
"name": "stdout", |
|
|
|
"output_type": "stream", |
|
|
|
"text": [ |
|
|
|
"\n", |
|
|
|
"MindConverter: conversion is completed.\n", |
|
|
|
"\n" |
|
|
|
] |
|
|
|
} |
|
|
|
], |
|
|
|
"source": [ |
|
|
|
"!mindconverter --help" |
|
|
|
"!mindconverter --model_file ./exported_bert_base_uncased/bert_base_uncased.onnx --shape 1,128 1,128 1,128 \\\n", |
|
|
|
" --input_nodes input_ids token_type_ids attention_mask \\\n", |
|
|
|
" --output_nodes output_0 \\\n", |
|
|
|
" --output ./converted_bert_base_uncased \\\n", |
|
|
|
" --report ./converted_bert_base_uncased" |
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "code", |
|
|
|
"execution_count": null, |
|
|
|
"id": "horizontal-heater", |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "working-funeral", |
|
|
|
"metadata": {}, |
|
|
|
"outputs": [], |
|
|
|
"source": [ |
|
|
|
"!mindconverter --model_file ./exported_bert_base_uncased/bert_base_uncased.onnx --shape 1,128 1,128 1,128 \\\n", |
|
|
|
" --input_nodes input_ids,token_type_ids,attention_mask\n", |
|
|
|
" --output_nodes output_0\n", |
|
|
|
" --output ./converted_bert_base_uncased\n", |
|
|
|
" --report ./converted_bert_base_uncased" |
|
|
|
"**The message “MindConverter: conversion is completed.” indicates that the model was converted successfully!**" |
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "blind-forty", |
|
|
|
"id": "classical-seminar", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
|
"After the conversion completes, the following files are generated in that directory:\n", |
|
|
|
@@ -249,17 +308,26 @@ |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "code", |
|
|
|
"execution_count": null, |
|
|
|
"id": "blocked-teens", |
|
|
|
"execution_count": 6, |
|
|
|
"id": "equipped-bottom", |
|
|
|
"metadata": {}, |
|
|
|
"outputs": [], |
|
|
|
"outputs": [ |
|
|
|
{ |
|
|
|
"name": "stdout", |
|
|
|
"output_type": "stream", |
|
|
|
"text": [ |
|
|
|
"bert_base_uncased.ckpt\treport_of_bert_base_uncased.txt\r\n", |
|
|
|
"bert_base_uncased.py\tweight_map_of_bert_base_uncased.json\r\n" |
|
|
|
] |
|
|
|
} |
|
|
|
], |
|
|
|
"source": [ |
|
|
|
"!ls ./converted_bert_base_uncased" |
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "improving-difference", |
|
|
|
"id": "fuzzy-thinking", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
|
"All of the files have been generated.\n", |
|
|
|
@@ -269,16 +337,16 @@ |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "dimensional-driver", |
|
|
|
"id": "leading-punch", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
|
"## 4. MindSpore Model Verification\n", |
|
|
|
"## MindSpore Model Verification\n", |
|
|
|
"We again use `china is a poworful country, its capital is [MASK].` as input and check whether the migrated model behaves as expected." |
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "unexpected-permit", |
|
|
|
"id": "competent-dispute", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
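|
"Because the tool freezes the model's input shape during conversion, sentences must be padded to a fixed length when verifying inference with MindSpore. The following function pads a sentence.\n", |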
|
|
|
@@ -288,8 +356,8 @@ |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "code", |
|
|
|
"execution_count": null, |
|
|
|
"id": "going-fields", |
|
|
|
"execution_count": 7, |
|
|
|
"id": "essential-football", |
|
|
|
"metadata": {}, |
|
|
|
"outputs": [], |
|
|
|
"source": [ |
|
|
|
@@ -304,10 +372,18 @@ |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "code", |
|
|
|
"execution_count": null, |
|
|
|
"id": "intense-carrier", |
|
|
|
"execution_count": 8, |
|
|
|
"id": "greatest-louis", |
|
|
|
"metadata": {}, |
|
|
|
"outputs": [], |
|
|
|
"outputs": [ |
|
|
|
{ |
|
|
|
"name": "stdout", |
|
|
|
"output_type": "stream", |
|
|
|
"text": [ |
|
|
|
"ONNX Pred id: 7211\n" |
|
|
|
] |
|
|
|
} |
|
|
|
], |
|
|
|
"source": [ |
|
|
|
"from converted_bert_base_uncased.bert_base_uncased import Model as MsBert\n", |
|
|
|
"from mindspore import load_checkpoint, load_param_into_net, context, Tensor\n", |
|
|
|
@@ -336,50 +412,40 @@ |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "national-norfolk", |
|
|
|
"id": "hybrid-intranet", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
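|
"This completes the script + weights migration with MindConverter.\n", |
|
|
|
"This completes the script and weights migration with MindConverter.\n", |
|
|
|
"\n", |
|
|
|
"Users can write training, inference, and deployment scripts for their own scenarios to implement their business logic." |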
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "capital-joint", |
|
|
|
"id": "minute-sector", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
|
"## 5. Other Issues" |
|
|
|
"## FAQ" |
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "magnetic-collective", |
|
|
|
"id": "favorite-worse", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
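|
"1. How can the batch size and sequence length of the migrated script be modified so that the model supports inference and training on data of any shape?\n", |
|
|
|
"**Q: How can shape specifications of the migrated script, such as the batch size and sequence length, be modified so that the model supports inference and training on data of any shape?**\n", |
|
|
|
"\n", |
|
|
|
"> A: The shape restriction in the migrated script is usually caused by the Reshape operator, or by other operators that change the tensor layout. Taking the BERT migration above as an example, first create two global variables that represent the expected batch size and sequence length, then modify the target shapes of the Reshape operations, replacing them with the corresponding batch size and sequence length global variables.\n", |
|
|
|
"> ./converted_bert_base_uncased/modified_bert_base_uncased.py is the modified script that supports training and inference on data of any shape; it shows the corresponding changes." |
|
|
|
"A: The shape restriction in the migrated script is usually caused by the Reshape operator, or by other operators that change the tensor layout. Taking the BERT migration above as an example, first create two global variables that represent the expected batch size and sequence length, then modify the target shapes of the Reshape operations, replacing them with the corresponding batch size and sequence length global variables." |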
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "proprietary-yugoslavia", |
|
|
|
"id": "failing-smoke", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
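|
"2. The class names defined in the generated script, such as `class Module0(nn.Cell)`, do not follow developer conventions. Does changing them manually affect loading of the converted weights?\n", |
|
|
|
"**Q: The class names defined in the generated script, such as `class Module0(nn.Cell)`, do not follow developer conventions. Does changing them manually affect loading of the converted weights?**\n", |
|
|
|
"\n", |
|
|
|
"> A: Weight loading depends only on the variable names and the class structure, so class names can be changed without affecting weight loading. If the class structure needs to be adjusted, the corresponding weight names must be updated as well to match the structure of the migrated model." |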
|
|
|
] |
|
|
|
}, |
|
|
|
{ |
|
|
|
"cell_type": "markdown", |
|
|
|
"id": "selective-flight", |
|
|
|
"metadata": {}, |
|
|
|
"source": [ |
|
|
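|
"## 6. Other Reference Tutorials\n", |
|
|
|
"1. [MindConverter Advanced Tutorial: Customizing the Generated Code Structure]()" |
|
|
|
"A: Weight loading depends only on the variable names and the class structure, so class names can be changed without affecting weight loading. If the class structure needs to be adjusted, the corresponding weight names must be updated as well to match the structure of the migrated model." |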
|
|
|
] |
|
|
|
} |
|
|
|
], |
|
|
|
@@ -404,4 +470,4 @@ |
|
|
|
}, |
|
|
|
"nbformat": 4, |
|
|
|
"nbformat_minor": 5 |
|
|
|
} |
|
|
|
} |
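
The notebook pads each tokenized sentence to the fixed length chosen at conversion time (`--shape 1,128`), but the padding helper itself falls outside the hunks shown in this diff. A minimal sketch of such a helper follows. It assumes token ids arrive as plain Python lists, that zero is the padding value, and that right-padding is wanted. The name `pad_to_fixed_length` and the sample ids are hypothetical and are not taken from the notebook.

```python
MAX_SEQ_LEN = 128  # matches the 1,128 shape passed to mindconverter via --shape


def pad_to_fixed_length(token_ids, max_len=MAX_SEQ_LEN, pad_value=0):
    """Right-pad a list of integers to exactly max_len entries."""
    if len(token_ids) > max_len:
        raise ValueError(
            f"sequence length {len(token_ids)} exceeds the frozen length {max_len}"
        )
    return list(token_ids) + [pad_value] * (max_len - len(token_ids))


# Illustrative ids only, not real tokenizer output.
input_ids = [101, 2859, 2003, 103, 102]
attention_mask = [1] * len(input_ids)

padded_ids = pad_to_fixed_length(input_ids)
padded_mask = pad_to_fixed_length(attention_mask)
print(len(padded_ids), sum(padded_mask))
```

Padding the attention mask with zeros marks the added positions as padding, so they should not contribute to the prediction at the `[MASK]` position. The same helper would apply to `token_type_ids`.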