# MindConverter tutorial

[View Chinese version](./README_CN.md)

<!-- TOC -->

- [MindConverter tutorial](#mindconverter-tutorial)
    - [Overview](#overview)
    - [Installation](#installation)
    - [Usage](#usage)
        - [PyTorch Model Scripts Migration](#pytorch-model-scripts-migration)
        - [TensorFlow Model Scripts Migration](#tensorflow-model-scripts-migration)
        - [ONNX Model File Migration](#onnx-model-file-migration)
    - [Scenario](#scenario)
    - [Example](#example)
        - [AST-Based Conversion](#ast-based-conversion)
        - [Graph-Based Conversion](#graph-based-conversion)
            - [TensorFlow Model Scripts Conversion](#tensorflow-model-scripts-conversion)
            - [ONNX Model File Conversion](#onnx-model-file-conversion)
    - [Caution](#caution)
    - [Unsupported situation of AST mode](#unsupported-situation-of-ast-mode)
        - [Situation1](#situation1)
        - [Situation2](#situation2)
    - [Frequently asked questions](#frequently-asked-questions)
    - [Appendix](#appendix)
        - [TensorFlow Pb Model Exporting](#tensorflow-pb-model-exporting)
        - [MindConverter Error Code Definition](#mindconverter-error-code-definition)

<!-- /TOC -->
## Overview

MindConverter is a migration tool that transforms model scripts and weights from PyTorch, TensorFlow or ONNX to MindSpore. Users can migrate their PyTorch, TensorFlow or ONNX models to MindSpore rapidly with only minor changes, guided by the conversion report.

## Installation

MindConverter is a submodule of MindInsight. Please follow the [Guide](https://www.mindspore.cn/install/en) to install MindInsight.

### Third-party Requirements

Besides a **TensorFlow** installation that satisfies the model loading, inference and training requirements, users of MindConverter also need to install the following third-party packages (tf2onnx is not required for users who convert an ONNX model definition file to MindSpore):

```text
onnx>=1.8.0
tf2onnx>=1.7.1
onnxruntime>=1.5.2
onnxoptimizer>=0.1.2
```

For some models, if an onnx or tf2onnx error message appears during the conversion process, please try to upgrade onnx, tf2onnx or onnxoptimizer in the environment to the latest version.
## Usage

MindConverter currently provides a command-line interface only. Here is the manual page.

```bash
usage: mindconverter [-h] [--version] [--in_file IN_FILE]
                     [--model_file MODEL_FILE] [--shape SHAPE [SHAPE ...]]
                     [--input_nodes INPUT_NODES [INPUT_NODES ...]]
                     [--output_nodes OUTPUT_NODES [OUTPUT_NODES ...]]
                     [--output OUTPUT] [--report REPORT]

optional arguments:
  -h, --help            show this help message and exit
  --version             show program version number and exit
  --in_file IN_FILE     Specify path for script file to use AST schema to do
                        script conversion.
  --model_file MODEL_FILE
                        TensorFlow (.pb) or ONNX (.onnx) model file path is
                        expected to do script generation based on graph
                        schema. When `--in_file` and `--model_file` are both
                        provided, use AST schema as default.
  --shape SHAPE [SHAPE ...]
                        Expected input tensor shape of `--model_file`. It is
                        required when using graph based schema. Both order and
                        number should be consistent with `--input_nodes`.
                        Given that (1,128) and (1,512) are shapes of input_1
                        and input_2 separately. Usage: --shape 1,128 1,512
  --input_nodes INPUT_NODES [INPUT_NODES ...]
                        Input node(s) name of `--model_file`. It is required
                        when using graph based schema. Both order and number
                        should be consistent with `--shape`. Given that both
                        input_1 and input_2 are inputs of model. Usage:
                        --input_nodes input_1 input_2
  --output_nodes OUTPUT_NODES [OUTPUT_NODES ...]
                        Output node(s) name of `--model_file`. It is required
                        when using graph based schema. Given that both
                        output_1 and output_2 are outputs of model. Usage:
                        --output_nodes output_1 output_2
  --output OUTPUT       Optional, specify path for converted script file
                        directory. Default output directory is `output` folder
                        in the current working directory.
  --report REPORT       Optional, specify report directory. Default is
                        converted script directory.
```
### PyTorch Model Scripts Migration

#### MindConverter Provides AST for PyTorch

**Abstract Syntax Tree (AST) based conversion**: using the argument `--in_file` enables the AST mode.

> The AST mode will be enabled if both `--in_file` and `--model_file` are specified.

`--output` and `--report` are optional. If they are not specified, MindConverter creates an `output` folder under the current working directory and outputs the generated scripts to it.

> When computational-graph-based conversion is required, it is recommended to convert the PyTorch model script to an ONNX file first and then migrate the ONNX file; the tutorial is the [PyTorch instruction](https://pytorch.org/docs/stable/onnx.html).
### TensorFlow Model Scripts Migration

**MindConverter provides computational-graph-based conversion for TensorFlow**: the transformation will be done given `--model_file`, `--shape`, `--input_nodes` and `--output_nodes`.

> The AST mode is not supported for TensorFlow; only the computational-graph-based mode is available.

If neither `--output` nor `--report` is set, MindConverter creates an `output` folder under the current working directory and outputs the generated scripts, converted checkpoint file, weight map file and conversion reports to it.

### ONNX Model File Migration

**MindConverter provides computational-graph-based conversion for ONNX**: the transformation will be done given `--model_file`, `--shape`, `--input_nodes` and `--output_nodes`.

> The AST mode is not supported for ONNX; only the computational-graph-based mode is available.

If neither `--output` nor `--report` is set, MindConverter creates an `output` folder under the current working directory and outputs the generated scripts, converted checkpoint file, weight map file and conversion reports to it.
## Scenario

MindConverter provides two modes for different migration demands:

1. Keep the original scripts' structures, including variables, functions, and libraries.
2. Require as few extra modifications as possible, or none at all, after conversion.

The AST mode is recommended for the first demand (the AST mode is only supported for PyTorch). It parses and analyzes PyTorch scripts, then replaces them with the MindSpore AST to generate code. In theory, the AST mode supports any model script; in practice, the conversion result may vary with the coding style of the original scripts.

For the second demand, the Graph mode is recommended. Since the computational graph is a standard descriptive language, it is not affected by the user's coding style. This mode can convert more operators, as long as those operators are supported by MindConverter.

Some typical image classification networks have been tested for the Graph mode. Note that:

> 1. The Dropout operator will be lost after conversion because the inference mode is used to load the ONNX or TensorFlow model. Manual re-implementation is necessary.
> 2. The graph-based mode will be continuously developed and optimized with further updates.

[Supported models list (models in the table below have been tested based on PyTorch 1.5.0 and TensorFlow 1.15.0, X86 Ubuntu released version)](./docs/supported_model_list.md).
## Example

### AST-Based Conversion

Assume the PyTorch script is located at `/home/user/model.py`, that the transformed MindSpore script should be output to `/home/user/output`, and that the conversion report should go to `/home/user/output/report`. Use the following command:

```bash
mindconverter --in_file /home/user/model.py \
              --output /home/user/output \
              --report /home/user/output/report
```

In the conversion report, non-transformed code is listed as follows:

```text
line <row>:<col> [UnConvert] 'operator' didn't convert. ...
```

For non-transformed operators, the original code is kept. Please migrate them manually. [Click here](https://www.mindspore.cn/doc/note/en/master/index.html#operator_api) for more information about operator mapping.

Here is an example of the conversion report:
```text
[Start Convert]
[Insert] 'import mindspore.ops.operations as P' is inserted to the converted file.
line 1:0: [Convert] 'import torch' is converted to 'import mindspore'.
...
line 157:23: [UnConvert] 'nn.AdaptiveAvgPool2d' didn't convert. Maybe could convert to mindspore.ops.operations.ReduceMean.
...
[Convert Over]
```

For non-transformed operators, suggestions are provided in the report. For instance, MindConverter suggests replacing `torch.nn.AdaptiveAvgPool2d` with `mindspore.ops.operations.ReduceMean`.
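Because each unconverted operator follows the fixed `line <row>:<col> [UnConvert] ...` format, the report can also be scanned programmatically to collect the remaining manual work. The helper below is a small illustrative sketch (not part of MindConverter itself), using the sample report text above:

```python
import re

# Matches lines like:
# "line 157:23: [UnConvert] 'nn.AdaptiveAvgPool2d' didn't convert. ..."
UNCONVERT_PATTERN = re.compile(r"line (\d+):(\d+):? \[UnConvert\] '([^']+)'")

def find_unconverted(report_text):
    """Return (row, col, operator) tuples for every unconverted operator."""
    return [(int(row), int(col), op)
            for row, col, op in UNCONVERT_PATTERN.findall(report_text)]

sample = (
    "[Start Convert]\n"
    "line 1:0: [Convert] 'import torch' is converted to 'import mindspore'.\n"
    "line 157:23: [UnConvert] 'nn.AdaptiveAvgPool2d' didn't convert. "
    "Maybe could convert to mindspore.ops.operations.ReduceMean.\n"
    "[Convert Over]"
)
print(find_unconverted(sample))  # → [(157, 23, 'nn.AdaptiveAvgPool2d')]
```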
### Graph-Based Conversion

#### TensorFlow Model Scripts Conversion

To use TensorFlow model script migration, users need to export the TensorFlow model to the pb format first and obtain the model's input node and output node names. For exporting a pb model, please refer to [TensorFlow Pb model exporting](#tensorflow-pb-model-exporting).

Suppose the input node name is `input_1:0`, the output node name is `predictions/Softmax:0`, the input shape of the model is `1,224,224,3` and the original TensorFlow model is at `/home/user/xxx/frozen_model.pb`. To output the transformed MindSpore script and MindSpore checkpoint file to `/home/user/output`, with the conversion report and weight map file in `/home/user/output/report`, use the following command:

```bash
mindconverter --model_file /home/user/xxx/frozen_model.pb --shape 1,224,224,3 \
              --input_nodes input_1:0 \
              --output_nodes predictions/Softmax:0 \
              --output /home/user/output \
              --report /home/user/output/report
```

After execution, the MindSpore script, MindSpore checkpoint file, weight map file and report file can be found in the corresponding directories.

Since the graph-based scheme is a generative method, the original TensorFlow script is not referenced in the conversion process. Therefore, the code line and column numbers in the generated conversion report refer to the generated script.

In addition, the input and output tensor shapes of unconverted operators are shown explicitly (`input_shape` and `output_shape`) as comments in the converted scripts to help further manual modification. Here is an example of the `Reshape` operator (not supported in the current version):
```python
class Classifier(nn.Cell):
    def __init__(self):
        super(Classifier, self).__init__()
        ...
        self.reshape = onnx.Reshape(input_shape=(1, 1280, 1, 1),
                                    output_shape=(1, 1280))
        ...

    def construct(self, x):
        ...
        # Suppose input of `reshape` is x.
        reshape_output = self.reshape(x)
        ...
```
It is convenient to replace the operator according to the `input_shape` and `output_shape` parameters. The replacement looks like this:

```python
from mindspore.ops import operations as P
...

class Classifier(nn.Cell):
    def __init__(self):
        super(Classifier, self).__init__()
        ...
        self.reshape = P.Reshape(input_shape=(1, 1280, 1, 1),
                                 output_shape=(1, 1280))
        ...

    def construct(self, x):
        ...
        # Suppose input of `reshape` is x.
        reshape_output = self.reshape(x, (1, 1280))
        ...
```
> `--output` and `--report` are optional. If they are not specified, MindConverter creates an `output` folder under the current working directory and outputs the generated scripts, MindSpore checkpoint file, weight map file and conversion reports to it.

Here is an example of the weight map:

```json
{
    "resnet50": [
        {
            "converted_weight": {
                "name": "conv2d_0.weight",
                "shape": [
                    64,
                    3,
                    7,
                    7
                ],
                "data_type": "Float32"
            },
            "source_weight": {
                "name": "conv1.weight",
                "shape": [
                    64,
                    3,
                    7,
                    7
                ],
                "data_type": "float32"
            }
        }
    ]
}
```

Weight information in MindSpore (`converted_weight`) and in the source framework (`source_weight`) is saved separately in the weight map.
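Since the weight map is plain JSON, it can be loaded with the standard library, for example to build a source-to-MindSpore name lookup for further tooling. The sketch below assumes the structure shown above (the helper name and truncated sample data are illustrative only):

```python
import json

# Weight map text with the structure shown above, truncated to one entry.
weight_map_text = """
{
    "resnet50": [
        {
            "converted_weight": {"name": "conv2d_0.weight",
                                 "shape": [64, 3, 7, 7],
                                 "data_type": "Float32"},
            "source_weight": {"name": "conv1.weight",
                              "shape": [64, 3, 7, 7],
                              "data_type": "float32"}
        }
    ]
}
"""

def name_mapping(weight_map):
    """Map each source-framework weight name to its MindSpore name."""
    return {entry["source_weight"]["name"]: entry["converted_weight"]["name"]
            for entries in weight_map.values() for entry in entries}

mapping = name_mapping(json.loads(weight_map_text))
print(mapping)  # → {'conv1.weight': 'conv2d_0.weight'}
```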
#### ONNX Model File Conversion

To use ONNX model file migration, users need to obtain the model's input node and output node names from the ONNX model. [Netron](https://github.com/lutzroeder/netron) is recommended for finding these names.

Suppose the model is saved to `/home/user/xxx/model.onnx`, the corresponding input node name is `input_1:0`, the output node name is `predictions/Softmax:0`, and the input shape of the model is `1,3,224,224`. The following command can be used to generate the script:

```bash
mindconverter --model_file /home/user/xxx/model.onnx --shape 1,3,224,224 \
              --input_nodes input_1:0 \
              --output_nodes predictions/Softmax:0 \
              --output /home/user/output \
              --report /home/user/output/report
```

After execution, the MindSpore script, MindSpore checkpoint file, weight map file and report file can be found in the corresponding directories.

Since the graph-based scheme is a generative method, the original ONNX model is not referenced in the conversion process. Therefore, the code line and column numbers in the generated conversion report refer to the generated script.

In addition, for operators that are not converted successfully, the input and output tensor shapes of the node are annotated in the code by `input_shape` and `output_shape`. For an example, please refer to the **TensorFlow Model Scripts Conversion** section.
## Caution

1. TensorFlow is not an explicitly stated dependency of MindInsight. Graph-based conversion requires the same TensorFlow version as the one used to train the model.
2. This script conversion tool relies on operators supported by both ONNX and MindSpore. Unsupported operators may not be successfully mapped to MindSpore operators. You can edit them manually, or implement the mapping based on MindConverter and contribute it to our MindInsight repository. We appreciate your support for the MindSpore community.
3. When the graph-based scheme is used, MindConverter converts a dynamic input shape to a constant one based on `--shape`. As a result, the input shape used for retraining or inference in MindSpore must be the same as the one used during conversion. If the input shape changes, rerun MindConverter with the new `--shape`, or manually fix the shape-related parameters in the old script.
4. The MindSpore script and MindSpore checkpoint file are saved in the same directory, while the report file and weight map are saved together in another.
5. The security and consistency of the model file should be guaranteed by the user.
## Unsupported situation of AST mode

### Situation1

Classes and functions that can't be converted:

1. The use of the `.shape`, `.ndim` and `.dtype` members of `torch.Tensor`.
2. `torch.nn.AdaptiveXXXPoolXd` and `torch.nn.functional.adaptive_XXX_poolXd()`.
3. `torch.nn.functional.Dropout`.
4. `torch.unsqueeze()` and `torch.Tensor.unsqueeze()`.
5. `torch.chunk()` and `torch.Tensor.chunk()`.

### Situation2

Subclassing from the subclasses of `nn.Module`, e.g. (code snippet from torchvision.models.mobilenet):
```python
from torch import nn

class ConvBNReLU(nn.Sequential):
    def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1):
        padding = (kernel_size - 1) // 2
        super(ConvBNReLU, self).__init__(
            nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding, groups=groups, bias=False),
            nn.BatchNorm2d(out_planes),
            nn.ReLU6(inplace=True)
        )
```
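One possible workaround is to rewrite such a class as a direct `nn.Module` subclass with an explicit `forward`, so the script no longer subclasses a subclass of `nn.Module`. The refactoring below is a minimal illustrative sketch, not an official MindConverter recipe:

```python
from torch import nn

class ConvBNReLU(nn.Module):
    """Behaviorally equivalent block that subclasses nn.Module directly."""

    def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1):
        super(ConvBNReLU, self).__init__()
        padding = (kernel_size - 1) // 2
        self.conv = nn.Conv2d(in_planes, out_planes, kernel_size, stride,
                              padding, groups=groups, bias=False)
        self.bn = nn.BatchNorm2d(out_planes)
        self.relu = nn.ReLU6(inplace=True)

    def forward(self, x):
        # Same Conv -> BN -> ReLU6 pipeline the nn.Sequential version ran.
        return self.relu(self.bn(self.conv(x)))
```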
## Frequently asked questions

Q1. `terminate called after throwing an instance of 'std::system_error', what(): Resource temporarily unavailable, Aborted (core dumped)`:

> Answer: This problem is caused by TensorFlow. The first step of the conversion process is loading the TensorFlow model into memory using the TensorFlow module, at which point TensorFlow starts to apply for the resources it needs. When a required resource is unavailable, for example because the maximum process number of the Linux system limit is exceeded, TensorFlow will raise an error from its C/C++ layer. For more detail, please refer to the TensorFlow official repository. Some known issues, for reference only:
> [TF ISSUE 14885](https://github.com/tensorflow/tensorflow/issues/14885), [TF ISSUE 37449](https://github.com/tensorflow/tensorflow/issues/37449)

Q2. Can MindConverter run on an ARM platform?

> Answer: MindConverter supports both x86 and ARM platforms. Please ensure all required dependencies and environments are installed on the ARM platform.

Q3. Why does the conversion process take a lot of time (more than 10 minutes), even though the model is not that large?

> Answer: When converting, MindConverter needs to use protobuf to deserialize the model file. Please make sure that the protobuf installed in the Python environment is implemented by the C++ backend. The validation method is as follows. If the output is "python", you need to install the Python protobuf implemented by C++ (download the protobuf source code, enter the "python" subdirectory in the source code, and use `python setup.py install --cpp_implementation` to install it). If the output is "cpp" and the conversion process still takes a long time, please add the environment variable `export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp` before conversion.

```python
from google.protobuf.internal import api_implementation
print(api_implementation.Type())
```
Q4. While converting a .pb file to a MindSpore script, what may cause error code 1000001, even though `model_file`, `shape`, `input_nodes` and `output_nodes` are set correctly and the third-party requirements are installed correctly?

> Answer: Make sure that the TensorFlow version used to generate the .pb file is no higher than the version used to convert it, to avoid the conflict caused by parsing a .pb file generated by a higher-version TensorFlow with a lower-version one.

Q5. What should I do about the exception `[ERROR] MINDCONVERTER: [BaseConverterError] code: 0000000, msg: {python_home}/lib/libgomp.so.1: cannot allocate memory in static TLS block`?

> Answer: In most cases, this problem is caused by an incorrectly exported environment variable. Please set `export LD_PRELOAD={python_home}/lib/libgomp.so.1.0.0`, then try to rerun MindConverter.
## Appendix

### TensorFlow Pb model exporting

If the model is built with the Keras API, users can refer to this [tutorial](./docs/tensorflow_model_exporting.md).

### MindConverter Error Code Definition

For the error codes defined in MindConverter, please refer to this [LINK](./docs/error_code_definition.md).