MindInsight Profiler is a performance analysis tool for MindSpore. It helps users analyse and optimize the performance of neural networks.
The Profiler enables users to:
To enable profiling on MindSpore, add the MindInsight Profiler APIs to the training script:
Import MindInsight Profiler
from mindinsight.profiler import Profiler
Initialize the Profiler after setting the context and before initializing the network.
Example:
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", device_id=int(os.environ["DEVICE_ID"]))
profiler = Profiler(output_path="./data", is_detail=True, is_show_op_path=False, subgraph='all')
net = Net()
The Profiler accepts the following parameters:
subgraph (str): Which subgraph to monitor and analyse. Can be 'all', 'Default' or 'Gradients'.
is_detail (bool): Whether to show profiling data at the op instance level. If False, only op type level data is shown.
is_show_op_path (bool): Whether to save the full path of each op instance.
output_path (str): Output data path.
optypes_to_deal (list): Names of the op types whose data should be collected and analysed.
All op types are handled if this is left empty.
optypes_not_deal (list): Names of the op types whose data will not be collected or analysed.
Call Profiler.analyse() at the end of the program
Profiler.analyse() will collect profiling data and generate the analysis results.
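The call order described above (create the Profiler once the context is set, build the network afterwards, and call analyse() at the end) can be sketched as a small wrapper. This is an illustrative sketch rather than part of MindInsight: `run_profiled`, `make_profiler`, `build_net` and `train` are hypothetical names, and the `try/finally` is a choice made for this sketch so that analyse() is still attempted if training raises.

```python
from typing import Any, Callable


def run_profiled(
    make_profiler: Callable[[], Any],
    build_net: Callable[[], Any],
    train: Callable[[Any], None],
) -> Any:
    """Run training with the call order described above.

    All three callables are hypothetical hooks supplied by the caller.
    make_profiler would typically return something like
    Profiler(output_path="./data") from mindinsight.profiler.
    """
    # 1. Create the profiler first (the caller has already set the context).
    profiler = make_profiler()
    try:
        # 2. Build the network only after the profiler exists.
        net = build_net()
        # 3. Run the training loop.
        train(net)
    finally:
        # 4. Collect the profiling data and generate the analysis results.
        profiler.analyse()
    return profiler
```

In a real script, `make_profiler` could be `lambda: Profiler(output_path="./data")` and `build_net` could be `Net`, matching the example above.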
After training, open the MindInsight UI to analyse the performance.
Users can access the Performance Profiler by selecting a specific training job from the training list and clicking the performance profiling link.
Figure 1: Overall Performance
Figure 1 displays the overall performance of the training, including the overall data of Step Trace, Operator Performance, MindData Performance and Timeline.
Users can click the detail link to see the details of each component. In addition, MindInsight Profiler tries to analyse the performance data, and the assistant on the left
shows performance tuning suggestions for this training.
The Step Trace component shows the general performance of the stages in the training. Step Trace divides the training into several stages:
Step Gap, Forward/Backward Propagation, All Reduce and Parameter Update. It shows the execution time of each stage and helps to find the bottleneck
stage quickly.
Figure 2: Step Trace Analysis
Figure 2 displays the Step Trace page. The Step Trace detail shows the start and finish time of each stage. By default, it shows the average time over all steps. Users
can also choose a specific step to see its step trace statistics. The graphs at the bottom of the page show how the execution time of Step Gap, Forward/Backward Propagation and
Step Tail changes from step to step, which helps to decide whether the performance of some stages can be optimized.
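The per-stage averaging described above can be sketched in a few lines. This is a minimal illustration, not the Profiler's own code: the stage keys and the millisecond values are hypothetical sample data, not real profiler output.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical per-step stage durations in milliseconds.
steps: List[Dict[str, float]] = [
    {"step_gap": 4.0, "fp_and_bp": 20.0, "step_tail": 6.0},
    {"step_gap": 8.0, "fp_and_bp": 22.0, "step_tail": 6.0},
    {"step_gap": 6.0, "fp_and_bp": 21.0, "step_tail": 9.0},
]


def average_stage_times(step_records: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each stage's duration over all recorded steps."""
    stages = step_records[0].keys()
    return {stage: mean(rec[stage] for rec in step_records) for stage in stages}


def bottleneck_stage(averages: Dict[str, float]) -> str:
    """Return the stage with the largest average duration."""
    return max(averages, key=averages.get)


averages = average_stage_times(steps)
print(averages)
print("bottleneck:", bottleneck_stage(averages))
```

Comparing a single step against these averages is one way to spot the steps where a stage such as Step Gap takes unusually long.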
Notice: MindSpore chooses the Forward Start and Backward End operators automatically, and the names of the two operators are shown on the page. The Profiler does not guarantee that the chosen operators
always match the user's expectation. Users can choose the two operators according to the execution graph and specify them manually by setting the FP_POINT and BP_POINT environment variables.
For example: export FP_POINT=fp32_vars/conv2d/conv2Dfp32_vars/BatchNorm/FusedBatchNorm_Reduce and export BP_POINT=loss_scale/gradients/AddN_70.
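As an alternative to the shell `export` commands, the two variables can also be set from Python for the current process. This is a sketch under an assumption: it presumes that setting the variables in-process before the Profiler is created has the same effect as exporting them in the shell, which the text above does not confirm. `set_step_trace_points` is a hypothetical helper name.

```python
import os

# Operator names copied from the example above; replace them with names
# taken from your own execution graph.
FP_POINT = "fp32_vars/conv2d/conv2Dfp32_vars/BatchNorm/FusedBatchNorm_Reduce"
BP_POINT = "loss_scale/gradients/AddN_70"


def set_step_trace_points(fp_point: str, bp_point: str) -> None:
    """Set FP_POINT and BP_POINT for this process (hypothetical helper)."""
    os.environ["FP_POINT"] = fp_point
    os.environ["BP_POINT"] = bp_point


# Assumption: call this before the Profiler is initialized.
set_step_trace_points(FP_POINT, BP_POINT)
```

Values written to `os.environ` are visible to the current process and to child processes started afterwards, but not to the parent shell.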
The operator performance analysis component displays the execution time of the operators during a MindSpore run.
Figure 3: Statistics for Operator Types
Figure 3 displays the statistics for the operator types, including:
Figure 4: Statistics for Operators
Figure 4 displays the statistics table for the operators, including:
The MindData performance analysis component is used to analyse the execution of the data input pipeline for the training. The data input pipeline can be divided into three stages:
the data process pipeline, data transfer from host to device, and data fetch on device. The component analyses the performance of each stage in detail and displays the results.
Figure 5: MindData Performance Analysis
Figure 5 displays the MindData performance analysis page. It consists of two tabs: the step gap and the data process.
The step gap page is used to analyse whether there is a performance bottleneck in any of the three stages. Conclusions can be drawn from the data queue graphs:
Figure 6: Data Process Pipeline Analysis
Figure 6 displays the data process pipeline analysis page. Data queues are used to exchange data between the MindData operators. The amount of data in a queue reflects
how fast the operators consume data, and can be used to infer the bottleneck operator. The queue usage percentage is the average data size in the queue divided by the maximum queue size: the higher
the usage percentage, the more data has accumulated in the queue. The graph at the bottom of the page shows the MindData pipeline operators together with the data queues. Users can click a queue to see how
its data size changes over time and which operators are connected to it. The data process pipeline can be analysed as follows:
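The queue usage percentage defined in the paragraph above (average data size in the queue divided by the maximum queue size) can be written out as a short function. This is a sketch of the formula only: `queue_usage_percentage` is a hypothetical name and the sample values are made up.

```python
from typing import Sequence


def queue_usage_percentage(sizes: Sequence[int], max_size: int) -> float:
    """Average sampled queue size as a percentage of the queue capacity."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    if not sizes:
        raise ValueError("sizes must contain at least one sample")
    average_size = sum(sizes) / len(sizes)
    return 100.0 * average_size / max_size


# Hypothetical samples of one queue's size, taken at different times.
samples = [2, 4, 6, 8]
usage = queue_usage_percentage(samples, max_size=16)
print(f"queue usage: {usage:.2f}%")
```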
To optimize the performance of MindData operators, consider the following suggestions:
If a Dataset operator is the bottleneck, try increasing num_parallel_workers.
If a GeneratorOp type operator is the bottleneck, try increasing num_parallel_workers and replacing the operator with MindRecordDataset.
If a MapOp type operator is the bottleneck, try increasing num_parallel_workers; if it is a Python operator, try optimizing the training script.
If a BatchOp type operator is the bottleneck, try adjusting prefetch_size.
The Timeline component can display:
How to view the timeline:
To view the detailed timeline information, click the "Download" button to save the timeline file locally, then open it in a trace viewer.
We recommend the Chrome tracing tool (chrome://tracing) or Perfetto (https://ui.perfetto.dev/#!viewer).
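The downloaded file can also be inspected with a script. The sketch below assumes the file is JSON in the Chrome Trace Event format (the format read by chrome://tracing), with complete events marked by "ph": "X" and a "dur" field. The function name `total_duration_by_name` and the sample events are hypothetical, and the real file may contain other event types or fields.

```python
import json
from collections import defaultdict
from typing import Dict


def total_duration_by_name(trace_text: str) -> Dict[str, float]:
    """Sum the "dur" of complete ("X") events, grouped by event name."""
    data = json.loads(trace_text)
    # The Trace Event format allows either a bare array of events or an
    # object that stores the array under "traceEvents".
    events = data["traceEvents"] if isinstance(data, dict) else data
    totals: Dict[str, float] = defaultdict(float)
    for event in events:
        if event.get("ph") == "X" and "dur" in event:
            totals[event["name"]] += float(event["dur"])
    return dict(totals)


# Hypothetical sample standing in for the contents of a downloaded file.
sample = json.dumps([
    {"name": "Conv2D", "ph": "X", "ts": 0, "dur": 120, "pid": 0, "tid": 0},
    {"name": "MatMul", "ph": "X", "ts": 130, "dur": 80, "pid": 0, "tid": 0},
    {"name": "Conv2D", "ph": "X", "ts": 220, "dur": 100, "pid": 0, "tid": 0},
])
totals = total_duration_by_name(sample)
print(totals)
```

For a real file, read its contents with `open(path, encoding="utf-8").read()` and pass the text to the function.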
Users can get the most detailed information from the Timeline:
Figure 7: Timeline Analysis
The Timeline consists of the following parts:
In the timeline viewer, the W and S keys zoom in and out of the timeline graph, and the A and D keys pan it left and right.
The Profiler currently has the following limitations:
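MindInsight provides easy-to-use tuning and debugging capabilities for MindSpore. During training, data such as scalars, tensors, images, computational graphs, model hyperparameters and training time can be recorded to files, then viewed and analysed on the MindInsight visualization pages.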