|
|
|
@@ -65,8 +65,8 @@ Figure 2 displays the Step Trace page. The Step Trace detail will show the start |
|
|
|
can also choose a specific step to see its step trace statistics. The graphs at the bottom of the page show how the execution time of Step Gap, Forward/Backward Propagation and |
|
|
|
Step Tail changes from step to step, which helps decide whether the performance of some stages can be optimized. |
|
|
|
|
|
|
|
*Notice:* MindSpore chooses the Forward Start/Backward End Operators automatically. The names of the two operators are shown on the page. It is possible that the two operators are |
|
|
|
not chosen as the user expects. Users can choose the operators from the dumped execution graph, and specify the two operators manually by setting the `FP_POINT` and `BP_POINT` environment variables. |
|
|
|
*Notice:* MindSpore chooses the Forward Start/Backward End Operators automatically. The names of the two operators are shown on the page. The Profiler does not guarantee that the two operators are |
|
|
|
always chosen as the user expects. Users can choose the two operators according to the execution graph, and specify them manually by setting the `FP_POINT` and `BP_POINT` environment variables. |
|
|
|
For example: `export FP_POINT=fp32_vars/conv2d/conv2Dfp32_vars/BatchNorm/FusedBatchNorm_Reduce` and `export BP_POINT=loss_scale/gradients/AddN_70`. |
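If the training job is launched from Python rather than from a shell, the same two variables can be set in the process environment. The following is only a minimal sketch: it assumes the variables are read when profiling starts, so it should run before the Profiler is created, and `set_step_trace_points` is a hypothetical helper, not part of MindSpore. The operator names are the ones from the example above and must be replaced with names from your own execution graph.

```python
import os

# Operator names copied from the example above; replace them with
# names taken from your own execution graph.
FP_POINT = "fp32_vars/conv2d/conv2Dfp32_vars/BatchNorm/FusedBatchNorm_Reduce"
BP_POINT = "loss_scale/gradients/AddN_70"


def set_step_trace_points(fp_point, bp_point, environ=os.environ):
    """Hypothetical helper: store both operator names in the environment.

    Assumption: the variables must be set before profiling starts.
    """
    for name, value in (("FP_POINT", fp_point), ("BP_POINT", bp_point)):
        # Reject empty names and names with stray surrounding whitespace.
        if not value or value != value.strip():
            raise ValueError(f"{name} must be a non-empty operator name")
        environ[name] = value
    return {"FP_POINT": environ["FP_POINT"], "BP_POINT": environ["BP_POINT"]}


points = set_step_trace_points(FP_POINT, BP_POINT)
```

Setting both variables in one call avoids a run where only one of the two points is overridden and the other is still chosen automatically.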
|
|
|
|
|
|
|
### Operator Performance Analysis |
|
|
|
@@ -160,3 +160,5 @@ The Profiler has the following limitations now: |
|
|
|
|
|
|
|
* Only programs running on Ascend chips are supported. |
|
|
|
* To limit the amount of data generated by the Profiler, MindInsight suggests profiling fewer than 10 steps for large neural networks. |
|
|
|
* Parsing Timeline data is time-consuming, and data from a few steps is usually enough for analysis. To speed up data parsing and UI |
|
|
|
display, the Profiler shows at most 20M of data (containing information on 10+ steps for large networks). |
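The step-count suggestion above can be sketched as a small selection helper. This is an illustration only: `choose_profiled_steps` and its parameters are hypothetical names, and how the selected steps are passed to a training script depends on that script, not on any Profiler API shown here.

```python
def choose_profiled_steps(total_steps, limit=10, first_step=1):
    """Return the step numbers to profile, keeping the count below `limit`.

    Hypothetical helper; `limit=10` mirrors the suggestion above for
    large networks.
    """
    if total_steps < 0 or limit < 1:
        raise ValueError("total_steps must be >= 0 and limit must be >= 1")
    # Strictly fewer than `limit` steps, and never more than exist.
    count = min(total_steps, limit - 1)
    return list(range(first_step, first_step + count))


steps = choose_profiled_steps(total_steps=1000)
```

With the defaults, a 1000-step run yields nine consecutive steps starting at step 1. A later `first_step` can be passed if the first steps are not representative of steady-state performance.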