From: @zlq2020
Reviewed-by: @guoqi1024
Signed-off-by: @guoqi1024
tags/v1.2.0-rc1
| @@ -20,7 +20,6 @@ | |||
| # [MobileNetV2 Description](#contents) | |||
| MobileNetV2 is tuned to mobile phone CPUs through a combination of hardware-aware network architecture search (NAS) complemented by the NetAdapt algorithm and then subsequently improved through novel architecture advances. | |||
| [Paper](https://arxiv.org/pdf/1905.02244) Howard, Andrew, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang et al. "Searching for MobileNetV3." In Proceedings of the IEEE International Conference on Computer Vision, pp. 1314-1324. 2019. | |||
| @@ -38,11 +37,10 @@ The overall network architecture of MobileNetV2 is show below: | |||
| Dataset used: [imagenet](http://www.image-net.org/) | |||
| - Dataset size: ~125G, about 1.3 million color images in 1000 classes | |||
| - Train: 120G, about 1.28 million images | |||
| - Test: 5G, 50000 images | |||
| - Train: 120G, about 1.28 million images | |||
| - Test: 5G, 50000 images | |||
| - Data format: RGB images. | |||
| - Note: Data will be processed in src/dataset.py | |||
| - Note: Data will be processed in src/dataset.py | |||
| # [Features](#contents) | |||
| @@ -54,14 +52,13 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil | |||
| # [Environment Requirements](#contents) | |||
| - Hardware: Ascend | |||
| - Prepare hardware environment with Ascend. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. | |||
| - Prepare hardware environment with Ascend. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. | |||
| - Framework | |||
| - [MindSpore](https://www.mindspore.cn/install/en) | |||
| - For more information, please check the resources below | |||
| - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) | |||
| - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) | |||
| # [Script description](#contents) | |||
| ## [Script and sample code](#contents) | |||
| @@ -84,7 +81,6 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil | |||
| ├── export.py # export checkpoint files into air/onnx | |||
| ``` | |||
| ## [Script Parameters](#contents) | |||
| Parameters for both training and evaluation can be set in config.py | |||
| @@ -113,13 +109,11 @@ Parameters for both training and evaluation can be set in config.py | |||
| ### Usage | |||
| You can start training using Python or shell scripts. The shell scripts are used as follows: | |||
| - bash run_train.sh [Ascend] [RANK_TABLE_FILE] [DATASET_PATH] [PRETRAINED_CKPT_PATH]\(optional) | |||
| - bash run_train.sh [GPU] [DEVICE_ID_LIST] [DATASET_PATH] [PRETRAINED_CKPT_PATH]\(optional) | |||
| ### Launch | |||
| ``` bash | |||
| @@ -133,7 +127,7 @@ You can start training using python or shell scripts. The usage of shell scripts | |||
| Training results are stored in the example path. Checkpoints trained on `Ascend` are stored at `./train/device$i/checkpoint` by default, and the training log is redirected to `./train/device$i/train.log`. Checkpoints trained on `GPU` are stored in `./train/checkpointckpt_$i` by default, and the training log is redirected to `./train/train.log`. | |||
| `train.log` is as follows: | |||
| ``` | |||
| ``` bash | |||
| epoch: [ 0/200], step:[ 624/ 625], loss:[5.258/5.258], time:[140412.236], lr:[0.100] | |||
| epoch time: 140522.500, per step time: 224.836, avg loss: 5.258 | |||
| epoch: [ 1/200], step:[ 624/ 625], loss:[3.917/3.917], time:[138221.250], lr:[0.200] | |||
| @@ -150,7 +144,7 @@ You can start training using python or shell scripts. The usage of shell scripts | |||
| ### Launch | |||
| ``` | |||
| ``` bash | |||
| # infer example | |||
| shell: | |||
| Ascend: sh run_infer_quant.sh Ascend ~/imagenet/val/ ~/train/mobilenet-60_1601.ckpt | |||
| @@ -160,9 +154,9 @@ You can start training using python or shell scripts. The usage of shell scripts | |||
| ### Result | |||
| Inference result will be stored in the example path, you can find result like the followings in `./val/infer.log`. | |||
| Inference results are stored in the example path; you can find a result like the following in `./val/infer.log`. | |||
| ``` | |||
| ``` bash | |||
| result: {'acc': 0.71976314102564111} | |||
| ``` | |||
| @@ -35,7 +35,7 @@ def parse_args(): | |||
| >>> parse_args() | |||
| """ | |||
| parser = ArgumentParser(description="mindspore distributed training launch " | |||
| "helper utilty that will spawn up " | |||
| "helper utility that will spawn up " | |||
| "multiple distributed processes") | |||
| parser.add_argument("--nproc_per_node", type=int, default=1, | |||
| help="The number of processes to launch on each node, " | |||
| @@ -37,19 +37,18 @@ The overall network architecture of Resnet50 is show below: | |||
| Dataset used: [ImageNet2012](http://www.image-net.org/) | |||
| - Dataset size: 224*224 color images in 1000 classes | |||
| - Train: 1,281,167 images | |||
| - Test: 50,000 images | |||
| - Train: 1,281,167 images | |||
| - Test: 50,000 images | |||
| - Data format: jpeg | |||
| - Note: Data will be processed in dataset.py | |||
| - Note: Data will be processed in dataset.py | |||
| - Download the dataset, the directory structure is as follows: | |||
| ``` | |||
| ```python | |||
| └─dataset | |||
| ├─ilsvrc # train dataset | |||
| └─validation_preprocess # evaluate dataset | |||
| ``` | |||
| # [Features](#contents) | |||
| ## [Mixed Precision](#contents) | |||
| @@ -60,14 +59,13 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil | |||
| # [Environment Requirements](#contents) | |||
| - Hardware: Ascend | |||
| - Prepare hardware environment with Ascend. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. | |||
| - Prepare hardware environment with Ascend. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. | |||
| - Framework | |||
| - [MindSpore](https://www.mindspore.cn/install/en) | |||
| - For more information, please check the resources below: | |||
| - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) | |||
| - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) | |||
| # [Script description](#contents) | |||
| ## [Script and sample code](#contents) | |||
| @@ -124,18 +122,19 @@ Parameters for both training and evaluation can be set in config.py | |||
| ### Usage | |||
| - Ascend: sh run_train.sh Ascend [RANK_TABLE_FILE] [DATASET_PATH] [PRETRAINED_CKPT_PATH]\(optional) | |||
| ### Launch | |||
| ``` | |||
| ```bash | |||
| # training example | |||
| Ascend: bash run_train.sh Ascend ~/hccl.json ~/imagenet/train/ ~/pretrained_ckeckpoint | |||
| ``` | |||
| ### Result | |||
| Training result will be stored in the example path. Checkpoints will be stored at `./train/device$i/` by default, and training log will be redirected to `./train/device$i/train.log` like followings. | |||
| Training results are stored in the example path. Checkpoints are stored at `./train/device$i/` by default, and the training log is redirected to `./train/device$i/train.log`, as shown below. | |||
| ``` | |||
| ```bash | |||
| epoch: 1 step: 5004, loss is 4.8995576 | |||
| epoch: 2 step: 5004, loss is 3.9235563 | |||
| epoch: 3 step: 5004, loss is 3.833077 | |||
| @@ -153,7 +152,7 @@ You can start training using python or shell scripts. The usage of shell scripts | |||
| ### Launch | |||
| ``` | |||
| ```bash | |||
| # infer example | |||
| shell: | |||
| Ascend: sh run_infer.sh Ascend ~/imagenet/val/ ~/train/Resnet50-30_5004.ckpt | |||
| @@ -163,9 +162,9 @@ You can start training using python or shell scripts. The usage of shell scripts | |||
| ### Result | |||
| Inference result will be stored in the example path, you can find result like the followings in `./eval/infer.log`. | |||
| Inference results are stored in the example path; you can find a result like the following in `./eval/infer.log`. | |||
| ``` | |||
| ```bash | |||
| result: {'acc': 0.76576314102564111} | |||
| ``` | |||
| @@ -191,7 +190,7 @@ result: {'acc': 0.76576314102564111} | |||
| | Total time | 8pcs: 17 hours(30 epochs with pretrained) | | |||
| | Parameters (M) | 25.5 | | |||
| | Checkpoint for Fine tuning | 197M (.ckpt file) | | |||
| | Scripts | [resnet50-quant script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/resnet50_quant) | | |||
| | Scripts | [resnet50-quant script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/resnet50_quant) | | |||
| ### Inference Performance | |||
| @@ -34,7 +34,7 @@ def parse_args(): | |||
| >>> parse_args() | |||
| """ | |||
| parser = ArgumentParser(description="mindspore distributed training launch " | |||
| "helper utilty that will spawn up " | |||
| "helper utility that will spawn up " | |||
| "multiple distributed processes") | |||
| parser.add_argument("--nproc_per_node", type=int, default=1, | |||
| help="The number of processes to launch on each node, " | |||
| @@ -93,7 +93,7 @@ class DetectionEngine: | |||
| def _nms(self, dets, thresh): | |||
| """Calculate NMS.""" | |||
| # conver xywh -> xmin ymin xmax ymax | |||
| # convert xywh -> xmin ymin xmax ymax | |||
| x1 = dets[:, 0] | |||
| y1 = dets[:, 1] | |||
| x2 = x1 + dets[:, 2] | |||
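The hunk above converts `xywh` boxes to corner form before suppression. As a reading aid, here is a minimal pure-Python sketch of greedy non-maximum suppression built on the same conversion. It is not the `DetectionEngine._nms` implementation: the helper names, the score stored at index 4 of each detection, and the IoU computed on continuous coordinates (no `+1` pixel offset) are all assumptions made for illustration.

```python
def xywh_to_corners(box):
    """Convert (x, y, w, h) to (xmin, ymin, xmax, ymax)."""
    x, y, w, h = box
    return x, y, x + w, y + h


def iou(a, b):
    """Intersection over union of two corner-form boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def nms(dets, thresh):
    """Greedy NMS over detections given as (x, y, w, h, score).

    Returns the indices of the kept detections, highest score first.
    """
    corners = [xywh_to_corners(d[:4]) for d in dets]
    order = sorted(range(len(dets)), key=lambda i: dets[i][4], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(corners[best], corners[i]) <= thresh]
    return keep


# Illustrative detections (made-up values, not taken from the repository).
dets = [
    (0, 0, 10, 10, 0.9),
    (1, 1, 10, 10, 0.8),    # overlaps the first box heavily
    (50, 50, 10, 10, 0.7),  # far away from both
]
kept = nms(dets, thresh=0.5)
print(kept)
```

The second box has an IoU of about 0.68 with the first and is suppressed at a threshold of 0.5, while the distant third box survives.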
| @@ -185,7 +185,7 @@ class DetectionEngine: | |||
| x_top_left = x - w / 2. | |||
| y_top_left = y - h / 2. | |||
| # creat all False | |||
| # create all False | |||
| flag = np.random.random(cls_emb.shape) > sys.maxsize | |||
| for i in range(flag.shape[0]): | |||
| c = cls_argmax[i] | |||
| @@ -59,7 +59,7 @@ cp ../*.py ./eval$3 | |||
| cp -r ../src ./eval$3 | |||
| cd ./eval$3 || exit | |||
| env > env.log | |||
| echo "start infering for device $DEVICE_ID" | |||
| echo "start inferring for device $DEVICE_ID" | |||
| python eval.py \ | |||
| --data_dir=$DATASET_PATH \ | |||
| --pretrained=$CHECKPOINT_PATH \ | |||
| @@ -94,7 +94,7 @@ def get_interp_method(interp, sizes=()): | |||
| Neighbors method. (used by default). | |||
| 4: Lanczos interpolation over 8x8 pixel neighborhood. | |||
| 9: Cubic for enlarge, area for shrink, bilinear for others | |||
| 10: Random select from interpolation method metioned above. | |||
| 10: Random select from interpolation method mentioned above. | |||
| Note: | |||
| When shrinking an image, it will generally look best with AREA-based | |||
| interpolation, whereas, when enlarging an image, it will generally look best | |||
| @@ -58,7 +58,7 @@ class YoloBlock(nn.Cell): | |||
| Args: | |||
| in_channels: Integer. Input channel. | |||
| out_chls: Interger. Middle channel. | |||
| out_chls: Integer. Middle channel. | |||
| out_channels: Integer. Output channel. | |||
| Returns: | |||
| @@ -108,7 +108,7 @@ class YOLOv3(nn.Cell): | |||
| Args: | |||
| backbone_shape: List. Darknet output channels shape. | |||
| backbone: Cell. Backbone Network. | |||
| out_channel: Interger. Output channel. | |||
| out_channel: Integer. Output channel. | |||
| Returns: | |||
| Tensor, output tensor. | |||
| @@ -44,7 +44,7 @@ def has_valid_annotation(anno): | |||
| # if all boxes have close to zero area, there is no annotation | |||
| if _has_only_empty_bbox(anno): | |||
| return False | |||
| # keypoints task have a slight different critera for considering | |||
| # keypoints tasks have slightly different criteria for considering | |||
| # if an annotation is valid | |||
| if "keypoints" not in anno[0]: | |||
| return True | |||
| @@ -121,7 +121,7 @@ def parse_args(): | |||
| args.rank = get_rank() | |||
| args.group_size = get_group_size() | |||
| # select for master rank save ckpt or all rank save, compatiable for model parallel | |||
| # select for master rank save ckpt or all rank save, compatible for model parallel | |||
| args.rank_save_ckpt_flag = 0 | |||
| if args.is_save_on_master: | |||
| if args.rank == 0: | |||
| @@ -140,6 +140,14 @@ def conver_training_shape(args): | |||
| training_shape = [int(args.training_shape), int(args.training_shape)] | |||
| return training_shape | |||
| def build_quant_network(network): | |||
| quantizer = QuantizationAwareTraining(bn_fold=True, | |||
| per_channel=[True, False], | |||
| symmetric=[True, False], | |||
| one_conv_fold=False) | |||
| network = quantizer.quantize(network) | |||
| return network | |||
| def train(): | |||
| """Train function.""" | |||
| @@ -168,11 +176,7 @@ def train(): | |||
| config = ConfigYOLOV3DarkNet53() | |||
| # convert fusion network to quantization aware network | |||
| if config.quantization_aware: | |||
| quantizer = QuantizationAwareTraining(bn_fold=True, | |||
| per_channel=[True, False], | |||
| symmetric=[True, False], | |||
| one_conv_fold=False) | |||
| network = quantizer.quantize(network) | |||
| network = build_quant_network(network) | |||
| network = YoloWithLossCell(network) | |||
| args.logger.info('finish get network') | |||
| @@ -198,11 +202,8 @@ def train(): | |||
| lr = get_lr(args) | |||
| opt = Momentum(params=get_param_groups(network), | |||
| learning_rate=Tensor(lr), | |||
| momentum=args.momentum, | |||
| weight_decay=args.weight_decay, | |||
| loss_scale=args.loss_scale) | |||
| opt = Momentum(params=get_param_groups(network), learning_rate=Tensor(lr), momentum=args.momentum, | |||
| weight_decay=args.weight_decay, loss_scale=args.loss_scale) | |||
| network = TrainingWrapper(network, opt) | |||
| network.set_train() | |||
| @@ -213,9 +214,7 @@ def train(): | |||
| ckpt_config = CheckpointConfig(save_checkpoint_steps=args.ckpt_interval, | |||
| keep_checkpoint_max=ckpt_max_num) | |||
| save_ckpt_path = os.path.join(args.outputs_dir, 'ckpt_' + str(args.rank) + '/') | |||
| ckpt_cb = ModelCheckpoint(config=ckpt_config, | |||
| directory=save_ckpt_path, | |||
| prefix='{}'.format(args.rank)) | |||
| ckpt_cb = ModelCheckpoint(config=ckpt_config, directory=save_ckpt_path, prefix='{}'.format(args.rank)) | |||
| cb_params = _InternalCallbackParam() | |||
| cb_params.train_network = network | |||
| cb_params.epoch_num = ckpt_max_num | |||
| @@ -4,7 +4,7 @@ | |||
| - [Model Architecture](#model-architecture) | |||
| - [Dataset](#dataset) | |||
| - [Environment Requirements](#environment-requirements) | |||
| - [Quick Start](#quick-start) | |||
| - [Quick Start](#quick-start) | |||
| - [Script Description](#script-description) | |||
| - [Script and Sample Code](#script-and-sample-code) | |||
| - [Script Parameters](#script-parameters) | |||
| @@ -19,7 +19,6 @@ | |||
| - [Description of Random Situation](#description-of-random-situation) | |||
| - [ModelZoo Homepage](#modelzoo-homepage) | |||
| # [YOLOv3_ResNet18 Description](#contents) | |||
| YOLOv3 network based on ResNet-18, with support for training and evaluation. | |||
| @@ -30,24 +29,25 @@ YOLOv3 network based on ResNet-18, with support for training and evaluation. | |||
| The overall network architecture of YOLOv3 is shown below: | |||
| And we use ResNet18 as the backbone of YOLOv3_ResNet18. The architecture of ResNet18 has 4 stages. The ResNet architecture performs the initial convolution and max-pooling using 7×7 and 3×3 kernel sizes respectively. Afterward, every stage of the network has different Residual blocks (2, 2, 2, 2) containing two 3×3 conv layers. Finally, the network has an Average Pooling layer followed by a fully connected layer. | |||
| And we use ResNet18 as the backbone of YOLOv3_ResNet18. The architecture of ResNet18 has 4 stages. The ResNet architecture performs the initial convolution and max-pooling using 7×7 and 3×3 kernel sizes respectively. Afterward, every stage of the network has different Residual blocks (2, 2, 2, 2) containing two 3×3 conv layers. Finally, the network has an Average Pooling layer followed by a fully connected layer. | |||
| # [Dataset](#contents) | |||
| Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below. | |||
| Dataset used: [COCO2017](<http://images.cocodataset.org/>) | |||
| Dataset used: [COCO2017](<http://images.cocodataset.org/>) | |||
| - Dataset size: 19G | |||
| - Train: 18G, 118000 images | |||
| - Val: 1G, 5000 images | |||
| - Annotations: 241M, instances, captions, person_keypoints etc | |||
| - Train: 18G, 118000 images | |||
| - Val: 1G, 5000 images | |||
| - Annotations: 241M, instances, captions, person_keypoints etc | |||
| - Data format: image and json files | |||
| - Note: Data will be processed in dataset.py | |||
| - Dataset | |||
| 1. The directory structure is as follows: | |||
| ``` | |||
| . | |||
| ├── annotations # annotation jsons | |||
| @@ -55,7 +55,7 @@ Dataset used: [COCO2017](<http://images.cocodataset.org/>) | |||
| └── val2017 # infer dataset | |||
| ``` | |||
| 2. Organize the dataset infomation into a TXT file, each row in the file is as follows: | |||
| 2. Organize the dataset information into a TXT file, each row in the file is as follows: | |||
| ``` | |||
| train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2 | |||
| @@ -63,63 +63,61 @@ Dataset used: [COCO2017](<http://images.cocodataset.org/>) | |||
| Each row is an image annotation whose fields are separated by spaces: the first column is the relative path of the image, and the remaining columns are box and class information in the format [xmin,ymin,xmax,ymax,class]. `dataset.py` is the parsing script; it reads each image from the path formed by joining `image_dir` (the dataset directory) with the relative path listed in `anno_path` (the TXT file path). `image_dir` and `anno_path` are external inputs. | |||
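The row format described above can be illustrated with a small parser. This is a minimal sketch, not the code in `dataset.py`: `parse_annotation_row` is a hypothetical helper, and the `/data` directory is a placeholder value for `image_dir`.

```python
import os


def parse_annotation_row(row, image_dir):
    """Split one annotation row into an image path and a list of boxes.

    Hypothetical helper for illustration; the real parsing lives in dataset.py.
    """
    fields = row.split()
    # First column: image path relative to the dataset directory.
    image_path = os.path.join(image_dir, fields[0])
    boxes = []
    for field in fields[1:]:
        # Remaining columns: xmin,ymin,xmax,ymax,class
        xmin, ymin, xmax, ymax, cls = (int(v) for v in field.split(","))
        boxes.append((xmin, ymin, xmax, ymax, cls))
    return image_path, boxes


# Sample row copied from the README excerpt above.
row = "train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2"
image_path, boxes = parse_annotation_row(row, "/data")
print(image_path)
print(boxes)
```

Each box keeps the class index as its last element, so downstream code can separate coordinates from labels with a simple slice.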
| # [Environment Requirements](#contents) | |||
| - Hardware(Ascend) | |||
| - Prepare hardware environment with Ascend processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. | |||
| - Prepare hardware environment with Ascend processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. | |||
| - Framework | |||
| - [MindSpore](https://www.mindspore.cn/install/en) | |||
| - For more information, please check the resources below: | |||
| - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) | |||
| - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) | |||
| # [Quick Start](#contents) | |||
| After installing MindSpore via the official website, you can start training and evaluation on Ascend as follows: | |||
| After installing MindSpore via the official website, you can start training and evaluation on Ascend as follows: | |||
| - runing on Ascend | |||
| - running on Ascend | |||
| ```shell script | |||
| #run standalone training example | |||
| sh run_standalone_train.sh [DEVICE_ID] [EPOCH_SIZE] [MINDRECORD_DIR] [IMAGE_DIR] [ANNO_PATH] | |||
| #run distributed training example | |||
| sh run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [MINDRECORD_DIR] [IMAGE_DIR] [ANNO_PATH] [RANK_TABLE_FILE] | |||
| #run evaluation example | |||
| sh run_eval.sh [DEVICE_ID] [CKPT_PATH] [MINDRECORD_DIR] [IMAGE_DIR] [ANNO_PATH] | |||
| ``` | |||
| # [Script Description](#contents) | |||
| ## [Script and Sample Code](#contents) | |||
| ``` | |||
| ```python | |||
| └── cv | |||
| ├── README.md // descriptions about all the models | |||
| ├── mindspore_hub_conf.md // config for mindspore hub | |||
| └── yolov3_resnet18 | |||
| └── yolov3_resnet18 | |||
| ├── README.md // descriptions about yolov3_resnet18 | |||
| ├── scripts | |||
| ├── scripts | |||
| ├── run_distribute_train.sh // shell script for distributed training on Ascend | |||
| ├── run_standalone_train.sh // shell script for standalone training on Ascend | |||
| └── run_eval.sh // shell script for evaluation on Ascend | |||
| ├── src | |||
| ├── src | |||
| ├── dataset.py // creating dataset | |||
| ├── yolov3.py // yolov3 architecture | |||
| ├── config.py // parameter configuration | |||
| ├── config.py // parameter configuration | |||
| └── utils.py // util function | |||
| ├── train.py // training script | |||
| ├── train.py // training script | |||
| └── eval.py // evaluation script | |||
| ``` | |||
| ## [Script Parameters](#contents) | |||
| ``` | |||
| Major parameters in train.py and config.py are as follows: | |||
| ```python | |||
| device_num: Use device nums, default is 1. | |||
| lr: Learning rate, default is 0.001. | |||
| epoch_size: Epoch size, default is 50. | |||
| @@ -133,24 +131,23 @@ After installing MindSpore via the official website, you can start training and | |||
| img_shape: Image height and width used as input to the model. | |||
| ``` | |||
| ## [Training Process](#contents) | |||
| ### Training on Ascend | |||
| To train the model, run `train.py` with the dataset `image_dir`, `anno_path` and `mindrecord_dir`. If `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/convert_dataset.html) files from `image_dir` and `anno_path` (the absolute image path is formed by joining `image_dir` with the relative path in `anno_path`). **Note that if `mindrecord_dir` isn't empty, it will use `mindrecord_dir` rather than `image_dir` and `anno_path`.** | |||
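The choice between existing MindRecord files and the raw dataset can be sketched as below. This is an illustration only, not the logic in `train.py`: `choose_data_source` is a hypothetical helper, "empty" is assumed to mean a missing directory or one that contains no files, and the file name `yolo.mindrecord0` is made up for the demonstration.

```python
import os
import tempfile


def choose_data_source(mindrecord_dir, image_dir, anno_path):
    """Pick the training input: existing MindRecord files win over raw data.

    Hypothetical helper; "empty" is assumed to mean no files in the directory.
    """
    has_files = os.path.isdir(mindrecord_dir) and len(os.listdir(mindrecord_dir)) > 0
    if has_files:
        # Non-empty directory: image_dir and anno_path are ignored.
        return "mindrecord", mindrecord_dir
    # Empty directory: MindRecord files would be generated from the raw data.
    return "raw", (image_dir, anno_path)


with tempfile.TemporaryDirectory() as tmp:
    empty_case = choose_data_source(tmp, "./dataset", "./dataset/train.txt")
    # Create a placeholder file so the directory is no longer empty.
    open(os.path.join(tmp, "yolo.mindrecord0"), "w").close()
    filled_case = choose_data_source(tmp, "./dataset", "./dataset/train.txt")

print(empty_case[0], filled_case[0])
```

Under these assumptions, stale files left in `mindrecord_dir` take precedence over a changed `anno_path`, so clearing the directory is what forces regeneration.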
| - Stand alone mode | |||
| ``` | |||
| ```bash | |||
| sh run_standalone_train.sh 0 50 ./Mindrecord_train ./dataset ./dataset/train.txt | |||
| ``` | |||
| The input variables are device id, epoch size, mindrecord directory path, dataset directory path and train TXT file path. | |||
| - Distributed mode | |||
| ``` | |||
| ```bash | |||
| sh run_distribute_train.sh 8 150 /data/Mindrecord_train /data /data/train.txt /data/hccl.json | |||
| ``` | |||
| @@ -158,7 +155,7 @@ To train the model, run `train.py` with the dataset `image_dir`, `anno_path` and | |||
| You will get the loss value and time of each step as follows: | |||
| ``` | |||
| ```bash | |||
| epoch: 145 step: 156, loss is 12.202981 | |||
| epoch time: 25599.22742843628, per step time: 164.0976117207454 | |||
| epoch: 146 step: 156, loss is 16.91706 | |||
| @@ -168,19 +165,20 @@ You will get the loss value and time of each step as following: | |||
| epoch: 148 step: 156, loss is 10.431475 | |||
| epoch time: 23634.241580963135, per step time: 151.50154859591754 | |||
| epoch: 149 step: 156, loss is 14.665991 | |||
| epoch time: 24118.8325881958, per step time: 154.60790120638333 | |||
| epoch time: 24118.8325881958, per step time: 154.60790120638333 | |||
| epoch: 150 step: 156, loss is 10.779521 | |||
| epoch time: 25319.57221031189, per step time: 162.30495006610187 | |||
| ``` | |||
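To track the loss curve, the log lines shown above can be parsed with two regular expressions. This is a minimal sketch that assumes the log keeps exactly the line layout printed above; `parse_train_log` and the record field names are hypothetical, and the time values are stored without a unit because the log does not state one.

```python
import re

# Patterns assume the exact line layout shown in the log excerpt.
LOSS_RE = re.compile(r"epoch:\s*(\d+)\s+step:\s*(\d+),\s*loss is\s*([\d.]+)")
TIME_RE = re.compile(r"epoch time:\s*([\d.]+),\s*per step time:\s*([\d.]+)")


def parse_train_log(lines):
    """Collect one record per epoch from loss lines and the timing line after each."""
    records = []
    for line in lines:
        loss_match = LOSS_RE.search(line)
        if loss_match:
            records.append({
                "epoch": int(loss_match.group(1)),
                "step": int(loss_match.group(2)),
                "loss": float(loss_match.group(3)),
            })
            continue
        time_match = TIME_RE.search(line)
        if time_match and records:
            # Attach the timing to the most recent loss record.
            records[-1]["epoch_time"] = float(time_match.group(1))
            records[-1]["per_step_time"] = float(time_match.group(2))
    return records


# Lines copied from the log excerpt above.
sample = [
    "epoch: 149 step: 156, loss is 14.665991",
    "epoch time: 24118.8325881958, per step time: 154.60790120638333",
    "epoch: 150 step: 156, loss is 10.779521",
    "epoch time: 25319.57221031189, per step time: 162.30495006610187",
]
records = parse_train_log(sample)
for record in records:
    print(record)
```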
| Note the results is two-classification(person and face) used our own annotations with coco2017, you can change `num_classes` in `config.py` to train your dataset. And we will suport 80 classifications in coco2017 the near future. | |||
| Note that the results are for two-class detection (person and face) using our own annotations on coco2017; you can change `num_classes` in `config.py` to train on your own dataset. We will support the 80 classes of coco2017 in the near future. | |||
| ## [Evaluation Process](#contents) | |||
| ### Evaluation on Ascend | |||
| To eval, run `eval.py` with the dataset `image_dir`, `anno_path`(eval txt), `mindrecord_dir` and `ckpt_path`. `ckpt_path` is the path of [checkpoint](https://www.mindspore.cn/tutorial/training/en/master/use/save_model.html) file. | |||
| ``` | |||
| ```bash | |||
| sh run_eval.sh 0 yolo.ckpt ./Mindrecord_eval ./dataset ./dataset/eval.txt | |||
| ``` | |||
| @@ -188,18 +186,18 @@ The input variables are device id, checkpoint path, mindrecord directory path, d | |||
| You will get the precision and recall value of each class: | |||
| ``` | |||
| ```bash | |||
| class 0 precision is 88.18%, recall is 66.00% | |||
| class 1 precision is 85.34%, recall is 79.13% | |||
| ``` | |||
| Note that the precision and recall values are results for two-class detection (person and face) using our own annotations on coco2017. | |||
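For reference, per-class precision and recall follow the standard definitions: precision is TP / (TP + FP) and recall is TP / (TP + FN). The sketch below applies them to made-up counts; it is not the matching logic in `eval.py`, which also has to decide which predicted boxes count as true positives, and `precision_recall` is a hypothetical helper.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Standard precision and recall from detection counts for one class."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Made-up counts for illustration only, not results from this model.
precision, recall = precision_recall(true_positives=80, false_positives=20, false_negatives=80)
print("precision is {:.2%}, recall is {:.2%}".format(precision, recall))
```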
| # [Model Description](#contents) | |||
| ## [Performance](#contents) | |||
| ### Evaluation Performance | |||
| ### Evaluation Performance | |||
| | Parameters | Ascend | | |||
| | -------------------------- | ----------------------------------------------------------- | | |||
| @@ -217,7 +215,6 @@ Note the precision and recall values are results of two-classification(person an | |||
| | Parameters (M) | 189 | | |||
| | Scripts | [yolov3_resnet18 script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/yolov3_resnet18) | |||
| ### Inference Performance | |||
| | Parameters | Ascend | | |||
| @@ -233,9 +230,9 @@ Note the precision and recall values are results of two-classification(person an | |||
| # [Description of Random Situation](#contents) | |||
| In dataset.py, we set the seed inside the `create_dataset` function. We also use a random seed in train.py. | |||
| In dataset.py, we set the seed inside the `create_dataset` function. We also use a random seed in train.py. | |||
| # [ModelZoo Homepage](#contents) | |||
| # [ModelZoo Homepage](#contents) | |||
| Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). | |||
| @@ -15,7 +15,7 @@ | |||
| # ============================================================================ | |||
| echo "=======================================================================================================================================================" | |||
| echo "Please run the scipt as: " | |||
| echo "Please run the script as: " | |||
| echo "sh run_distribute_train.sh DEVICE_NUM EPOCH_SIZE MINDRECORD_DIR IMAGE_DIR ANNO_PATH RANK_TABLE_FILE PRE_TRAINED PRE_TRAINED_EPOCH_SIZE" | |||
| echo "For example: sh run_distribute_train.sh 8 150 /data/Mindrecord_train /data /data/train.txt /data/hccl.json /opt/yolov3-150.ckpt(optional) 100(optional)" | |||
| echo "It is better to use absolute path." | |||
| @@ -47,7 +47,7 @@ then | |||
| exit 1 | |||
| fi | |||
| echo "After running the scipt, the network runs in the background. The log will be generated in LOGx/log.txt" | |||
| echo "After running the script, the network runs in the background. The log will be generated in LOGx/log.txt" | |||
| export RANK_TABLE_FILE=$6 | |||
| export RANK_SIZE=$1 | |||
| @@ -15,7 +15,7 @@ | |||
| # ============================================================================ | |||
| echo "==============================================================================================================" | |||
| echo "Please run the scipt as: " | |||
| echo "Please run the script as: " | |||
| echo "sh run_eval.sh DEVICE_ID CKPT_PATH MINDRECORD_DIR IMAGE_DIR ANNO_PATH" | |||
| echo "for example: sh run_eval.sh 0 yolo.ckpt ./Mindrecord_eval ./dataset ./dataset/eval.txt" | |||
| echo "==============================================================================================================" | |||
| @@ -15,7 +15,7 @@ | |||
| # ============================================================================ | |||
| echo "=========================================================================================================================================" | |||
| echo "Please run the scipt as: " | |||
| echo "Please run the script as: " | |||
| echo "sh run_standalone_train.sh DEVICE_ID EPOCH_SIZE MINDRECORD_DIR IMAGE_DIR ANNO_PATH PRE_TRAINED PRE_TRAINED_EPOCH_SIZE" | |||
| echo "for example: sh run_standalone_train.sh 0 50 ./Mindrecord_train ./dataset ./dataset/train.txt /opt/yolov3-50.ckpt(optional) 30(optional)" | |||
| echo "=========================================================================================================================================" | |||