
docs: enhance example of incremental learning

Signed-off-by: TymonXie <tymonxie@gmail.com>
tags/v0.1.0
TymonXie 4 years ago
parent commit ee71aedec8
19 changed files with 326 additions and 242 deletions
  1. +1 -1     README.md
  2. +1 -1     README_zh.md
  3. +1 -1     examples/incremental-learning-helmet-detection.Dockerfile
  4. +323 -0   examples/incremental_learning/helmet_detection/README.md
  5. BIN       examples/incremental_learning/helmet_detection/image/effect_comparison.png
  6. BIN       examples/incremental_learning/helmet_detection/image/export-label.png
  7. BIN       examples/incremental_learning/helmet_detection/image/label-interface.png
  8. BIN       examples/incremental_learning/helmet_detection/image/label-result.png
  9. BIN       examples/incremental_learning/helmet_detection/image/make-sense.png
  10. BIN      examples/incremental_learning/helmet_detection/image/reverted_label.png
  11. +0 -0    examples/incremental_learning/helmet_detection/training/data_gen.py
  12. +0 -0    examples/incremental_learning/helmet_detection/training/eval.py
  13. +0 -0    examples/incremental_learning/helmet_detection/training/inference.py
  14. +0 -0    examples/incremental_learning/helmet_detection/training/interface.py
  15. +0 -0    examples/incremental_learning/helmet_detection/training/resnet18.py
  16. +0 -0    examples/incremental_learning/helmet_detection/training/train.py
  17. +0 -0    examples/incremental_learning/helmet_detection/training/validate_utils.py
  18. +0 -0    examples/incremental_learning/helmet_detection/training/yolo3_multiscale.py
  19. +0 -239  examples/incremental_learning/helmet_detection_incremental_train/README.md

+1 -1  README.md

@@ -73,7 +73,7 @@ Follow the [Sedna installation document](docs/setup/install.md) to install Sedna

### Examples
Example1:[Using Joint Inference Service in Helmet Detection Scenario](/examples/joint_inference/helmet_detection_inference/README.md).
-Example2:[Using Incremental Learning Job in Helmet Detection Scenario](/examples/incremental_learning/helmet_detection_incremental_train/README.md).
+Example2:[Using Incremental Learning Job in Helmet Detection Scenario](/examples/incremental_learning/helmet_detection/README.md).
Example3:[Using Federated Learning Job in Surface Defect Detection Scenario](/examples/federated_learning/surface_defect_detection/README.md).
## Roadmap



+1 -1  README_zh.md

@@ -61,7 +61,7 @@ For Sedna installation, see [here](/docs/setup/install.md).

### Examples
Example 1: [Collaborative joint inference with large and small models](/examples/joint_inference/helmet_detection_inference/README.md)
-Example 2: [Edge-cloud collaborative incremental learning](/examples/incremental_learning/helmet_detection_incremental_train/README.md)
+Example 2: [Edge-cloud collaborative incremental learning](/examples/incremental_learning/helmet_detection/README.md)
Example 3: [Edge-cloud collaborative federated learning](/examples/federated_learning/surface_defect_detection/README.md)
## Roadmap



+1 -1  examples/incremental-learning-helmet-detection.Dockerfile

@@ -16,7 +16,7 @@ ENV PYTHONPATH "/home/lib"
WORKDIR /home/work
COPY ./lib /home/lib

-COPY examples/incremental_learning/helmet_detection_incremental_train/training/ /home/work/
+COPY examples/incremental_learning/helmet_detection/training/ /home/work/


ENTRYPOINT ["python"]

+323 -0  examples/incremental_learning/helmet_detection/README.md

@@ -0,0 +1,323 @@
# Using Incremental Learning Job in Helmet Detection Scenario

This document introduces how to use an incremental learning job in a helmet detection scenario.
With the incremental learning job, our application can automatically retrain, evaluate,
and update models based on the data generated at the edge.

## Helmet Detection Experiment


### Install Sedna

Follow the [Sedna installation document](/docs/setup/install.md) to install Sedna.

### Prepare Model
In this example, we need to prepare the base model and the deploy model in advance.



Download the [models](https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/models.tar.gz) package, which includes the base model and the deploy model.
```
cd /
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/models.tar.gz
tar -zxvf models.tar.gz
```
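After extraction, the `Model` resources created later expect a layout roughly like the following (inferred from the `url` fields used below; the tree itself is illustrative):
```
/models
├── base_model            # ckpt-format base model for training
└── deploy_model
    └── saved_model.pb    # pb-format model for inference
```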
### Prepare for Inference Worker
In this example, we simulate an inference worker for helmet detection. The worker uploads hard examples to `HE_SAVED_URL` while
running inference on a local video. We need to make the following preparations (a hedged sketch of such a worker loop follows the steps below):
* make sure the following local directories exist
```
mkdir -p /incremental_learning/video/
mkdir -p /incremental_learning/he/
mkdir -p /data/helmet_detection
mkdir -p /output
```
* download [video](https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/video.tar.gz), unzip video.tar.gz, and put it into `/incremental_learning/video/`

```
cd /incremental_learning/video/
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/video.tar.gz
tar -zxvf video.tar.gz
```
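To make the worker's behavior concrete, below is a minimal, hypothetical sketch of an inference loop with IBT-style hard example mining. The `detect` stub, the hard-example rule, and the file naming are illustrative assumptions for this document, not Sedna's actual worker code:
```
import os
import cv2  # assumes opencv-python is available

VIDEO_PATH = "/incremental_learning/video/video.mp4"  # local video prepared above
HE_SAVED_URL = "/incremental_learning/he/"            # hard examples are "uploaded" here


def detect(frame):
    """Placeholder for the real helmet-detection model; returns box confidence scores."""
    raise NotImplementedError


def is_hard_example(box_scores, threshold_img=0.9, threshold_box=0.9):
    # Rough approximation of the IBT (image-box-threshold) idea configured in
    # the job below: if too few boxes are confident, the frame is "hard".
    if not box_scores:
        return True
    confident = sum(1 for s in box_scores if s >= threshold_box)
    return confident / len(box_scores) < threshold_img


def main():
    cap = cv2.VideoCapture(VIDEO_PATH)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if is_hard_example(detect(frame)):
            # Uploading simply means writing into the mounted HE_SAVED_URL dir.
            cv2.imwrite(os.path.join(HE_SAVED_URL, "hard_%d.jpg" % index), frame)
        index += 1
    cap.release()


if __name__ == '__main__':
    main()
```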
### Prepare Image
This example uses the image:
```
kubeedge/sedna-example-incremental-learning-helmet-detection:v0.1.0
```
This image is generated by the script [build_image.sh](/examples/build_image.sh) and is used for creating the training, eval, and inference workers.

### Create Incremental Job
In this example, `$WORKER_NODE` is a custom node name; set it to the node where you actually run the workers.


```
WORKER_NODE="edge-node"
```
Create Dataset

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: incremental-dataset
spec:
  url: "/data/helmet_detection/train_data/train_data.txt"
  format: "txt"
  nodeName: $WORKER_NODE
EOF
```
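The `url` points to an index file listing the training samples. The exact line format depends on your training code; judging from the label-conversion script later in this document, each line pairs an image name with comma-separated box values and a class id, roughly like this illustrative example:
```
00001.jpg 500,500,200,300,0 210,340,80,120,1
00002.jpg 330,410,150,220,2
```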

Create the Initial Model to simulate the initial model in the incremental learning scenario.

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: initial-model
spec:
  url: "/models/base_model"
  format: "ckpt"
EOF
```

Create Deploy Model

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: deploy-model
spec:
  url: "/models/deploy_model/saved_model.pb"
  format: "pb"
EOF
```

Start The Incremental Learning Job

```

kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: IncrementalLearningJob
metadata:
  name: helmet-detection-demo
spec:
  initialModel:
    name: "initial-model"
  dataset:
    name: "incremental-dataset"
    trainProb: 0.8
  trainSpec:
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: kubeedge/sedna-example-incremental-learning-helmet-detection:v0.1.0
            name: train-worker
            imagePullPolicy: IfNotPresent
            args: ["train.py"]
            env:
              - name: "batch_size"
                value: "32"
              - name: "epochs"
                value: "1"
              - name: "input_shape"
                value: "352,640"
              - name: "class_names"
                value: "person,helmet,helmet-on,helmet-off"
              - name: "nms_threshold"
                value: "0.4"
              - name: "obj_threshold"
                value: "0.3"
    trigger:
      checkPeriodSeconds: 60
      timer:
        start: 02:00
        end: 20:00
      condition:
        operator: ">"
        threshold: 500
        metric: num_of_samples
  evalSpec:
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: kubeedge/sedna-example-incremental-learning-helmet-detection:v0.1.0
            name: eval-worker
            imagePullPolicy: IfNotPresent
            args: ["eval.py"]
            env:
              - name: "input_shape"
                value: "352,640"
              - name: "class_names"
                value: "person,helmet,helmet-on,helmet-off"
  deploySpec:
    model:
      name: "deploy-model"
    trigger:
      condition:
        operator: ">"
        threshold: 0.1
        metric: precision_delta
    hardExampleMining:
      name: "IBT"
      parameters:
        - key: "threshold_img"
          value: "0.9"
        - key: "threshold_box"
          value: "0.9"
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: kubeedge/sedna-example-incremental-learning-helmet-detection:v0.1.0
            name: infer-worker
            imagePullPolicy: IfNotPresent
            args: ["inference.py"]
            env:
              - name: "input_shape"
                value: "352,640"
              - name: "video_url"
                value: "file://video/video.mp4"
              - name: "HE_SAVED_URL"
                value: "/he_saved_url"
            volumeMounts:
              - name: localvideo
                mountPath: /video/
              - name: hedir
                mountPath: /he_saved_url
            resources:  # user defined resources
              limits:
                memory: 2Gi
        volumes:  # user defined volumes
          - name: localvideo
            hostPath:
              path: /incremental_learning/video/
              type: Directory
          - name: hedir
            hostPath:
              path: /incremental_learning/he/
              type: Directory
  outputDir: "/output"
EOF
```
1. The `Dataset` describes the labeled data, and `HE_SAVED_URL` indicates the directory in the deploy container where hard examples are uploaded. Users label the hard examples at this address.
2. Ensure that the `outputDir` path in the YAML file exists on your node; this path will be mounted directly into the container.


### Check Incremental Learning Job
Query the service status:
```
kubectl get incrementallearningjob helmet-detection-demo
```
In the `IncrementalLearningJob` resource helmet-detection-demo, the following trigger is configured:
```
trigger:
  checkPeriodSeconds: 60
  timer:
    start: 02:00
    end: 20:00
  condition:
    operator: ">"
    threshold: 500
    metric: num_of_samples
```
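As a rough mental model of this trigger (a hedged sketch, not Sedna's actual scheduler code): every `checkPeriodSeconds`, if the current time falls inside the `timer` window and the condition holds, a training round is started. The function names below are illustrative:
```
from datetime import datetime, time


def in_window(now, start=time(2, 0), end=time(20, 0)):
    # timer: the condition may only fire between 02:00 and 20:00.
    return start <= now.time() <= end


def should_train(num_of_samples, now=None, threshold=500):
    # condition: metric num_of_samples, operator ">", threshold 500.
    now = now or datetime.now()
    return in_window(now) and num_of_samples > threshold
```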

### Hard Example Labeling
In the real world, we need to label the hard examples in `HE_SAVED_URL` with annotation tools and then add the labeled examples to the `Dataset`'s url.

You can use open-source annotation tools to label hard examples, such as [MAKE SENSE](https://www.makesense.ai), which has the following main advantages:
* Open source and free to use under the GPLv3 license
* Supports output formats like YOLO, VOC XML, VGG JSON, and CSV
* No advanced installation required; just open your browser
* Uses AI to make your work more productive
* Runs offline as a container, ensuring data security

![img.png](image/make-sense.png)

The details of labeling are not described here; the main steps in this demo are as follows:
* import the unlabeled hard examples into the annotation tool
![img_1.png](image/label-interface.png)
* label and export annotations
![img_2.png](image/export-label.png)![img_3.png](image/label-result.png)
* you will get YOLO-format annotations, which you need to convert into the format used by your own training code. In this example, the following script is
provided for reference:

```
import os

# Directory of YOLO-format annotation files exported from MAKE SENSE,
# and the output path of the merged label file.
annotation_dir_path = "C:/Users/Administrator/Desktop/labeled_data"
save_path = "C:/Users/Administrator/Desktop/labeled_data/save_label.txt"


def convert_single_line(line):
    # A YOLO line is "class v1 v2 v3 v4" with normalized float values;
    # scale the values by 1000, truncate to ints, and move the class id last:
    # "class v1 v2 v3 v4" -> "V1,V2,V3,V4,class".
    line_list = []
    line = line.split(" ")
    for i in range(1, len(line)):
        line_list.append(str(int(float(line[i]) * 1000)))
    line_list.append(line[0])
    return ",".join(line_list)


if __name__ == '__main__':
    results = []
    for path, dir_list, file_list in os.walk(annotation_dir_path):
        for file_name in file_list:
            file_path = os.path.join(path, file_name)
            # Each annotation file "xxx.txt" describes the image "xxx.jpg".
            image_name = file_name.split("txt")[0] + 'jpg'
            single_label_string = image_name
            with open(file_path) as f:
                for line in f:
                    line = line.strip('\n')
                    single_label_string += " " + convert_single_line(line)
            results.append(single_label_string)
    # One line per image: "image.jpg V1,V2,V3,V4,class V1,V2,V3,V4,class ..."
    with open(save_path, "w") as save_file:
        for result in results:
            save_file.write(result + "\n")
```
How to use:
`annotation_dir_path`: directory containing the labeled annotations exported from MAKE SENSE
`save_path`: output path of the label txt file converted from those annotations
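For example, assuming the usual normalized YOLO layout `class v1 v2 v3 v4`, the conversion behaves like this (values are made up):
```
convert_single_line("0 0.5 0.5 0.2 0.3")
# -> "500,500,200,300,0"
```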
* run the above script to get a txt file that includes all the label information
* put the txt file and the examples in the same directory
* you will get labeled examples that meet the training requirements
![img_1.png](image/reverted_label.png)
* put these examples and annotations into the `Dataset`'s url

Without annotation tools, we can simulate the `num_of_samples` condition in the following way:
Download the [dataset](https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/dataset.tar.gz) to `$WORKER_NODE`.
```
cd /data/helmet_detection
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/dataset.tar.gz
tar -zxvf dataset.tar.gz
```
The LocalController component will check the number of samples, find that the trigger conditions are met, and notify the GlobalManager component to start the train worker.
When the train worker finishes, we can view the updated model in the `/output` directory on the `$WORKER_NODE` node.
Then the eval worker will start to evaluate the model that the train worker generated.

If the eval result satisfies the `deploySpec`'s trigger
```
trigger:
  condition:
    operator: ">"
    threshold: 0.1
    metric: precision_delta
```
the deploy worker will load the new model and provide service.

### Effect Display
In this example, false and missed detections occur at the inference stage before incremental learning; after incremental learning,
all targets are correctly detected.

![img_1.png](image/effect_comparison.png)

BIN  examples/incremental_learning/helmet_detection/image/effect_comparison.png
Width: 1047 | Height: 511 | Size: 1.0 MB

BIN  examples/incremental_learning/helmet_detection/image/export-label.png
Width: 1770 | Height: 806 | Size: 575 kB

BIN  examples/incremental_learning/helmet_detection/image/label-interface.png
Width: 1877 | Height: 865 | Size: 1.1 MB

BIN  examples/incremental_learning/helmet_detection/image/label-result.png
Width: 1761 | Height: 862 | Size: 128 kB

BIN  examples/incremental_learning/helmet_detection/image/make-sense.png
Width: 1394 | Height: 684 | Size: 36 kB

BIN  examples/incremental_learning/helmet_detection/image/reverted_label.png
Width: 618 | Height: 143 | Size: 8.4 kB

examples/incremental_learning/helmet_detection_incremental_train/training/data_gen.py → examples/incremental_learning/helmet_detection/training/data_gen.py
examples/incremental_learning/helmet_detection_incremental_train/training/eval.py → examples/incremental_learning/helmet_detection/training/eval.py
examples/incremental_learning/helmet_detection_incremental_train/training/inference.py → examples/incremental_learning/helmet_detection/training/inference.py
examples/incremental_learning/helmet_detection_incremental_train/training/interface.py → examples/incremental_learning/helmet_detection/training/interface.py
examples/incremental_learning/helmet_detection_incremental_train/training/resnet18.py → examples/incremental_learning/helmet_detection/training/resnet18.py
examples/incremental_learning/helmet_detection_incremental_train/training/train.py → examples/incremental_learning/helmet_detection/training/train.py
examples/incremental_learning/helmet_detection_incremental_train/training/validate_utils.py → examples/incremental_learning/helmet_detection/training/validate_utils.py
examples/incremental_learning/helmet_detection_incremental_train/training/yolo3_multiscale.py → examples/incremental_learning/helmet_detection/training/yolo3_multiscale.py

+0 -239  examples/incremental_learning/helmet_detection_incremental_train/README.md

@@ -1,239 +0,0 @@
# Using Incremental Learning Job in Helmet Detection Scenario

This document introduces how to use an incremental learning job in a helmet detection scenario.
With the incremental learning job, our application can automatically retrain, evaluate,
and update models based on the data generated at the edge.

## Helmet Detection Experiment

### Prepare Worker Image
Build the worker image by referring to the [dockerfile](/build/worker/base_images/tensorflow/tensorflow-1.15.Dockerfile)
and set the image in the `gm-config.yaml`'s `imageHub` field in [Install Sedna](#install-sedna).
In this demo, we need to replace the requirement.txt with:
```
flask==1.1.2
keras==2.4.3
opencv-python==4.4.0.44
websockets==8.1
Pillow==8.0.1
requests==2.24.0
tqdm==4.56.0
matplotlib==3.3.3
```
### Install Sedna

Follow the [Sedna installation document](/docs/setup/install.md) to install Sedna.

### Prepare Data and Model

* step 1: download [dataset](https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/dataset.tar.gz)
```
mkdir -p /data/helmet_detection
cd /data/helmet_detection
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/dataset.tar.gz
tar -zxvf dataset.tar.gz
```

* step 2: download [base model](https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/model.tar.gz)
```
mkdir /model
cd /model
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/model.tar.gz
tar -zxvf model.tar.gz
```
### Prepare Script
Download the [scripts](/examples/incremental_learning/helmet_detection_incremental_train/training) to the `/code` path of your node.


### Create Incremental Job

Create Namespace `kubectl create ns sedna-test`

Create Dataset

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: incremental-dataset
  namespace: sedna-test
spec:
  url: "/data/helmet_detection/train_data/train_data.txt"
  format: "txt"
  nodeName: "cloud0"
EOF
```

Create the Initial Model to simulate the initial model in the incremental learning scenario.

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: initial-model
  namespace: sedna-test
spec:
  url: "/model/base_model"
  format: "ckpt"
EOF
```

Create Deploy Model

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: deploy-model
  namespace: sedna-test
spec:
  url: "/model/deploy_model/saved_model.pb"
  format: "pb"
EOF
```

Start The Incremental Learning Job

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: IncrementalLearningJob
metadata:
  name: helmet-detection-demo
  namespace: sedna-test
spec:
  initialModel:
    name: "initial-model"
  dataset:
    name: "incremental-dataset"
    trainProb: 0.8
  trainSpec:
    workerSpec:
      scriptDir: "/code"
      scriptBootFile: "train.py"
      frameworkType: "tensorflow"
      frameworkVersion: "1.15"
      parameters:
        - key: "batch_size"
          value: "32"
        - key: "epochs"
          value: "1"
        - key: "input_shape"
          value: "352,640"
        - key: "class_names"
          value: "person,helmet,helmet-on,helmet-off"
        - key: "nms_threshold"
          value: "0.4"
        - key: "obj_threshold"
          value: "0.3"
    trigger:
      checkPeriodSeconds: 60
      timer:
        start: 02:00
        end: 04:00
      condition:
        operator: ">"
        threshold: 500
        metric: num_of_samples
  evalSpec:
    workerSpec:
      scriptDir: "/code"
      scriptBootFile: "eval.py"
      frameworkType: "tensorflow"
      frameworkVersion: "1.15"
      parameters:
        - key: "input_shape"
          value: "352,640"
        - key: "class_names"
          value: "person,helmet,helmet-on,helmet-off"
  deploySpec:
    model:
      name: "deploy-model"
    trigger:
      condition:
        operator: ">"
        threshold: 0.1
        metric: precision_delta
    nodeName: "cloud0"
    hardExampleMining:
      name: "IBT"
    workerSpec:
      scriptDir: "/code"
      scriptBootFile: "inference.py"
      frameworkType: "tensorflow"
      frameworkVersion: "1.15"
      parameters:
        - key: "input_shape"
          value: "352,640"
        - key: "video_url"
          value: "rtsp://localhost/video"
        - key: "HE_SAVED_URL"
          value: "/he_saved_url"
  nodeName: "cloud0"
  outputDir: "/output"
EOF
```
1. The `Dataset` describes the labeled data, and `HE_SAVED_URL` indicates the directory in the deploy container where hard examples are uploaded. Users label the hard examples at this address.
2. Ensure that the `outputDir` path in the YAML file exists on your node; this path will be mounted directly into the container.



### Mock Video Stream for Inference in Edge Side

* step 1: install the open source video streaming server [EasyDarwin](https://github.com/EasyDarwin/EasyDarwin/tree/dev).
* step 2: start EasyDarwin server.
* step 3: download [video](https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/video.tar.gz).
* step 4: push a video stream to the url (e.g., `rtsp://localhost/video`) that the inference service can connect to.

```
wget https://github.com/EasyDarwin/EasyDarwin/releases/download/v8.1.0/EasyDarwin-linux-8.1.0-1901141151.tar.gz --no-check-certificate
tar -zxvf EasyDarwin-linux-8.1.0-1901141151.tar.gz
cd EasyDarwin-linux-8.1.0-1901141151
./start.sh

mkdir -p /data/video
cd /data/video
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/video.tar.gz
tar -zxvf video.tar.gz
ffmpeg -re -i /data/video/helmet-detection.mp4 -vcodec libx264 -f rtsp rtsp://localhost/video
```

### Check Incremental Learning Job
Query the service status:
```
kubectl get incrementallearningjob helmet-detection-demo -n sedna-test
```
In the `IncrementalLearningJob` resource helmet-detection-demo, the following trigger is configured:
```
trigger:
  checkPeriodSeconds: 60
  timer:
    start: 02:00
    end: 04:00
  condition:
    operator: ">"
    threshold: 500
    metric: num_of_samples
```
In the real world, we need to label the hard examples in `HE_SAVED_URL` with annotation tools and then add the labeled examples to the `Dataset`'s url.
Without annotation tools, we can simulate the `num_of_samples` condition in the following way:
Download the [dataset](https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/dataset.tar.gz) to our cloud0 node.
```
cd /data/helmet_detection
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/dataset.tar.gz
tar -zxvf dataset.tar.gz
```
The LocalController component will check the number of samples, find that the trigger conditions are met, and notify the GlobalManager component to start the train worker.
When the train worker finishes, we can view the updated model in the `/output` directory on the cloud0 node.
Then the eval worker will start to evaluate the model that the train worker generated.

If the eval result satisfies the `deploySpec`'s trigger
```
trigger:
  condition:
    operator: ">"
    threshold: 0.1
    metric: precision_delta
```
the deploy worker will load the new model and provide service.
