
Merge pull request #38 from JimmyYang20/main

Fix examples: joint inference/federated learning
tags/v0.1.0
KubeEdge Bot GitHub 4 years ago
commit 0428d69259
2 changed files with 156 additions and 95 deletions:
1. examples/federated_learning/surface_defect_detection/README.md (+95 −51)
2. examples/joint_inference/helmet_detection_inference/README.md (+61 −44)

examples/federated_learning/surface_defect_detection/README.md (+95 −51)

@@ -5,31 +5,51 @@ Using Federated Learning, we can solve the problem. Each place uses its own data




## Surface Defect Detection Experiment
> Assume that there are two edge nodes and a cloud node. Data on the edge nodes cannot be migrated to the cloud due to privacy issues.
> Based on this scenario, we will demonstrate surface inspection.


### Prepare Nodes
```
CLOUD_NODE="cloud-node-name"
EDGE1_NODE="edge1-node-name"
EDGE2_NODE="edge2-node-name"
```
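The three names above are placeholders and must match the node names registered in the Kubernetes cluster (as reported by `kubectl get nodes`). A minimal, self-contained sanity check, assuming the placeholder values from this example:

```shell
# Placeholder node names from this example; replace them with the real
# names reported by `kubectl get nodes`.
CLOUD_NODE="cloud-node-name"
EDGE1_NODE="edge1-node-name"
EDGE2_NODE="edge2-node-name"

# Fail early if any variable is empty, so the heredocs below don't render
# an invalid `nodeName:` field.
for n in "$CLOUD_NODE" "$EDGE1_NODE" "$EDGE2_NODE"; do
  [ -n "$n" ] || { echo "a node name is empty" >&2; exit 1; }
done
echo "node variables set"
```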

### Install Sedna


Follow the [Sedna installation document](/docs/setup/install.md) to install Sedna.

### Prepare Dataset


Download [dataset](https://github.com/abin24/Magnetic-tile-defect-datasets.) and the [label file](/examples/federated_learning/surface_defect_detection/data/1.txt) to `/data` of ```EDGE1_NODE```.
```
mkdir -p /data
cd /data
git clone https://github.com/abin24/Magnetic-tile-defect-datasets..git Magnetic-tile-defect-datasets
curl -o 1.txt https://raw.githubusercontent.com/kubeedge/sedna/main/examples/federated_learning/surface_defect_detection/data/1.txt
```


Download [dataset](https://github.com/abin24/Magnetic-tile-defect-datasets.) and the [label file](/examples/federated_learning/surface_defect_detection/data/2.txt) to `/data` of ```EDGE2_NODE```.
```
mkdir -p /data
cd /data
git clone https://github.com/abin24/Magnetic-tile-defect-datasets..git Magnetic-tile-defect-datasets
curl -o 2.txt https://raw.githubusercontent.com/kubeedge/sedna/main/examples/federated_learning/surface_defect_detection/data/2.txt
```
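Training fails unhelpfully on a malformed label file, so it can help to check the download first. A hedged sketch, assuming each line pairs an image path with a numeric label (the stand-in file below is hypothetical; run the same `awk` against `/data/1.txt` or `/data/2.txt` on the edge nodes):

```shell
# Create a tiny stand-in label file with the assumed "<image-path> <label>"
# layout; the real files are /data/1.txt and /data/2.txt.
printf 'Magnetic-tile-defect-datasets/a.jpg 0\nMagnetic-tile-defect-datasets/b.jpg 1\n' > /tmp/labels.txt

# Count lines that do not have exactly two whitespace-separated fields.
bad=$(awk 'NF != 2 { n++ } END { print n+0 }' /tmp/labels.txt)
[ "$bad" -eq 0 ] && echo "label file looks well-formed"
```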


### Prepare Images
This example uses these images:
1. aggregation worker: ```kubeedge/sedna-example-federated-learning-surface-defect-detection-aggregation:v0.1.0```
2. train worker: ```kubeedge/sedna-example-federated-learning-surface-defect-detection-train:v0.1.0```


These images are generated by the script [build_image.sh](/examples/build_image.sh).


### Create Federated Learning Job


#### Create Dataset


create dataset for `$EDGE1_NODE`
```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
@@ -38,10 +58,12 @@ metadata:
spec:
  url: "/data/1.txt"
  format: "txt"
  nodeName: $EDGE1_NODE
EOF
```


create dataset for `$EDGE2_NODE`
```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
@@ -50,12 +72,22 @@ metadata:
spec:
  url: "/data/2.txt"
  format: "txt"
  nodeName: $EDGE2_NODE
EOF
```


#### Create Model


create the directory `/model` on the host of `$EDGE1_NODE`
```
mkdir /model
```
create the directory `/model` on the host of `$EDGE2_NODE`
```
mkdir /model
```

create model
```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
@@ -80,46 +112,58 @@ spec:
  aggregationWorker:
    model:
      name: "surface-defect-detection-model"
    template:
      spec:
        nodeName: $CLOUD_NODE
        containers:
          - image: kubeedge/sedna-example-federated-learning-surface-defect-detection-aggregation:v0.1.0
            name: agg-worker
            imagePullPolicy: IfNotPresent
            env: # user defined environments
              - name: "exit_round"
                value: "3"
            resources: # user defined resources
              limits:
                memory: 2Gi
  trainingWorkers:
    - dataset:
        name: "edge1-surface-defect-detection-dataset"
      template:
        spec:
          nodeName: $EDGE1_NODE
          containers:
            - image: kubeedge/sedna-example-federated-learning-surface-defect-detection-train:v0.1.0
              name: train-worker
              imagePullPolicy: IfNotPresent
              env: # user defined environments
                - name: "batch_size"
                  value: "32"
                - name: "learning_rate"
                  value: "0.001"
                - name: "epochs"
                  value: "2"
              resources: # user defined resources
                limits:
                  memory: 2Gi
    - dataset:
        name: "edge2-surface-defect-detection-dataset"
      template:
        spec:
          nodeName: $EDGE2_NODE
          containers:
            - image: kubeedge/sedna-example-federated-learning-surface-defect-detection-train:v0.1.0
              name: train-worker
              imagePullPolicy: IfNotPresent
              env: # user defined environments
                - name: "batch_size"
                  value: "32"
                - name: "learning_rate"
                  value: "0.001"
                - name: "epochs"
                  value: "2"
              resources: # user defined resources
                limits:
                  memory: 2Gi
EOF
```


@@ -130,4 +174,4 @@ kubectl get federatedlearningjob surface-defect-detection
```


### Check Federated Learning Train Result
After the job is completed, you will find the model generated in the directory `/model` on `$EDGE1_NODE` and `$EDGE2_NODE`.
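A hedged way to confirm the training output, using a stand-in path (`/tmp/model` and the file name `model.h5` are assumptions for illustration; on the real nodes check `/model`):

```shell
# Stand-in for checking /model on $EDGE1_NODE / $EDGE2_NODE after the job.
dir=/tmp/model
mkdir -p "$dir"
touch "$dir/model.h5"   # hypothetical weight file written by the train worker

# The directory should be non-empty once training has completed.
[ -n "$(ls -A "$dir")" ] && echo "model artifacts present"
```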

examples/joint_inference/helmet_detection_inference/README.md (+61 −44)

@@ -32,22 +32,12 @@ wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection
tar -zxvf big-model.tar.gz
```


### Prepare Images
This example uses these images:
1. little model inference worker: ```kubeedge/sedna-example-joint-inference-helmet-detection-little:v0.1.0```
2. big model inference worker: ```kubeedge/sedna-example-joint-inference-helmet-detection-big:v0.1.0```


These images are generated by the script [build_image.sh](/examples/build_image.sh).


### Create Joint Inference Service


@@ -90,6 +80,12 @@ Note the setting of the following parameters, which have to same as the script [
- hard_example_edge_inference_output: set your output path for results of inferring hard examples on the edge side.
- hard_example_cloud_inference_output: set your output path for results of inferring hard examples on the cloud side.


Prepare the output directory on the edge node
```
mkdir -p /joint_inference/output
```
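This step matters because the edge worker below mounts `/joint_inference/output` as a `hostPath` volume with `type: Directory`, and a `Directory`-type hostPath must already exist on the node or the pod will not start. A self-contained sketch using a stand-in path:

```shell
# Stand-in for /joint_inference/output on the edge node; `type: Directory`
# hostPath volumes require the directory to exist before pod creation.
dir=/tmp/joint_inference/output
mkdir -p "$dir"
[ -d "$dir" ] && echo "output directory ready"
```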

Create joint inference service
```
CLOUD_NODE="cloud-node-name"
EDGE_NODE="edge-node-name"
@@ -105,42 +101,63 @@ spec:
  edgeWorker:
    model:
      name: "helmet-detection-inference-little-model"
    hardExampleMining:
      name: "IBT"
      parameters:
        - key: "threshold_img"
          value: "0.9"
        - key: "threshold_box"
          value: "0.9"
    template:
      spec:
        nodeName: $EDGE_NODE
        containers:
          - image: kubeedge/sedna-example-joint-inference-helmet-detection-little:v0.1.0
            imagePullPolicy: IfNotPresent
            name: little-model
            env: # user defined environments
              - name: input_shape
                value: "416,736"
              - name: "video_url"
                value: "rtsp://localhost/video"
              - name: "all_examples_inference_output"
                value: "/data/output"
              - name: "hard_example_cloud_inference_output"
                value: "/data/hard_example_cloud_inference_output"
              - name: "hard_example_edge_inference_output"
                value: "/data/hard_example_edge_inference_output"
            resources: # user defined resources
              requests:
                memory: 64M
                cpu: 100m
              limits:
                memory: 2Gi
            volumeMounts:
              - name: outputdir
                mountPath: /data/
        volumes: # user defined volumes
          - name: outputdir
            hostPath:
              # user must create the directory in host
              path: /joint_inference/output
              type: Directory

  cloudWorker:
    model:
      name: "helmet-detection-inference-big-model"
    template:
      spec:
        nodeName: $CLOUD_NODE
        containers:
          - image: kubeedge/sedna-example-joint-inference-helmet-detection-big:v0.1.0
            name: big-model
            imagePullPolicy: IfNotPresent
            env: # user defined environments
              - name: "input_shape"
                value: "544,544"
            resources: # user defined resources
              requests:
                memory: 2Gi
EOF
```


@@ -173,7 +190,7 @@ ffmpeg -re -i /data/video/video.mp4 -vcodec libx264 -f rtsp rtsp://localhost/vid


### Check Inference Result


You can check the inference results in the output path (e.g., `/output`) defined in the JointInferenceService config.
* the result of edge inference vs the result of joint inference
![](images/inference-result.png)


