# Using Federated Learning Job in Surface Defect Detection Scenario

This case introduces how to use a federated learning job in a surface defect detection scenario.

In surface defect detection, data is scattered across different places (such as server nodes and cameras) and cannot be aggregated because of data privacy and bandwidth constraints. As a result, we cannot use all the data for training.

Federated learning solves this problem. Each place trains the model on its own data, uploads the weights to the cloud for aggregation, and obtains the aggregation result to update its model.
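
To make the aggregation step concrete, here is a minimal sketch of sample-weighted averaging (often called FedAvg). It is an illustration only: this guide does not state which aggregation rule the aggregation worker applies, and the input format (one line per edge node, sample count followed by weights) and the function name `fedavg` are hypothetical.

```shell
# Sketch only: sample-weighted average of per-node weight vectors.
# Assumed (hypothetical) input format, one line per edge node:
#   <sample_count> <w1> <w2> ...
fedavg() {
  awk '
    {
      total += $1                                   # samples seen so far
      for (i = 2; i <= NF; i++) sum[i] += $1 * $i   # weight each value by sample count
      if (NF > cols) cols = NF
    }
    END {
      if (total <= 0) { print "no samples" > "/dev/stderr"; exit 1 }
      for (i = 2; i <= cols; i++)
        printf "%s%.4f", (i > 2 ? " " : ""), sum[i] / total
      printf "\n"
    }'
}
```

For example, `printf '100 1.0 2.0\n300 3.0 4.0\n' | fedavg` averages two nodes, with the second node counting three times as much as the first because it holds three times as many samples.
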
## Surface Defect Detection Experiment

> Assume that there are two edge nodes and a cloud node. Data on the edge nodes cannot be migrated to the cloud due to privacy issues.
> Based on this scenario, we will demonstrate the surface inspection.
### Prepare Nodes

```
CLOUD_NODE="cloud-node-name"
EDGE1_NODE="edge1-node-name"
EDGE2_NODE="edge2-node-name"
```
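
The manifests later in this guide are passed to `kubectl` through unquoted here-documents, so the shell substitutes these variables before `kubectl` sees the YAML. An unset variable would silently produce an empty `nodeName`. The following is a small sketch of a guard for that case; the function name `require_node_vars` is hypothetical and not part of Sedna.

```shell
# Sketch only: fail early if any node variable is unset or empty.
require_node_vars() {
  missing=0
  for name in CLOUD_NODE EDGE1_NODE EDGE2_NODE; do
    # Read the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Run `require_node_vars` in the same shell session that will later run the `kubectl create` commands; it returns a non-zero status and names each missing variable.
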
### Install Sedna

Follow the [Sedna installation document](/docs/setup/install.md) to install Sedna.
### Prepare Dataset

Download the [dataset](https://github.com/abin24/Magnetic-tile-defect-datasets.) and the [label file](/examples/federated_learning/surface_defect_detection/data/1.txt) to `/data` on `EDGE1_NODE`.

```
mkdir -p /data
cd /data
git clone https://github.com/abin24/Magnetic-tile-defect-datasets..git Magnetic-tile-defect-datasets
curl -o 1.txt https://raw.githubusercontent.com/kubeedge/sedna/main/examples/federated_learning/surface_defect_detection/data/1.txt
```
Download the [dataset](https://github.com/abin24/Magnetic-tile-defect-datasets.) and the [label file](/examples/federated_learning/surface_defect_detection/data/2.txt) to `/data` on `EDGE2_NODE`.

```
mkdir -p /data
cd /data
git clone https://github.com/abin24/Magnetic-tile-defect-datasets..git Magnetic-tile-defect-datasets
curl -o 2.txt https://raw.githubusercontent.com/kubeedge/sedna/main/examples/federated_learning/surface_defect_detection/data/2.txt
```
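
A failed download can leave an empty or missing label file, which would only surface later when the training worker starts. The sketch below is a quick sanity check to run on each edge node. The function name `check_label_file` is hypothetical, and treating each non-blank line as one sample is an assumption about the label file layout, which this guide does not describe.

```shell
# Sketch only: confirm a label file exists and report its non-blank line count.
check_label_file() {
  file="$1"
  if [ ! -s "$file" ]; then
    echo "label file missing or empty: $file" >&2
    return 1
  fi
  # Assumption: each non-blank line describes one sample.
  grep -c '[^[:space:]]' "$file"
}
```

On `EDGE1_NODE` this would be called as `check_label_file /data/1.txt`, and on `EDGE2_NODE` as `check_label_file /data/2.txt`.
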
### Prepare Images

This example uses these images:

1. aggregation worker: `kubeedge/sedna-example-federated-learning-surface-defect-detection-aggregation:v0.3.0`
2. train worker: `kubeedge/sedna-example-federated-learning-surface-defect-detection-train:v0.3.0`

These images are generated by the script [build_image.sh](/examples/build_image.sh).
### Create Federated Learning Job

#### Create Dataset

Create the dataset for `$EDGE1_NODE`:

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: "edge1-surface-defect-detection-dataset"
spec:
  url: "/data/1.txt"
  format: "txt"
  nodeName: $EDGE1_NODE
EOF
```
Create the dataset for `$EDGE2_NODE`:

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: "edge2-surface-defect-detection-dataset"
spec:
  url: "/data/2.txt"
  format: "txt"
  nodeName: $EDGE2_NODE
EOF
```
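
The two Dataset manifests differ only in name, label file, and node. As a sketch, they can be rendered by one helper that prints the YAML to standard output, which also lets you inspect the result of the variable substitution before anything is sent to the cluster. The function name `render_dataset` is hypothetical; the manifest fields are copied from the manifests in this section.

```shell
# Sketch only: print a Dataset manifest for the given name, label file, and node.
render_dataset() {
  name="$1"; url="$2"; node="$3"
  cat <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: "$name"
spec:
  url: "$url"
  format: "txt"
  nodeName: $node
EOF
}
```

Calling `render_dataset edge1-surface-defect-detection-dataset /data/1.txt "$EDGE1_NODE"` prints the manifest for review; piping the same command into `kubectl create -f -` creates the object.
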
#### Create Model

Create the directory `/model` on the host of `$EDGE1_NODE`:

```
mkdir /model
```

Create the directory `/model` on the host of `$EDGE2_NODE`:

```
mkdir /model
```

Create the model:

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: "surface-defect-detection-model"
spec:
  url: "/model"
  format: "ckpt"
EOF
```
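
Before starting the job, it can help to confirm that the Dataset and Model objects it refers to actually exist. The sketch below wraps `kubectl get` for that purpose. The function name `verify_resources` is hypothetical, and using the lowercase kinds `dataset` and `model` as resource names is an assumption, made by analogy with the `federatedlearningjob` name used later in this guide.

```shell
# Sketch only: check that each named object of a given kind exists.
verify_resources() {
  kind="$1"; shift
  for name in "$@"; do
    if ! kubectl get "$kind" "$name" >/dev/null 2>&1; then
      echo "not found: $kind/$name" >&2
      return 1
    fi
  done
}
```

For this example the calls would be `verify_resources dataset edge1-surface-defect-detection-dataset edge2-surface-defect-detection-dataset` and `verify_resources model surface-defect-detection-model`.
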
#### Start Federated Learning Job

```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: FederatedLearningJob
metadata:
  name: surface-defect-detection
spec:
  aggregationWorker:
    model:
      name: "surface-defect-detection-model"
    template:
      spec:
        nodeName: $CLOUD_NODE
        containers:
          - image: kubeedge/sedna-example-federated-learning-surface-defect-detection-aggregation:v0.3.0
            name: agg-worker
            imagePullPolicy: IfNotPresent
            env: # user defined environments
              - name: "exit_round"
                value: "3"
            resources: # user defined resources
              limits:
                memory: 2Gi
  trainingWorkers:
    - dataset:
        name: "edge1-surface-defect-detection-dataset"
      template:
        spec:
          nodeName: $EDGE1_NODE
          containers:
            - image: kubeedge/sedna-example-federated-learning-surface-defect-detection-train:v0.3.0
              name: train-worker
              imagePullPolicy: IfNotPresent
              env: # user defined environments
                - name: "batch_size"
                  value: "32"
                - name: "learning_rate"
                  value: "0.001"
                - name: "epochs"
                  value: "2"
              resources: # user defined resources
                limits:
                  memory: 2Gi
    - dataset:
        name: "edge2-surface-defect-detection-dataset"
      template:
        spec:
          nodeName: $EDGE2_NODE
          containers:
            - image: kubeedge/sedna-example-federated-learning-surface-defect-detection-train:v0.3.0
              name: train-worker
              imagePullPolicy: IfNotPresent
              env: # user defined environments
                - name: "batch_size"
                  value: "32"
                - name: "learning_rate"
                  value: "0.001"
                - name: "epochs"
                  value: "2"
              resources: # user defined resources
                limits:
                  memory: 2Gi
EOF
```
### Check Federated Learning Status

```
kubectl get federatedlearningjob surface-defect-detection
```
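
Training takes several rounds, so a single status query is rarely enough. As a sketch, the same command can be repeated at a fixed interval. The function name `poll_job` and its defaults are hypothetical, and the loop does not try to detect completion, because this guide does not describe the status fields of the job.

```shell
# Sketch only: repeat the status query a fixed number of times.
poll_job() {
  name="$1"; interval="${2:-10}"; attempts="${3:-6}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    kubectl get federatedlearningjob "$name" || return 1
    i=$((i + 1))
    # Sleep between queries, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$interval"
    fi
  done
  return 0
}
```

`poll_job surface-defect-detection 30 10` would query the job ten times, 30 seconds apart. The standard `--watch` flag of `kubectl get` is an alternative if a continuous stream of updates is preferred.
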
### Check Federated Learning Train Result

After the job completes, you will find the generated model in the `/model` directory on `$EDGE1_NODE` and `$EDGE2_NODE`.
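
To confirm that a model was actually written, the directory can be inspected on each edge node. The sketch below lists the files it finds and fails if there are none. The function name `list_model_files` is hypothetical, and no particular checkpoint file names are assumed.

```shell
# Sketch only: list the files under the model directory, or fail if there are none.
list_model_files() {
  dir="${1:-/model}"
  if [ ! -d "$dir" ]; then
    echo "no such directory: $dir" >&2
    return 1
  fi
  files="$(find "$dir" -type f | sort)"
  if [ -z "$files" ]; then
    echo "no files in $dir yet" >&2
    return 1
  fi
  printf '%s\n' "$files"
}
```

Run `list_model_files /model` on the host of each edge node after the job has finished.
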