# Using Incremental Learning Job in Helmet Detection Scenario on S3

This example is based on the example: [Using Incremental Learning Job in Helmet Detection Scenario](/examples/incremental_learning/helmet_detection/README.md).

### Prepare Nodes

Assume you have created a [KubeEdge](https://github.com/kubeedge/kubeedge) cluster that has two cloud nodes (e.g., `cloud-node1`, `cloud-node2`)
and one edge node (e.g., `edge-node`).

### Create a secret with your S3 user credential
```shell
kubectl create -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
  annotations:
    s3-endpoint: s3.amazonaws.com # replace with your s3 endpoint, e.g. minio-service.kubeflow:9000
    s3-usehttps: "1" # by default 1, if testing with minio you can set to 0
stringData: # use stringData for raw credential strings or data for base64-encoded strings
  ACCESS_KEY_ID: XXXX
  SECRET_ACCESS_KEY: XXXXXXXX
EOF
```
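If you prefer the `data` field over `stringData`, the values must be base64 encoded first. The following is a minimal sketch of one way to produce them, assuming the `base64` utility is available on your machine; the variable names are illustrative, and `XXXX` / `XXXXXXXX` are the placeholder credentials from the secret above.

```shell
# Encode the placeholder credentials for use under `data:` instead of `stringData:`.
# printf avoids the trailing newline that echo would add to the encoded value.
ACCESS_KEY_ID_B64="$(printf '%s' 'XXXX' | base64)"
SECRET_ACCESS_KEY_B64="$(printf '%s' 'XXXXXXXX' | base64)"

# Print the lines in the form they would take inside the Secret manifest.
echo "ACCESS_KEY_ID: $ACCESS_KEY_ID_B64"          # prints ACCESS_KEY_ID: WFhYWA==
echo "SECRET_ACCESS_KEY: $SECRET_ACCESS_KEY_B64"  # prints SECRET_ACCESS_KEY: WFhYWFhYWFg=
```

Replace the placeholders with your real credentials before copying the output into the manifest.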
### Prepare Model

* Download [models](https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/model.tar.gz).
* Put the unzipped model files into the bucket of your cloud storage service.
* Attach the created secret to each Model and create the Models.

```shell
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: initial-model
spec:
  url: "s3://kubeedge/model/base_model"
  format: "ckpt"
  credentialName: mysecret
EOF
```

```shell
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: deploy-model
spec:
  url: "s3://kubeedge/model/deploy_model/saved_model.pb"
  format: "pb"
  credentialName: mysecret
EOF
```
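The upload step above does not prescribe a tool. As one possible approach, the sketch below prints an AWS CLI command that would copy a local directory into the bucket. It assumes the AWS CLI is installed, that your credentials are exported as `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`, and that the archive unpacks into a local `model/` directory containing `base_model` and `deploy_model` (the archive layout is not confirmed here). The helper name `s3_upload_cmd` and the variables are hypothetical.

```shell
# Assumptions: endpoint matches the s3-endpoint annotation of the secret,
# and the bucket name is the one used in the Model URLs above.
S3_ENDPOINT="https://s3.amazonaws.com"
BUCKET="kubeedge"

# Hypothetical helper: prints the upload command for review instead of running it.
# $1 = local directory to upload, $2 = key prefix inside the bucket.
s3_upload_cmd() {
  echo "aws s3 cp $1 s3://$BUCKET/$2 --recursive --endpoint-url $S3_ENDPOINT"
}

# Show the command that would upload ./model to s3://kubeedge/model.
s3_upload_cmd ./model model
```

Once the printed command looks right, run it directly. The same helper could be reused for the dataset archive in the next section.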
### Prepare Dataset

* Download [dataset](https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/dataset.tar.gz).
* Put the unzipped dataset files into the bucket of your cloud storage service.
* Attach the created secret to the Dataset and create the Dataset.

```shell
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: incremental-dataset
spec:
  url: "s3://kubeedge/data/helmet_detection/train_data/train_data.txt"
  format: "txt"
  nodeName: cloud-node1
  credentialName: mysecret
EOF
```
### Prepare Image

This example uses the image:

```shell
kubeedge/sedna-example-incremental-learning-helmet-detection:v0.3.1
```

This image is generated by the script [build_image.sh](/examples/build_image.sh) and is used to create the training, evaluation, and inference workers.
### Prepare Job

* The inference, training, and evaluation workers can now be deployed to multiple nodes using `nodeName` and [nodeSelector](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector).
* Make sure the following local directory exists on the edge node.

```shell
mkdir -p /incremental_learning/video/
```

* Download [video](https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/video.tar.gz), unzip video.tar.gz, and put it into `/incremental_learning/video/`.

```shell
cd /incremental_learning/video/
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/video.tar.gz
tar -zxvf video.tar.gz
```
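Before creating the Job, it may help to confirm that the video is where the inference worker will look for it. The Job in this example mounts `/incremental_learning/video/` at `/video/` and reads `video.mp4` from it, so the sketch below checks for that file on the edge node. The helper name `check_video` is hypothetical.

```shell
# Hypothetical check: succeeds only if video.mp4 exists and is non-empty.
# $1 = directory to check (defaults to the host path used in this example).
check_video() {
  [ -s "${1:-/incremental_learning/video}/video.mp4" ]
}

if check_video; then
  echo "video.mp4 found"
else
  echo "video.mp4 missing"
fi
```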
* Attach the created secret to the Job and create the Job.

```shell
IMAGE=kubeedge/sedna-example-incremental-learning-helmet-detection:v0.3.1

kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: IncrementalLearningJob
metadata:
  name: helmet-detection-demo
spec:
  initialModel:
    name: "initial-model"
  dataset:
    name: "incremental-dataset"
    trainProb: 0.8
  trainSpec:
    template:
      spec:
        nodeName: cloud-node1
        containers:
          - image: $IMAGE
            name: train-worker
            imagePullPolicy: IfNotPresent
            args: ["train.py"]
            env:
              - name: "batch_size"
                value: "32"
              - name: "epochs"
                value: "1"
              - name: "input_shape"
                value: "352,640"
              - name: "class_names"
                value: "person,helmet,helmet-on,helmet-off"
              - name: "nms_threshold"
                value: "0.4"
              - name: "obj_threshold"
                value: "0.3"
    trigger:
      checkPeriodSeconds: 60
      timer:
        start: 02:00
        end: 20:00
      condition:
        operator: ">"
        threshold: 500
        metric: num_of_samples
  evalSpec:
    template:
      spec:
        nodeName: cloud-node2
        containers:
          - image: $IMAGE
            name: eval-worker
            imagePullPolicy: IfNotPresent
            args: ["eval.py"]
            env:
              - name: "input_shape"
                value: "352,640"
              - name: "class_names"
                value: "person,helmet,helmet-on,helmet-off"
  deploySpec:
    model:
      name: "deploy-model"
    trigger:
      condition:
        operator: ">"
        threshold: 0.1
        metric: precision_delta
    hardExampleMining:
      name: "IBT"
      parameters:
        - key: "threshold_img"
          value: "0.9"
        - key: "threshold_box"
          value: "0.9"
    template:
      spec:
        nodeName: edge-node
        containers:
          - image: $IMAGE
            name: infer-worker
            imagePullPolicy: IfNotPresent
            args: ["inference.py"]
            env:
              - name: "input_shape"
                value: "352,640"
              - name: "video_url"
                value: "file://video/video.mp4"
              - name: "HE_SAVED_URL"
                value: "/he_saved_url"
            volumeMounts:
              - name: localvideo
                mountPath: /video/
              - name: hedir
                mountPath: /he_saved_url
            resources: # user defined resources
              limits:
                memory: 2Gi
        volumes: # user defined volumes
          - name: localvideo
            hostPath:
              path: /incremental_learning/video/
              type: DirectoryOrCreate
          - name: hedir
            hostPath:
              path: /incremental_learning/he/
              type: DirectoryOrCreate
  credentialName: mysecret
  outputDir: "s3://kubeedge/incremental_learning/output"
EOF
```
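After the Job is created, you may want to confirm that it exists and see which nodes the workers were scheduled on. The sketch below wraps two standard `kubectl get` calls in a hypothetical helper named `check_job`; it assumes `kubectl` is configured for the cluster and that the Job was created in the current namespace.

```shell
JOB_NAME="helmet-detection-demo"   # name from the Job manifest above

# Hypothetical helper: show the Job resource, then list pods with their nodes.
check_job() {
  kubectl get incrementallearningjob "$JOB_NAME" -o yaml
  kubectl get pods -o wide
}
```

Run `check_job` from a machine with cluster access. With the `nodeName` settings in this example, the training worker is expected on `cloud-node1`, the evaluation worker on `cloud-node2`, and the inference worker on `edge-node`.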