# Using Incremental Learning Job In Mnist
This document describes how to use the inference part of an incremental learning job with Mnist. With an incremental learning job, the application can automatically retrain, evaluate, and update models based on the data generated at the edge.
## Mnist Experiment
### Prepare Model
```
Link: https://pan.baidu.com/s/1Gi5BJ_NQzqj66R8N5OXPzA
Extraction code: OSPP
```
### Prepare dataset
```
Link: https://pan.baidu.com/s/1Gi5BJ_NQzqj66R8N5OXPzA
Extraction code: OSPP
```
### Prepare Image
This example uses the image:
```
ymh13383894400/mnist-new:v1
```
This image is built by the script that creates the training, evaluation, and inference workers.
### Project creation and running
Create a Mnist project with the following layout:
```
├─flowunit:            # Flowunit directory
│ ├─mnist_preprocess:  # Preprocessing functional unit
│ ├─mnist_infer:       # TensorFlow inference functional unit
│ └─mnist_response:    # Functional unit that constructs HTTP responses
└─graph:               # Flowchart directory
  ├─mnist.toml:        # Inference flowchart
  └─test_mnist.py      # Inference Python file
```
Create the job. First set the names of the nodes that will run the workers:
```shell
WORKER_NODE="edge-node1"
INFER_NODE="edge-node2"
```
- Create Dataset
```yaml
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: incremental-dataset
spec:
  url: "/data/train_data.txt"
  format: "txt"
  nodeName: $WORKER_NODE
EOF
```
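The heredoc above relies on the shell expanding `$WORKER_NODE` before `kubectl` reads the manifest. As a rough illustration of that substitution step, the sketch below renders the same Dataset manifest with Python's `string.Template`. The helper name `render_dataset_manifest` is hypothetical and is not part of Sedna.

```python
from string import Template

# Same Dataset manifest as above; $WORKER_NODE is filled in at render time.
DATASET_TEMPLATE = Template("""\
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: incremental-dataset
spec:
  url: "/data/train_data.txt"
  format: "txt"
  nodeName: $WORKER_NODE
""")


def render_dataset_manifest(worker_node: str) -> str:
    """Return the Dataset manifest with the node name substituted.

    Hypothetical helper for illustration only. Template.substitute()
    raises KeyError when a placeholder is left unfilled, so a missing
    node name is caught before anything is sent to the cluster.
    """
    return DATASET_TEMPLATE.substitute(WORKER_NODE=worker_node)


if __name__ == "__main__":
    print(render_dataset_manifest("edge-node1"))
```

The printed text can be piped to `kubectl create -f -`, which is what the heredoc does in one step.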
- Create the initial model, which simulates the starting model in the incremental learning scenario.
```yaml
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: initial-model
spec:
  url: "/models/base_model"
  format: "ckpt"
EOF
```
- Create Deploy Model
```yaml
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: deploy-model
spec:
  url: "/models/deploy_model/saved_model.pb"
  format: "pb"
EOF
```
- Start The Incremental Learning Job
The inference part runs its pod from the ModelBox image.
```yaml
IMAGE=ymh13383894400/mnist-new:v1
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: IncrementalLearningJob
metadata:
  name: mnist-demo  # Kubernetes object names must be lowercase
spec:
  initialModel:
    name: "initial-model"
  dataset:
    name: "incremental-dataset"
    trainProb: 0.8
  trainSpec:
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: $IMAGE
            name: train-worker
            imagePullPolicy: IfNotPresent
            args: ["train.py"]
    trigger:
      checkPeriodSeconds: 60
      timer:
        start: 02:00
        end: 20:00
      condition:
        operator: ">"
        threshold: 500
        metric: num_of_samples
  evalSpec:
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: $IMAGE
            name: eval-worker
            imagePullPolicy: IfNotPresent
            args: ["eval.py"]
  deploySpec:
    model:
      name: "deploy-model"
      hotUpdateEnabled: true
      pollPeriodSeconds: 60
    trigger:
      condition:
        operator: ">"
        threshold: 0.1
        metric: precision_delta
    hardExampleMining:
      name: "IBT"
      parameters:
        - key: "threshold_img"
          value: "0.9"
        - key: "threshold_box"
          value: "0.9"
    template:
      spec:
        nodeName: $INFER_NODE
        containers:
          - image: $IMAGE
            name: infer-worker
            imagePullPolicy: IfNotPresent
            args: ["test_mnist.py"]
            volumeMounts:
              - name: localvideo
                mountPath: /video/
              - name: hedir
                mountPath: /he_saved_url
            resources: # user defined resources
              limits:
                memory: 2Gi
        volumes: # user defined volumes
          - name: localvideo
            hostPath:
              path: /incremental_learning/video/
              type: DirectoryOrCreate
          - name: hedir
            hostPath:
              path: /incremental_learning/he/
              type: DirectoryOrCreate
  outputDir: "/output"
EOF
```
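The two `trigger` blocks in the manifest read as threshold checks: training is considered once `num_of_samples` exceeds 500 inside the 02:00 to 20:00 window (polled every 60 seconds), and a new model is deployed when `precision_delta` exceeds 0.1. The sketch below mirrors that reading in plain Python. It is an assumption about the semantics, written for illustration, and not Sedna's controller code; all function names are hypothetical.

```python
import operator
from datetime import time

# Comparison operators a trigger condition might use (assumed set).
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}


def condition_met(value: float, op: str, threshold: float) -> bool:
    """Evaluate `value <op> threshold`, e.g. num_of_samples > 500."""
    return OPERATORS[op](value, threshold)


def in_window(now: time, start: time, end: time) -> bool:
    """True if `now` lies in [start, end]; also handles windows crossing midnight."""
    if start <= end:
        return start <= now <= end
    return now >= start or now <= end


def should_train(num_of_samples: int, now: time) -> bool:
    """Training trigger as configured above: more than 500 samples, 02:00 to 20:00."""
    return in_window(now, time(2, 0), time(20, 0)) and condition_met(
        num_of_samples, ">", 500
    )


def should_deploy(precision_delta: float) -> bool:
    """Deploy trigger as configured above: precision_delta greater than 0.1."""
    return condition_met(precision_delta, ">", 0.1)
```

Because the operator is a strict `>`, exactly 500 samples or a precision delta of exactly 0.1 would not fire the trigger under this reading.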
### Check Incremental Learning Job
Query the job status, using the name set in `metadata.name` when the job was created:
```shell
kubectl get incrementallearningjob mnist-demo
```
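To check the job from a script, the same query can be run with `-o json` and parsed. The sketch below is a minimal example under stated assumptions: the job is named `mnist-demo`, and the resource reports progress as a list under `status.conditions`. This document does not confirm that field layout, so adjust the lookup to whatever your cluster returns. Both function names are hypothetical.

```python
import json
import subprocess
from typing import Optional


def latest_condition(job: dict) -> Optional[dict]:
    """Return the last entry of status.conditions, or None if absent.

    Assumes (unverified) that the job reports progress as a list under
    status.conditions, as many Kubernetes resources do.
    """
    conditions = job.get("status", {}).get("conditions") or []
    return conditions[-1] if conditions else None


def fetch_job(name: str = "mnist-demo") -> dict:
    """Run kubectl and parse its JSON output; requires cluster access."""
    out = subprocess.run(
        ["kubectl", "get", "incrementallearningjob", name, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)


if __name__ == "__main__":
    print(latest_condition(fetch_job()))
```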