# Using Lifelong Learning Job in Thermal Comfort Prediction Scenario
This document introduces how to use a lifelong learning job in a thermal comfort prediction scenario.
With a lifelong learning job, our application can automatically retrain, evaluate,
and update models based on the data generated at the edge.
## Thermal Comfort Prediction Experiment
### Install Sedna
Follow the [Sedna installation document](/docs/setup/install.md) to install Sedna.
### Prepare Dataset
In this example, you can use the [ASHRAE Global Thermal Comfort Database II](https://datadryad.org/stash/dataset/doi:10.6078/D1F671) to initialize the lifelong learning job.
We provide a well-processed [dataset](https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/atcii-classifier/dataset.tar.gz), including training (`trainData.csv`), evaluation (`testData.csv`), and incremental (`trainData2.csv`) data.
```
cd /data
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/atcii-classifier/dataset.tar.gz
tar -zxvf dataset.tar.gz
```
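After extraction, it can help to sanity-check the CSVs before wiring them into the job. Below is a minimal stdlib-only sketch; the column names are placeholders for illustration, not the actual ASHRAE schema:

```python
import csv
import io

# Stand-in for /data/trainData.csv; the real ASHRAE columns differ.
sample = io.StringIO(
    "air_temperature,relative_humidity,thermal_sensation\n"
    "21.5,40,0\n"
    "27.0,55,1\n"
)

reader = csv.DictReader(sample)
rows = list(reader)
print(f"{len(rows)} samples, {len(reader.fieldnames)} columns")  # → 2 samples, 3 columns
```

The same `csv.DictReader` loop works on the real `trainData.csv` once it is downloaded.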
### Create Lifelong Learning Job
In this example, `$WORKER_NODE` is a custom node name; replace it with the name of the node you actually run on.
```
WORKER_NODE="edge-node"
```
Create Dataset
```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: lifelong-dataset
spec:
  url: "/data/trainData.csv"
  format: "csv"
  nodeName: $WORKER_NODE
EOF
```
Also, you can replace `trainData.csv` with `trainData2.csv`, which is contained in `dataset`, to trigger retraining.
Start The Lifelong Learning Job
```
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: LifelongLearningJob
metadata:
  name: atcii-classifier-demo
spec:
  dataset:
    name: "lifelong-dataset"
    trainProb: 0.8
  trainSpec:
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: kubeedge/sedna-example-lifelong-learning-atcii-classifier:v0.3.0
            name: train-worker
            imagePullPolicy: IfNotPresent
            args: ["train.py"]  # training script
            env:  # hyperparameters required for training
              - name: "early_stopping_rounds"
                value: "100"
              - name: "metric_name"
                value: "mlogloss"
    trigger:
      checkPeriodSeconds: 60
      timer:
        start: 02:00
        end: 24:00
      condition:
        operator: ">"
        threshold: 500
        metric: num_of_samples
  evalSpec:
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: kubeedge/sedna-example-lifelong-learning-atcii-classifier:v0.3.0
            name: eval-worker
            imagePullPolicy: IfNotPresent
            args: ["eval.py"]
            env:
              - name: "metrics"
                value: "precision_score"
              - name: "metric_param"
                value: "{'average': 'micro'}"
              - name: "model_threshold"  # threshold for filtering models to deploy
                value: "0.5"
  deploySpec:
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: kubeedge/sedna-example-lifelong-learning-atcii-classifier:v0.3.0
            name: infer-worker
            imagePullPolicy: IfNotPresent
            args: ["inference.py"]
            env:
              - name: "UT_SAVED_URL"  # path where unseen-task samples are saved
                value: "/ut_saved_url"
              - name: "infer_dataset_url"  # simulation of the inference samples
                value: "/data/testData.csv"
            volumeMounts:
              - name: utdir
                mountPath: /ut_saved_url
              - name: inferdata
                mountPath: /data/
            resources:  # user-defined resources
              limits:
                memory: 2Gi
        volumes:  # user-defined volumes
          - name: utdir
            hostPath:
              path: /lifelong/unseen_task/
              type: DirectoryOrCreate
          - name: inferdata
            hostPath:
              path: /data/
              type: DirectoryOrCreate
  outputDir: "/output"
EOF
```
>**Note**: `outputDir` can be set to an s3 storage URL to save artifacts (model, sample, etc.) into s3; follow [this guide](/examples/storage/s3/README.md) to set the credentials.
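The `model_threshold` environment variable in `evalSpec` controls which evaluated models are promoted to the deploy stage. A rough sketch of that filtering rule, using hypothetical result data rather than Sedna's internal structures:

```python
# Hypothetical evaluation results: model name -> precision_score.
eval_results = {
    "model_v1": 0.48,
    "model_v2": 0.63,
}

MODEL_THRESHOLD = 0.5  # mirrors the model_threshold env var in evalSpec

# Only models whose metric clears the threshold are deployed.
deployable = [name for name, score in eval_results.items()
              if score >= MODEL_THRESHOLD]
print(deployable)  # → ['model_v2']
```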
### Check Lifelong Learning Job
Query the job status:
```
kubectl get lifelonglearningjob atcii-classifier-demo
```
In the `lifelonglearningjob` resource atcii-classifier-demo, the following trigger is configured:
```
trigger:
  checkPeriodSeconds: 60
  timer:
    start: 02:00
    end: 20:00
  condition:
    operator: ">"
    threshold: 500
    metric: num_of_samples
```
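A rough reading of this trigger's semantics, as a hypothetical helper rather than Sedna's actual implementation: every `checkPeriodSeconds`, within the 02:00-20:00 window, retraining fires once `num_of_samples` exceeds the threshold.

```python
from datetime import time

def should_trigger(num_of_samples: int, now: time) -> bool:
    """True when the timer window is open and the condition
    (num_of_samples > 500) holds. Illustrative only."""
    in_window = time(2, 0) <= now <= time(20, 0)
    return in_window and num_of_samples > 500

print(should_trigger(600, time(10, 0)))  # → True
print(should_trigger(400, time(10, 0)))  # → False (too few samples)
print(should_trigger(600, time(23, 0)))  # → False (outside the window)
```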
### Unseen Task Samples Labeling
In the real world, we need to label the hard examples of unseen tasks, which are stored in `UT_SAVED_URL`, with annotation tools, and then add the labeled examples to the `Dataset`'s url.
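Feeding labeled samples back then amounts to appending rows to the CSV that the `Dataset` resource points at. A stdlib-only sketch; the function name and paths are illustrative, not part of Sedna:

```python
import csv

def append_labeled_samples(labeled_csv: str, dataset_csv: str) -> int:
    """Append labeled rows (skipping the header) to the dataset file;
    returns the number of samples added. Assumes both files share
    the same header. Illustrative helper only."""
    with open(labeled_csv, newline="") as src:
        rows = list(csv.reader(src))[1:]  # drop the header row
    with open(dataset_csv, "a", newline="") as dst:
        csv.writer(dst).writerows(rows)
    return len(rows)
```

For instance, `append_labeled_samples("/lifelong/unseen_task/labeled.csv", "/data/trainData.csv")` would merge newly labeled unseen-task samples into the training data, after which the trigger above can pick them up.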
### Effect Display
In this example, **false** and **failed** detections occur at the inference stage before lifelong learning.
After lifelong learning, precision on the dataset improves by 5.12%.
![img_1.png](image/effect_comparison.png)
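The precision above is computed with `precision_score` under `{'average': 'micro'}` (see `evalSpec`). For single-label multiclass predictions, micro-averaged precision reduces to overall accuracy; a toy sketch with made-up labels, not the real dataset:

```python
def micro_precision(y_true, y_pred):
    """Micro-averaged precision: total true positives over total
    predicted positives. For single-label multiclass output this
    equals plain accuracy."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_pred)

print(micro_precision([0, 1, 2, 1], [0, 1, 1, 1]))  # → 0.75
```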