
Merge pull request #143 from JoeyHwong-gk/lldoc

improve lifelong learning docs
tags/v0.3.1
KubeEdge Bot GitHub 4 years ago
commit e8c5ff3f8e
No known key found for this signature in database GPG Key ID: 4AEE18F83AFDEB23
4 changed files with 97 additions and 15 deletions
  1. build/crd-samples/sedna/lifelonglearningjob_v1alpha1.yaml (+80, -0)
  2. docs/proposals/images/lifelong-learning-controller.png (BIN)
  3. docs/proposals/lifelong-learning.md (+3, -3)
  4. examples/lifelong_learning/atcii/README.md (+14, -12)

build/crd-samples/sedna/lifelonglearningjob_v1alpha1.yaml (+80, -0)

@@ -0,0 +1,80 @@
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: LifelongLearningJob
metadata:
  name: atcii-classifier-demo
spec:
  dataset:
    name: "lifelong-dataset"
    trainProb: 0.8
  trainSpec:
    template:
      spec:
        nodeName: "edge-node"
        containers:
          - image: kubeedge/sedna-example-lifelong-learning-atcii-classifier:v0.3.0
            name: train-worker
            imagePullPolicy: IfNotPresent
            args: ["train.py"]
            env:
              - name: "early_stopping_rounds"
                value: "100"
              - name: "metric_name"
                value: "mlogloss"
    trigger:
      checkPeriodSeconds: 60
      timer:
        start: 02:00
        end: 24:00
      condition:
        operator: ">"
        threshold: 500
        metric: num_of_samples
  evalSpec:
    template:
      spec:
        nodeName: "edge-node"
        containers:
          - image: kubeedge/sedna-example-lifelong-learning-atcii-classifier:v0.3.0
            name: eval-worker
            imagePullPolicy: IfNotPresent
            args: ["eval.py"]
            env:
              - name: "metrics"
                value: "precision_score"
              - name: "metric_param"
                value: "{'average': 'micro'}"
              - name: "model_threshold"
                value: "0.5"
  deploySpec:
    template:
      spec:
        nodeName: "edge-node"
        containers:
          - image: kubeedge/sedna-example-lifelong-learning-atcii-classifier:v0.3.0
            name: infer-worker
            imagePullPolicy: IfNotPresent
            args: ["inference.py"]
            env:
              - name: "UT_SAVED_URL"
                value: "/ut_saved_url"
              - name: "infer_dataset_url"
                value: "/data/testData.csv"
            volumeMounts:
              - name: utdir
                mountPath: /ut_saved_url
              - name: inferdata
                mountPath: /data/
            resources:
              limits:
                memory: 2Gi
        volumes:
          - name: utdir
            hostPath:
              path: /lifelong/unseen_task/
              type: DirectoryOrCreate
          - name: inferdata
            hostPath:
              path: /data/
              type: DirectoryOrCreate
  outputDir: "/output"
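
Editor's note: the `trigger` block in this sample combines a daily `timer` window with a `condition` on `num_of_samples`. The following is a minimal sketch of how such a trigger could be evaluated, assuming (this is not confirmed by the commit) that training fires only when the current time is inside the window and the condition holds. The function names `in_timer_window` and `should_trigger` are hypothetical and are not part of Sedna.

```python
import operator as op
from datetime import time

# Comparison operators the sketch understands; the sample above only uses ">".
_OPERATORS = {">": op.gt, ">=": op.ge, "<": op.lt, "<=": op.le, "==": op.eq}


def _to_minutes(hhmm: str) -> int:
    """Convert an "HH:MM" string to minutes since midnight ("24:00" -> 1440)."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def in_timer_window(now: time, start: str, end: str) -> bool:
    """Return True if `now` lies in the half-open window [start, end)."""
    now_minutes = now.hour * 60 + now.minute
    return _to_minutes(start) <= now_minutes < _to_minutes(end)


def should_trigger(now: time, metric_value: float, trigger: dict) -> bool:
    """Assumed semantics: inside the timer window AND the condition is met."""
    timer = trigger["timer"]
    condition = trigger["condition"]
    if not in_timer_window(now, timer["start"], timer["end"]):
        return False
    compare = _OPERATORS[condition["operator"]]
    return compare(metric_value, condition["threshold"])


# Values copied from the sample; times are written as strings here.
sample_trigger = {
    "checkPeriodSeconds": 60,
    "timer": {"start": "02:00", "end": "24:00"},
    "condition": {"operator": ">", "threshold": 500, "metric": "num_of_samples"},
}
```

With these assumed semantics, 501 samples at 03:00 would start training, while 501 samples at 01:00 or 500 samples at 03:00 would not.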

docs/proposals/images/lifelong-learning-controller.png (BIN)

Before: Width: 1428 | Height: 412 | Size: 37 kB
After: Width: 1429 | Height: 413 | Size: 33 kB

docs/proposals/lifelong-learning.md (+3, -3)

@@ -65,11 +65,11 @@ The tables below summarize the group, kind and API version details for the CRD.
|Kind | LifelongLearningJob |

### Lifelong learning CRD
-See the [crd source](/build/crds/sedna/Lifelonglearningjob_v1alpha1.yaml) for details.
+See the [crd source](/build/crds/sedna/sedna.io_lifelonglearningjobs.yaml) for details.

### Lifelong learning job type definition

-See the [golang source](/pkg/apis/sedna/v1alpha1/Lifelongllearningjob_types.go) for details.
+See the [golang source](/pkg/apis/sedna/v1alpha1/lifelonglearningjob_types.go) for details.

#### Validation
[Open API v3 Schema based validation](https://kubernetes.io/docs/tasks/access-kubernetes-api/custom-resources/custom-resource-definitions/#validation) can be used to guard against bad requests.
@@ -80,7 +80,7 @@ Here is a list of validations we need to support :
1. The edgenode name specified in the crd should exist in k8s.
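
Editor's note: the node-name rule above can be illustrated with a small check over a job manifest that has already been parsed into a dictionary. This is only a sketch: `missing_node_names` is a hypothetical helper, and `known_nodes` is assumed to be collected elsewhere (for example from the cluster's node list). It does not show how Sedna itself performs the validation.

```python
def missing_node_names(job: dict, known_nodes: set) -> list:
    """Return (section, nodeName) pairs whose nodeName is not a known node.

    `job` is a LifelongLearningJob manifest parsed into nested dicts.
    Sections without a nodeName are skipped rather than reported.
    """
    missing = []
    spec = job.get("spec", {})
    for section in ("trainSpec", "evalSpec", "deploySpec"):
        pod_spec = spec.get(section, {}).get("template", {}).get("spec", {})
        node_name = pod_spec.get("nodeName")
        if node_name is not None and node_name not in known_nodes:
            missing.append((section, node_name))
    return missing


# Trimmed-down manifest using the node name from the sample job.
example_job = {
    "spec": {
        "trainSpec": {"template": {"spec": {"nodeName": "edge-node"}}},
        "evalSpec": {"template": {"spec": {"nodeName": "edge-node"}}},
        "deploySpec": {"template": {"spec": {"nodeName": "edge-node"}}},
    }
}
```

An empty result means every referenced node name was found in `known_nodes`.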

### Lifelong learning job sample
-See the [source](/build/crd-samples/sedna/Lifelonglearningjob_v1alpha1.yaml) for an example.
+See the [source](/build/crd-samples/sedna/lifelonglearningjob_v1alpha1.yaml) for an example.
## Controller Design



examples/lifelong_learning/atcii/README.md (+14, -12)

@@ -1,8 +1,8 @@
# Using Lifelong Learning Job in Thermal Comfort Prediction Scenario

This document introduces how to use lifelong learning job in thermal comfort prediction scenario.
-Using the lifelong learning job, our application can automatically retrains, evaluates,
-and updates models based on the data generated at the edge.
+Using the lifelong learning job, our application can automatically retrain, evaluate,
+and update models based on the data generated at the edge.

## Thermal Comfort Prediction Experiment

@@ -16,15 +16,15 @@ In this example, you can use [ASHRAE Global Thermal Comfort Database II](https:/



-download [datasets](https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/atcii-classifier/dataset.tar.gz), including train, evaluation and incremental dataset.
+We provide a well-processed [datasets](https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/atcii-classifier/dataset.tar.gz), including train (`trainData.csv`), evaluation (`testData.csv`) and incremental (`trainData2.csv`) dataset.
```
-cd /
+cd /data
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/atcii-classifier/dataset.tar.gz
tar -zxvf dataset.tar.gz
```

### Create Lifelong Job
-in this example, `$WORKER_NODE` is a custom node, you can fill it which you actually run.
+In this example, `$WORKER_NODE` is a custom node, you can fill it which you actually run.


```
@@ -39,13 +39,13 @@ kind: Dataset
 metadata:
   name: lifelong-dataset
 spec:
-  url: "/data/lifelong_learning/trainData.csv"
+  url: "/data/trainData.csv"
   format: "csv"
   nodeName: $WORKER_NODE
 EOF
```

-Also, you can trigger retraining by use `incremental Dataset`[trainData2.csv] to replace `trainData.csv`
+Also, you can replace `trainData.csv` with `trainData2.csv` which contained in `dataset` to trigger retraining.
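
Editor's note: the replacement step described in the line above could be scripted roughly as follows. This is a sketch under assumptions: `swap_training_data` is a hypothetical helper, the `.bak` backup is an addition not mentioned in the README, and the data directory (`/data` in this example) must match the `url` of the `Dataset` object. Whether retraining then actually starts still depends on the job's `trigger` settings.

```python
import shutil
from pathlib import Path


def swap_training_data(
    data_dir: str,
    incremental: str = "trainData2.csv",
    target: str = "trainData.csv",
) -> Path:
    """Overwrite the training file with the incremental one, keeping a backup.

    The current `target` is first copied to `<target>.bak` so the swap can be
    undone by hand. Raises FileNotFoundError if the incremental file is missing.
    """
    base = Path(data_dir)
    src = base / incremental
    dst = base / target
    if not src.is_file():
        raise FileNotFoundError(src)
    if dst.exists():
        shutil.copy2(dst, dst.with_name(dst.name + ".bak"))
    shutil.copy2(src, dst)
    return dst
```

A call such as `swap_training_data("/data")` would leave the new samples in `/data/trainData.csv` and the previous file in `/data/trainData.csv.bak`.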

Start The Lifelong Learning Job

@@ -68,8 +68,8 @@ spec:
           - image: kubeedge/sedna-example-lifelong-learning-atcii-classifier:v0.3.0
             name: train-worker
             imagePullPolicy: IfNotPresent
-            args: ["train.py"]
-            env:
+            args: ["train.py"] # training script
+            env: # Hyperparameters required for training
               - name: "early_stopping_rounds"
                 value: "100"
               - name: "metric_name"
@@ -97,7 +97,7 @@ spec:
                 value: "precision_score"
               - name: "metric_param"
                 value: "{'average': 'micro'}"
-              - name: "model_threshold"
+              - name: "model_threshold" # Threshold for filtering deploy models
                 value: "0.5"
   deploySpec:
     template:
@@ -109,9 +109,9 @@ spec:
             imagePullPolicy: IfNotPresent
             args: ["inference.py"]
             env:
-              - name: "UT_SAVED_URL"
+              - name: "UT_SAVED_URL" # unseen tasks save path
                 value: "/ut_saved_url"
-              - name: "infer_dataset_url"
+              - name: "infer_dataset_url" # simulation of the inference samples
                 value: "/data/testData.csv"
             volumeMounts:
               - name: utdir
@@ -134,6 +134,8 @@ spec:
EOF
```

+>**Note**: `outputDir` can be set as s3 storage url to save artifacts(model, sample, etc.) into s3, and follow [this](/examples/storage/s3/README.md) to set the credentials.

### Check Lifelong Learning Job
query the service status
```

