|
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481 |
-
- # Quick Start
-
- ## Introduction
-
- Sedna provide some examples of running Sedna jobs in [here](/examples/README.md)
-
- Here is a general guide to quick start an incremental learning job.
-
- ### Get Sedna
-
- You can find the latest Sedna release [here](https://github.com/kubeedge/sedna/releases).
-
- ### Deploying Sedna
-
- Sedna provides two deployment methods, which can be selected according to your actual situation:
-
- - Install Sedna on a cluster Step By Step: [guide here](setup/install.md).
- - Install Sedna AllinOne : [guide here](setup/local-up.md).
-
- ### Component
- Sedna consists of the following components:
-
- 
-
- #### GlobalManager
- * Unified edge-cloud synergy AI task management
- * Cross edge-cloud synergy management and collaboration
- * Central Configuration Management
-
- #### LocalController
- * Local process control of edge-cloud synergy AI tasks
- * Local general management: model, dataset, and status synchronization
-
-
- #### Worker
- * Do inference or training, based on existing ML framework.
- * Launch on demand, imagine they are docker containers.
- * Different workers for different features.
- * Could run on edge or cloud.
-
-
- #### Lib
- * Expose the Edge AI features to applications, i.e. training or inference programs.
-
-
- ### System Design
-
- There are three stages in a [incremental learning job](./proposals/incremental-learning.md): train/eval/deploy.
-
- 
-
- ## Deployment Guide
-
- ### 1. Prepare
-
- #### 1.1 Deployment Planning
-
- In this example, there is only one host with two nodes, which had creating a Kubernetes cluster with `kind`.
-
- | NAME | ROLES | Ip Address | Operating System | Host Configuration | Storage | Deployment Module |
- | ----- | ------- | ----------------------------- | ----------------------- | ------------------ | ------- | ------------------------------------------------------------ |
- | edge-node | agent,edge | 192.168.0.233 | Ubuntu 18.04.5 LTS | 8C16G | 500G | LC,lib, inference worker |
- | sedna-control-plane | control-plane,master | 172.18.0.2 | Ubuntu 20.10 | 8C16G | 500G | GM,LC,lib,training worker,evaluate worker |
-
- #### 1.2 Network Requirements
-
- In this example the node **sedna-control-plane** has a internal-ip `172.18.0.2`, and **edge-node** can access it.
-
- ### 2. Project Deployment
-
- #### 2.1 (optional) create virtual env
-
- ```bash
- python3.6 -m venv venv
- source venv/bin/activate
- pip3 install -U pip
- ```
-
- #### 2.2 install sedna SDK
-
- ```bash
- cd $SENDA_ROOT/lib
- python3.6 setup.py bdist_wheel
- pip3 install dist/sedna*.whl
- ```
-
- #### 2.3 Prepare your machine learning model and datasets
-
- ##### 2.3.1 Encapsulate an Estimators
-
- Sedna implements several pre-made Estimators in [example](/examples), your can find them from the python scripts called `interface`.
- Sedna supports the Estimators build from popular AI frameworks, such as TensorFlow, Pytorch, PaddlePaddle, MindSpore. Also Custom estimators can be used according to our interface document.
- All Estimators—pre-made or custom ones—are classes should encapsulate the following actions:
- - Training
- - Evaluation
- - Prediction
- - Export/load
-
- Follow [here](/lib/sedna/README.md) for more details, a [toy_example](/examples/incremental_learning/helmet_detection/training/interface.py) like:
-
-
- ```python
-
- os.environ['BACKEND_TYPE'] = 'TENSORFLOW'
-
- class Estimator:
-
- def __init__(self, **kwargs):
- ...
-
- def train(self, train_data, valid_data=None, **kwargs):
- ...
-
- def evaluate(self, data, **kwargs):
- ...
-
- def predict(self, data, **kwargs):
- ...
-
- def load(self, model_url, **kwargs):
- ...
-
- def save(self, model_path, **kwargs):
- ...
-
- def get_weights(self):
- ...
-
- def set_weights(self, weights):
- ...
- ```
-
- ##### 2.3.2 Dataset prepare
-
- In incremental_learning jobs, the following files will be indispensable:
-
- - base model: tensorflow object detection Fine-tuning a model from an existing checkpoint.
- - deploy model: tensorflow object detection model, for inference.
- - train data: images with label use for Fine-tuning model.
- - test data: video stream use for model inference.
-
- ```bash
- # download models, including base model and deploy model.
-
- cd /
- wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/models.tar.gz
- tar -zxvf models.tar.gz
-
- # download train data
-
- cd /data/helmet_detection # notes: files here will be monitored and used to trigger the incremental training
- wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/dataset.tar.gz
- tar -zxvf dataset.tar.gz
-
- # download test data
-
- cd /incremental_learning/video/
- wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/video.tar.gz
- tar -zxvf video.tar.gz
-
- ```
-
- #### 2.3.3 Scripts prepare
-
- In incremental_learning jobs, the following scripts will be indispensable:
-
- - train.py: script for model fine-tuning/training.
- - eval.py: script for model evaluate.
- - inference.py: script for data inference.
-
- You can also find demos [here](/examples/incremental_learning/helmet_detection).
-
- Some interfaces should be learn in job pipeline:
-
- - `BaseConfig` provides the capability of obtaining the config from env
-
- ```python
-
- from sedna.common.config import BaseConfig
-
- train_dataset_url = BaseConfig.train_dataset_url
- model_url = BaseConfig.model_url
-
- ```
-
- - `Context` provides the capability of obtaining the context from CRD
-
- ```python
- from sedna.common.config import Context
-
- obj_threshold = Context.get_parameters("obj_threshold")
- nms_threshold = Context.get_parameters("nms_threshold")
- input_shape = Context.get_parameters("input_shape")
- epochs = Context.get_parameters('epochs')
- batch_size = Context.get_parameters('batch_size')
-
- ```
-
- - `datasources` base class, as that core feature of sedna require identifying the features and labels from data input, we specify that the first parameter for train/evaluate of the ML framework
-
- ```python
- from sedna.datasources import BaseDataSource
-
-
- train_data = BaseDataSource(data_type="train")
- train_data.x = []
- train_data.y = []
- for item in mnist_ds.create_dict_iterator():
- train_data.x.append(item["image"].asnumpy())
- train_data.y.append(item["label"].asnumpy())
- ```
-
- - `sedna.core` contain all edge-cloud features, Please note that each feature has its own parameters.
- - **Hard Example Mining Algorithms** in IncrementalLearning named `hard_example_mining`
-
- ```python
- from sedna.core.incremental_learning import IncrementalLearning
-
- hard_example_mining = IncrementalLearning.get_hem_algorithm_from_config(
- threshold_img=0.9
- )
-
- # initial an incremental instance
- incremental_instance = IncrementalLearning(
- estimator=Estimator,
- hard_example_mining=hem_dict
- )
-
- # Call the interface according to the job state
-
- # train.py
- incremental_instance.train(train_data=train_data, epochs=epochs,
- batch_size=batch_size,
- class_names=class_names,
- input_shape=input_shape,
- obj_threshold=obj_threshold,
- nms_threshold=nms_threshold)
-
- # inference
- results, _, is_hard_example = incremental_instance.inference(
- data, input_shape=input_shape)
-
-
- ```
-
-
- ### 3. Configuration
-
- ##### 3.1 Prepare Image
- This example uses the image:
- ```
- kubeedge/sedna-example-incremental-learning-helmet-detection:v0.4.0
- ```
-
- This image is generated by the script [build_images.sh](/examples/build_image.sh), used for creating training, eval and inference worker.
-
- ##### 3.2 Create Incremental Job
- In this example, `$WORKER_NODE` is a custom node, you can fill it which you actually run.
-
- ```
- WORKER_NODE="edge-node"
- ```
-
- - Create Dataset
-
- ```
- kubectl create -f - <<EOF
- apiVersion: sedna.io/v1alpha1
- kind: Dataset
- metadata:
- name: incremental-dataset
- spec:
- url: "/data/helmet_detection/train_data/train_data.txt"
- format: "txt"
- nodeName: $WORKER_NODE
- EOF
- ```
-
- - Create Initial Model to simulate the initial model in incremental learning scenario.
-
- ```
- kubectl create -f - <<EOF
- apiVersion: sedna.io/v1alpha1
- kind: Model
- metadata:
- name: initial-model
- spec:
- url : "/models/base_model"
- format: "ckpt"
- EOF
- ```
-
- - Create Deploy Model
-
- ```
- kubectl create -f - <<EOF
- apiVersion: sedna.io/v1alpha1
- kind: Model
- metadata:
- name: deploy-model
- spec:
- url : "/models/deploy_model/saved_model.pb"
- format: "pb"
- EOF
- ```
-
-
- ### 4. Run
-
- * incremental learning supports hot model updates and cold model updates. Job support
- cold model updates default. If you want to use hot model updates, please to add the following fields:
-
- ```yaml
- deploySpec:
- model:
- hotUpdateEnabled: true
- pollPeriodSeconds: 60 # default value is 60
- ```
-
- * create the job:
-
- ```
- IMAGE=kubeedge/sedna-example-incremental-learning-helmet-detection:v0.4.0
-
- kubectl create -f - <<EOF
- apiVersion: sedna.io/v1alpha1
- kind: IncrementalLearningJob
- metadata:
- name: helmet-detection-demo
- spec:
- initialModel:
- name: "initial-model"
- dataset:
- name: "incremental-dataset"
- trainProb: 0.8
- trainSpec:
- template:
- spec:
- nodeName: $WORKER_NODE
- containers:
- - image: $IMAGE
- name: train-worker
- imagePullPolicy: IfNotPresent
- args: ["train.py"]
- env:
- - name: "batch_size"
- value: "32"
- - name: "epochs"
- value: "1"
- - name: "input_shape"
- value: "352,640"
- - name: "class_names"
- value: "person,helmet,helmet-on,helmet-off"
- - name: "nms_threshold"
- value: "0.4"
- - name: "obj_threshold"
- value: "0.3"
- trigger:
- checkPeriodSeconds: 60
- timer:
- start: 02:00
- end: 20:00
- condition:
- operator: ">"
- threshold: 500
- metric: num_of_samples
- evalSpec:
- template:
- spec:
- nodeName: $WORKER_NODE
- containers:
- - image: $IMAGE
- name: eval-worker
- imagePullPolicy: IfNotPresent
- args: ["eval.py"]
- env:
- - name: "input_shape"
- value: "352,640"
- - name: "class_names"
- value: "person,helmet,helmet-on,helmet-off"
- deploySpec:
- model:
- name: "deploy-model"
- hotUpdateEnabled: true
- pollPeriodSeconds: 60
- trigger:
- condition:
- operator: ">"
- threshold: 0.1
- metric: precision_delta
- hardExampleMining:
- name: "IBT"
- parameters:
- - key: "threshold_img"
- value: "0.9"
- - key: "threshold_box"
- value: "0.9"
- template:
- spec:
- nodeName: $WORKER_NODE
- containers:
- - image: $IMAGE
- name: infer-worker
- imagePullPolicy: IfNotPresent
- args: ["inference.py"]
- env:
- - name: "input_shape"
- value: "352,640"
- - name: "video_url"
- value: "file://video/video.mp4"
- - name: "HE_SAVED_URL"
- value: "/he_saved_url"
- volumeMounts:
- - name: localvideo
- mountPath: /video/
- - name: hedir
- mountPath: /he_saved_url
- resources: # user defined resources
- limits:
- memory: 2Gi
- volumes: # user defined volumes
- - name: localvideo
- hostPath:
- path: /incremental_learning/video/
- type: DirectoryorCreate
- - name: hedir
- hostPath:
- path: /incremental_learning/he/
- type: DirectoryorCreate
- outputDir: "/output"
- EOF
- ```
-
- 1. The `Dataset` describes data with labels and `HE_SAVED_URL` indicates the address of the deploy container for uploading hard examples. Users will mark label for the hard examples in the address.
- 2. Ensure that the path of outputDir in the YAML file exists on your node. This path will be directly mounted to the container.
-
-
- ### 5. Monitor
-
- ### Check Incremental Learning Job
- Query the service status:
-
- ```
- kubectl get incrementallearningjob helmet-detection-demo
- ```
-
- In the `IncrementalLearningJob` resource helmet-detection-demo, the following trigger is configured:
-
- ```
- trigger:
- checkPeriodSeconds: 60
- timer:
- start: 02:00
- end: 20:00
- condition:
- operator: ">"
- threshold: 500
- metric: num_of_samples
- ```
-
- ## API
-
- - control-plane: Please refer to this [link](api/crd).
- - Lib: Please refer to this [link](api/lib).
-
- ## Contributing
-
- Contributions are very welcome!
-
- - control-plane: Please refer to this [link](contributing/control-plane/development.md).
- - Lib: Please refer to this [link](contributing/lib/development.md).
-
- ## Community
-
- Sedna is an open source project and in the spirit of openness and freedom, we welcome new contributors to join us.
- You can get in touch with the community according to the ways:
- * [Github Issues](https://github.com/kubeedge/sedna/issues)
- * [Regular Community Meeting](https://zoom.us/j/4167237304)
- * [slack channel](https://app.slack.com/client/TDZ5TGXQW/C01EG84REVB/details)
-
|