Sedna provides several examples of running Sedna jobs here.
This is a general guide to quickly starting an incremental learning job.
You can find the latest Sedna release here.
Sedna provides two deployment methods, which can be selected according to your actual situation:
Sedna consists of the following components:
There are three stages in an incremental learning job: train/eval/deploy.
In this example there is only one host with two nodes, on which a Kubernetes cluster was created with kind.
| NAME | ROLES | IP Address | Operating System | Host Configuration | Storage | Deployment Module |
|---|---|---|---|---|---|---|
| edge-node | agent,edge | 192.168.0.233 | Ubuntu 18.04.5 LTS | 8C16G | 500G | LC,lib, inference worker |
| sedna-control-plane | control-plane,master | 172.18.0.2 | Ubuntu 20.10 | 8C16G | 500G | GM,LC,lib,training worker,evaluate worker |
In this example the node sedna-control-plane has an internal IP of 172.18.0.2, which edge-node can access.
python3.6 -m venv venv
source venv/bin/activate
pip3 install -U pip
cd $SEDNA_ROOT/lib
python3.6 setup.py bdist_wheel
pip3 install dist/sedna*.whl
Sedna implements several pre-made Estimators in the examples; you can find them in the Python scripts named interface.py.
Sedna supports Estimators built with popular AI frameworks such as TensorFlow, PyTorch, PaddlePaddle, and MindSpore. Custom estimators can also be used according to the interface document.
All Estimators, pre-made or custom, are classes that should encapsulate the following actions:
Follow here for more details; a toy example looks like this:
import os

os.environ['BACKEND_TYPE'] = 'TENSORFLOW'


class Estimator:

    def __init__(self, **kwargs):
        ...

    def train(self, train_data, valid_data=None, **kwargs):
        ...

    def evaluate(self, data, **kwargs):
        ...

    def predict(self, data, **kwargs):
        ...

    def load(self, model_url, **kwargs):
        ...

    def save(self, model_path, **kwargs):
        ...

    def get_weights(self):
        ...

    def set_weights(self, weights):
        ...
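To make the contract above concrete, here is a minimal pure-Python estimator that satisfies it with a trivial majority-label model. It is illustrative only and not part of the Sedna API; the data objects it consumes only need `.x`/`.y` attributes, following the BaseDataSource convention.

```python
import json
from collections import Counter


class MajorityEstimator:
    """Toy estimator: always predicts the most frequent training label."""

    def __init__(self, **kwargs):
        self.label = None

    def train(self, train_data, valid_data=None, **kwargs):
        # train_data follows the BaseDataSource convention: .x features, .y labels
        self.label = Counter(train_data.y).most_common(1)[0][0]

    def evaluate(self, data, **kwargs):
        correct = sum(1 for y in data.y if y == self.label)
        return correct / len(data.y)

    def predict(self, data, **kwargs):
        return [self.label for _ in data.x]

    def load(self, model_url, **kwargs):
        with open(model_url) as f:
            self.label = json.load(f)["label"]

    def save(self, model_path, **kwargs):
        with open(model_path, "w") as f:
            json.dump({"label": self.label}, f)

    def get_weights(self):
        return {"label": self.label}

    def set_weights(self, weights):
        self.label = weights["label"]
```

Real estimators wrap a framework model in the same shape; the pipeline only relies on these method names.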
In incremental learning jobs, the following files are indispensable:
# download models, including base model and deploy model.
cd /
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/models.tar.gz
tar -zxvf models.tar.gz
# download train data
cd /data/helmet_detection # note: files here will be monitored and used to trigger incremental training
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/dataset.tar.gz
tar -zxvf dataset.tar.gz
# download test data
cd /incremental_learning/video/
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/video.tar.gz
tar -zxvf video.tar.gz
In incremental learning jobs, the following scripts are indispensable:
You can also find demos here.
The following interfaces should be understood in the job pipeline:
BaseConfig provides the capability of obtaining the config from environment variables:
from sedna.common.config import BaseConfig
train_dataset_url = BaseConfig.train_dataset_url
model_url = BaseConfig.model_url
Context provides the capability of obtaining the context from the CRD:
from sedna.common.config import Context
obj_threshold = Context.get_parameters("obj_threshold")
nms_threshold = Context.get_parameters("nms_threshold")
input_shape = Context.get_parameters("input_shape")
epochs = Context.get_parameters('epochs')
batch_size = Context.get_parameters('batch_size')
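One thing to keep in mind: values obtained this way come from environment variables and therefore arrive as strings, so the worker must coerce them to the types it needs. A small sketch, using hypothetical helpers `parse_shape` and `to_float` that are not part of the Sedna API:

```python
def parse_shape(value, default=(352, 640)):
    """Parse an 'H,W' string such as '352,640' into a tuple of ints."""
    if not value:
        return default
    return tuple(int(v) for v in value.split(","))


def to_float(value, default):
    """Convert a string parameter to float, falling back to a default."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default
```

For example, `parse_shape("352,640")` yields `(352, 640)` and `to_float(obj_threshold, 0.3)` keeps a sane default when the parameter is unset.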
The datasources base class: because Sedna's core features need to identify the features and labels in the input data, it is specified as the first parameter for train/evaluate of the ML framework:
from sedna.datasources import BaseDataSource
train_data = BaseDataSource(data_type="train")
train_data.x = []
train_data.y = []
for item in mnist_ds.create_dict_iterator():
    train_data.x.append(item["image"].asnumpy())
    train_data.y.append(item["label"].asnumpy())
sedna.core contains all edge-cloud features. Please note that each feature has its own parameters.
hard_example_mining:
from sedna.core.incremental_learning import IncrementalLearning

hard_example_mining = IncrementalLearning.get_hem_algorithm_from_config(
    threshold_img=0.9
)

# initialize an incremental learning instance
incremental_instance = IncrementalLearning(
    estimator=Estimator,
    hard_example_mining=hard_example_mining
)
# Call the interface according to the job state
# train.py
incremental_instance.train(train_data=train_data, epochs=epochs,
                           batch_size=batch_size,
                           class_names=class_names,
                           input_shape=input_shape,
                           obj_threshold=obj_threshold,
                           nms_threshold=nms_threshold)

# inference
results, _, is_hard_example = incremental_instance.inference(
    data, input_shape=input_shape)
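When `is_hard_example` is true, the inference worker typically persists the frame under the directory given by HE_SAVED_URL so a user can label it later. A sketch of that step, using a hypothetical `save_hard_example` helper that is not part of the Sedna API:

```python
import os
import time


def save_hard_example(frame_bytes, out_dir):
    """Persist one hard-example frame under a timestamped filename."""
    os.makedirs(out_dir, exist_ok=True)
    name = "hard_%d.jpg" % int(time.time() * 1000)
    path = os.path.join(out_dir, name)
    with open(path, "wb") as f:
        f.write(frame_bytes)
    return path
```

In the job below, HE_SAVED_URL is mounted from the host path /incremental_learning/he/, which is where those labeled frames would accumulate.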
This example uses the image:
kubeedge/sedna-example-incremental-learning-helmet-detection:v0.4.0
This image is generated by the script build_images.sh and is used to create the training, eval, and inference workers.
In this example, $WORKER_NODE is a custom node; replace it with the name of the node you actually run on.
WORKER_NODE="edge-node"
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: incremental-dataset
spec:
  url: "/data/helmet_detection/train_data/train_data.txt"
  format: "txt"
  nodeName: $WORKER_NODE
EOF
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: initial-model
spec:
  url: "/models/base_model"
  format: "ckpt"
EOF
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: deploy-model
spec:
  url: "/models/deploy_model/saved_model.pb"
  format: "pb"
EOF
deploySpec:
  model:
    hotUpdateEnabled: true
    pollPeriodSeconds: 60  # default value is 60
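Conceptually, pollPeriodSeconds means the inference worker re-checks the deployed model on a fixed period and reloads it when it has changed. A generic mtime-based sketch of that polling idea, which is an illustration only and not Sedna's actual hot-update implementation:

```python
import os


def model_updated(model_path, last_mtime):
    """Return (changed, mtime): whether model_path changed since last_mtime."""
    try:
        mtime = os.path.getmtime(model_path)
    except OSError:
        # model file missing or unreadable: treat as unchanged
        return False, last_mtime
    return mtime > last_mtime, mtime
```

A worker loop would call this once per poll period and invoke `Estimator.load` only when the first element is true.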
IMAGE=kubeedge/sedna-example-incremental-learning-helmet-detection:v0.4.0
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: IncrementalLearningJob
metadata:
  name: helmet-detection-demo
spec:
  initialModel:
    name: "initial-model"
  dataset:
    name: "incremental-dataset"
    trainProb: 0.8
  trainSpec:
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: $IMAGE
            name: train-worker
            imagePullPolicy: IfNotPresent
            args: ["train.py"]
            env:
              - name: "batch_size"
                value: "32"
              - name: "epochs"
                value: "1"
              - name: "input_shape"
                value: "352,640"
              - name: "class_names"
                value: "person,helmet,helmet-on,helmet-off"
              - name: "nms_threshold"
                value: "0.4"
              - name: "obj_threshold"
                value: "0.3"
    trigger:
      checkPeriodSeconds: 60
      timer:
        start: 02:00
        end: 20:00
      condition:
        operator: ">"
        threshold: 500
        metric: num_of_samples
  evalSpec:
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: $IMAGE
            name: eval-worker
            imagePullPolicy: IfNotPresent
            args: ["eval.py"]
            env:
              - name: "input_shape"
                value: "352,640"
              - name: "class_names"
                value: "person,helmet,helmet-on,helmet-off"
  deploySpec:
    model:
      name: "deploy-model"
      hotUpdateEnabled: true
      pollPeriodSeconds: 60
    trigger:
      condition:
        operator: ">"
        threshold: 0.1
        metric: precision_delta
    hardExampleMining:
      name: "IBT"
      parameters:
        - key: "threshold_img"
          value: "0.9"
        - key: "threshold_box"
          value: "0.9"
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: $IMAGE
            name: infer-worker
            imagePullPolicy: IfNotPresent
            args: ["inference.py"]
            env:
              - name: "input_shape"
                value: "352,640"
              - name: "video_url"
                value: "file://video/video.mp4"
              - name: "HE_SAVED_URL"
                value: "/he_saved_url"
            volumeMounts:
              - name: localvideo
                mountPath: /video/
              - name: hedir
                mountPath: /he_saved_url
            resources:  # user defined resources
              limits:
                memory: 2Gi
        volumes:  # user defined volumes
          - name: localvideo
            hostPath:
              path: /incremental_learning/video/
              type: DirectoryOrCreate
          - name: hedir
            hostPath:
              path: /incremental_learning/he/
              type: DirectoryOrCreate
  outputDir: "/output"
EOF
Dataset describes the data with labels, and HE_SAVED_URL indicates the address in the deploy container for uploading hard examples; users will label the hard examples at that address.
Query the service status:
kubectl get incrementallearningjob helmet-detection-demo
In the IncrementalLearningJob resource helmet-detection-demo, the following trigger is configured:
trigger:
  checkPeriodSeconds: 60
  timer:
    start: 02:00
    end: 20:00
  condition:
    operator: ">"
    threshold: 500
    metric: num_of_samples
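The semantics: every checkPeriodSeconds, training fires only if the current time falls inside the timer window and num_of_samples satisfies the condition. A sketch with a hypothetical `should_trigger` helper mirroring the CRD fields (not Sedna's actual trigger code):

```python
import operator as op

# map the CRD's operator strings to Python comparisons
_OPS = {">": op.gt, ">=": op.ge, "<": op.lt, "<=": op.le, "=": op.eq}


def should_trigger(metric_value, threshold, operator_sym, now, start, end):
    """Evaluate the trigger; now/start/end are datetime.time values."""
    if start <= end:
        in_window = start <= now < end
    else:  # timer window crosses midnight
        in_window = now >= start or now < end
    return in_window and _OPS[operator_sym](metric_value, threshold)
```

With the configuration above, 501 monitored samples at 12:00 would trigger training, while the same count at 21:00 would not.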
Contributions are very welcome!
Sedna is an open source project and in the spirit of openness and freedom, we welcome new contributors to join us.
You can get in touch with the community through the following channels: