A tutorial based on the ATCII lifelong learning job

Signed-off-by: qxygxt <xingyu.q@outlook.com> (tags/v0.6.0)
File: examples/lifelong_learning/atcii/lifelong-atcii-tutorial.md

This tutorial targets the [lifelong learning job in the thermal comfort prediction scenario](https://github.com/kubeedge/sedna/blob/main/examples/lifelong_learning/atcii/README.md). It covers how to run the default example with customized configurations, as well as how to develop and integrate user-defined modules.
# 1 Configure Default Example
With Custom Resource Definitions (CRDs) of Kubernetes, developers can configure the default lifelong learning process as follows.
## 1.1 Install Sedna
Follow the [Sedna installation document](https://sedna.readthedocs.io/en/v0.5.0/setup/install.html) to install Sedna.
## 1.2 Prepare Dataset
In the default example, the [ASHRAE Global Thermal Comfort Database II (ATCII)](https://datadryad.org/stash/dataset/doi:10.6078/D1F671) is used to initiate the lifelong learning job.

We provide a well-processed [dataset](https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/atcii-classifier/dataset.tar.gz), including training (trainData.csv), evaluation (testData.csv), and incremental (trainData2.csv) datasets.

```shell
cd /data
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/atcii-classifier/dataset.tar.gz
tar -zxvf dataset.tar.gz
```
## 1.3 Create Dataset
After preparing the dataset and index file, users can configure them for training and evaluation as in the following example.

Data will be automatically downloaded from the location indicated by the index file to the corresponding pods.

| Property | Required | Description |
|----------|----------|-------------|
|name|yes|Dataset name defined in metadata|
|url|yes|URL of the dataset index file, which is generally stored on the data node|
|format|yes|Format of dataset index file|
|nodeName|yes|Name of data node that stores data and dataset index file|

```shell
DATA_NODE="cloud-node"
```
```shell
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: lifelong-atcii-dataset
spec:
  url: "/data/trainData.csv"
  format: "csv"
  nodeName: $DATA_NODE
EOF
```

## 1.4 Start Lifelong Learning Job
To run lifelong learning jobs, users need to configure their own lifelong learning CRDs in training, evaluation, and inference phases. The configuration process for these three phases is similar.

| Property | Required | Description |
|----------|----------|-------------|
|nodeName|yes|Name of the node where worker runs|
|dnsPolicy|yes|DNS policy set at pod level|
|imagePullPolicy|yes|Image pulling policy when local image does not exist|
|args|yes|Arguments for running the image. In this example, it is the startup file of each stage|
|env|no|Environment variables passed to each stage|
|trigger|yes|Configuration of when training begins|
|resources|yes|Resource limits and requests for CPU and memory|
|volumeMounts|no|Paths in the worker to which volumes are mounted|
|volumes|no|Directories on the node to which file systems in the worker are mounted|

```shell
TRAIN_NODE="cloud-node"
EVAL_NODE="cloud-node"
INFER_NODE="edge-node"
CLOUD_IMAGE="kubeedge/sedna-example-lifelong-learning-atcii-classifier:v0.5.0"
```
```shell
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: LifelongLearningJob
metadata:
  name: atcii-classifier-demo
spec:
  dataset:
    name: "lifelong-atcii-dataset"
    trainProb: 0.8
  trainSpec:
    template:
      spec:
        nodeName: $TRAIN_NODE
        dnsPolicy: ClusterFirstWithHostNet
        containers:
          - image: $CLOUD_IMAGE
            name: train-worker
            imagePullPolicy: IfNotPresent
            args: ["train.py"]  # training script
            env:  # hyperparameters required for training
              - name: "early_stopping_rounds"
                value: "100"
              - name: "metric_name"
                value: "mlogloss"
    trigger:
      checkPeriodSeconds: 60
      timer:
        start: 02:00
        end: 24:00
      condition:
        operator: ">"
        threshold: 500
        metric: num_of_samples
  evalSpec:
    template:
      spec:
        nodeName: $EVAL_NODE
        dnsPolicy: ClusterFirstWithHostNet
        containers:
          - image: $CLOUD_IMAGE
            name: eval-worker
            imagePullPolicy: IfNotPresent
            args: ["eval.py"]
            env:
              - name: "metrics"
                value: "precision_score"
              - name: "metric_param"
                value: "{'average': 'micro'}"
              - name: "model_threshold"  # threshold for filtering deploy models
                value: "0.5"
  deploySpec:
    template:
      spec:
        nodeName: $INFER_NODE
        dnsPolicy: ClusterFirstWithHostNet
        containers:
          - image: $CLOUD_IMAGE
            name: infer-worker
            imagePullPolicy: IfNotPresent
            args: ["inference.py"]
            env:
              - name: "UT_SAVED_URL"  # unseen task samples save path
                value: "/ut_saved_url"
              - name: "infer_dataset_url"  # simulation of the inference samples
                value: "/data/testData.csv"
            volumeMounts:
              - name: utdir
                mountPath: /ut_saved_url
              - name: inferdata
                mountPath: /data/
            resources:  # user-defined resources
              limits:
                memory: 2Gi
        volumes:  # user-defined volumes
          - name: utdir
            hostPath:
              path: /lifelong/unseen_task/
              type: DirectoryOrCreate
          - name: inferdata
            hostPath:
              path: /data/
              type: DirectoryOrCreate
  outputDir: "/output"
EOF
```
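Conceptually, `trainProb: 0.8` in the `dataset` section means roughly 80% of the indexed samples are used for training and the rest for evaluation. The following is only a standalone sketch of what such a split expresses, not Sedna's internal implementation:

```python
import random


def split_dataset(samples, train_prob=0.8, seed=27):
    """Shuffle samples and split them into train/eval subsets.

    Illustration only: Sedna performs the split internally when
    the lifelong learning job runs.
    """
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_prob)
    return shuffled[:cut], shuffled[cut:]


samples = [f"sample-{i}" for i in range(100)]
train, evaluation = split_dataset(samples, train_prob=0.8)
print(len(train), len(evaluation))  # 80 20
```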
## 1.5 Check Lifelong Learning Job
**(1). Query lifelong learning service status**

```shell
kubectl get lifelonglearningjob atcii-classifier-demo
```

**(2). View pods related to lifelong learning job**

```shell
kubectl get pod
```

**(3). Process unseen task samples**

In the real world, we need to label the hard examples of unseen tasks, which are stored in `UT_SAVED_URL`, with annotation tools, and then add those examples to the `Dataset`'s url. In this way, models can be updated based on the data generated at the edge.

**(4). View result files**

Artifacts, including multi-task learning models, partitioned sample sets, etc., can be found in `outputDir`, and the inference results are stored in the `Dataset`'s url.

# 2 Develop and Integrate Customized Modules

## 2.1 Before Development

Before you start development, you should prepare the [development environment](https://github.com/kubeedge/sedna/blob/main/docs/contributing/prepare-environment.md) and learn about the [interface design of Sedna](https://sedna.readthedocs.io/en/latest/autoapi/lib/sedna/index.html).

## 2.2 Develop Sedna AI Module

The Sedna framework components are decoupled and the registration mechanism is used to combine functional components to facilitate function and algorithm expansion. For details about the Sedna architecture and main mechanisms, see [Lib README](https://github.com/kubeedge/sedna/blob/51219027a0ec915bf3afb266dc5f9a7fb3880074/lib/sedna/README.md).

The following content explains how to develop customized AI modules of a Sedna project, including the **dataset**, **base model**, **algorithm**, etc.

### 2.2.1 Import Service Datasets

During Sedna application development, the first problem encountered is how to import service datasets into Sedna. Sedna provides interfaces and public methods related to data conversion and sampling in the [Dataset class](https://github.com/kubeedge/sedna/blob/c763c1a90e74b4ff1ab0afa06fb976fbb5efa512/lib/sedna/datasources/__init__.py).

All dataset classes of Sedna inherit from the base class `sedna.datasources.BaseDataSource`. This base class defines the interfaces required by a dataset, provides attributes such as `data_parse_func`, `save`, and `concat`, and provides default implementations. Derived classes can override these defaults as required.

We take a txt-format file containing a set of images as an example.

**(1). Inherit from BaseDataSource**

```python
from abc import ABC

import numpy as np

from sedna.common.class_factory import ClassFactory, ClassType
from sedna.common.file_ops import FileOps


class BaseDataSource:
    """
    An abstract class representing a :class:`BaseDataSource`.

    All datasets that represent a map from keys to data samples should
    subclass it. All subclasses should overwrite `parse`, supporting getting
    train/eval/infer data by a function. Subclasses could also optionally
    overwrite `__len__`, which is expected to return the size of the dataset.
    Overwrite `x` for the feature-embedding, `y` for the target label.

    Parameters
    ----------
    data_type : str
        defines whether the datasource is train/eval/test
    func : function
        function used to parse an iter object batch by batch
    """

    def __init__(self, data_type="train", func=None):
        self.data_type = data_type  # sample type: train/eval/test
        self.process_func = None
        if callable(func):
            self.process_func = func
        elif func:
            self.process_func = ClassFactory.get_cls(
                ClassType.CALLBACK, func)()
        self.x = None  # sample feature
        self.y = None  # sample label
        self.meta_attr = None  # special in lifelong learning

    def num_examples(self) -> int:
        return len(self.x)

    def __len__(self):
        return self.num_examples()

    def parse(self, *args, **kwargs):
        raise NotImplementedError

    @property
    def is_test_data(self):
        return self.data_type == "test"

    def save(self, output=""):
        return FileOps.dump(self, output)


class TxtDataParse(BaseDataSource, ABC):
    """
    parser for txt files which contain an image list
    """

    def __init__(self, data_type, func=None):
        super(TxtDataParse, self).__init__(data_type=data_type, func=func)

    def parse(self, *args, **kwargs):
        pass
```

**(2). Define dataset parse function**

```python
def parse(self, *args, **kwargs):
    x_data = []
    y_data = []
    use_raw = kwargs.get("use_raw")
    for f in args:
        with open(f) as fin:
            if self.process_func:
                res = list(map(self.process_func, [
                    line.strip() for line in fin.readlines()]))
            else:
                res = [line.strip().split() for line in fin.readlines()]
        for tup in res:
            if not len(tup):
                continue
            if use_raw:
                x_data.append(tup)
            else:
                x_data.append(tup[0])
                if not self.is_test_data:
                    if len(tup) > 1:
                        y_data.append(tup[1])
                    else:
                        y_data.append(0)
    self.x = np.array(x_data)
    self.y = np.array(y_data)
```
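To make the behavior concrete, here is a standalone rehearsal of the same parsing logic on a tiny txt index file, assuming the common `<image-path> <label>` line format. It does not depend on Sedna and is a simplified stand-in, not `TxtDataParse` itself:

```python
import os
import tempfile


def parse_txt_index(path, is_test_data=False):
    # Simplified stand-in for TxtDataParse.parse: each line is
    # "<image-path> <label>", separated by whitespace.
    x_data, y_data = [], []
    with open(path) as fin:
        for line in fin:
            tup = line.strip().split()
            if not tup:
                continue
            x_data.append(tup[0])
            if not is_test_data:
                y_data.append(tup[1] if len(tup) > 1 else 0)
    return x_data, y_data


with tempfile.TemporaryDirectory() as d:
    index = os.path.join(d, "train.txt")
    with open(index, "w") as f:
        f.write("images/0001.png 0\nimages/0002.png 1\n")
    x, y = parse_txt_index(index)
    print(x, y)  # two image paths and their labels
```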
**(3). Commission**

The preceding implementation can be directly used in the PipeStep in Sedna or independently invoked. The code for independently invoking is as follows:

```python
import os
import unittest


def _load_txt_dataset(dataset_url):
    # use the original dataset url,
    # see https://github.com/kubeedge/sedna/issues/35
    return os.path.abspath(dataset_url)


class TestDataset(unittest.TestCase):

    def test_txtdata(self):
        train_data = TxtDataParse(data_type="train", func=_load_txt_dataset)
        train_data.parse(train_dataset_url, use_raw=True)
        self.assertEqual(len(train_data), 1)


if __name__ == "__main__":
    unittest.main()
```

### 2.2.2 Modify Base Model

Estimator is a high-level API that greatly simplifies machine learning programming. Estimators encapsulate training, evaluation, prediction, and exporting for your model.

**(1). Define an Estimator**

In lifelong learning ATCII case, Estimator is defined in [interface.py](https://github.com/kubeedge/sedna/blob/c763c1a90e74b4ff1ab0afa06fb976fbb5efa512/examples/lifelong_learning/atcii/interface.py), and users can replace the existing XGBoost model with the model that best suits their purpose.

```python
# XGBOOST
import os

import xgboost

os.environ['BACKEND_TYPE'] = 'SKLEARN'

XGBEstimator = xgboost.XGBClassifier(
    learning_rate=0.1,
    n_estimators=600,
    max_depth=2,
    min_child_weight=1,
    gamma=0,
    subsample=0.8,
    colsample_bytree=0.8,
    objective="multi:softmax",
    num_class=3,
    nthread=4,
    seed=27
)
```

```python
# Customize
class Estimator:

    def __init__(self, **kwargs):
        ...

    def load(self, model_url=""):
        ...

    def save(self, model_path=None):
        ...

    def predict(self, data, **kwargs):
        ...

    def evaluate(self, valid_data, **kwargs):
        ...

    def train(self, train_data, valid_data=None, **kwargs):
        ...
```

**(2). Initialize a lifelong learning job**

```python
ll_job = LifelongLearning(
    estimator=Estimator,
    task_definition=task_definition,
)
```

Note that `Estimator` is the base model of your lifelong learning job.

### 2.2.3 Develop Customized Algorithms

Users may need to develop new algorithms based on the basic classes provided by Sedna, such as `unseen task detection` in the lifelong learning example.

Sedna provides a module called `class_factory.py` in the `common` package, in which only a few lines of changes are required to integrate existing algorithms into Sedna.

The following takes a hard example mining (HEM) algorithm as an example to explain how to add a new algorithm to Sedna's hard example mining algorithm library.

**(1). Start from the `class_factory.py`**

First, let's start from the `class_factory.py`. Two classes are defined in `class_factory.py`, namely `ClassType` and `ClassFactory`.

`ClassFactory` can register the modules you want to reuse through decorators. For the new `ClassType.HEM` algorithm, the code is as follows:

```python
@ClassFactory.register(ClassType.HEM, alias="Threshold")
class ThresholdFilter(BaseFilter, abc.ABC):
    def __init__(self, threshold=0.5, **kwargs):
        self.threshold = float(threshold)

    def __call__(self, infer_result=None):
        # if invalid input, return False
        if not (infer_result
                and all(map(lambda x: len(x) > 4, infer_result))):
            return False

        image_score = 0

        for bbox in infer_result:
            image_score += bbox[4]

        average_score = image_score / (len(infer_result) or 1)
        return average_score < self.threshold
```
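Stripped of the Sedna base classes, the filtering logic can be exercised on its own. Assuming each `bbox` is `[x1, y1, x2, y2, score]` (the confidence sits at index 4), a quick standalone check:

```python
class ThresholdFilter:
    """Standalone copy of the filtering logic (no registration,
    no BaseFilter) so it can be tried outside Sedna."""

    def __init__(self, threshold=0.5, **kwargs):
        self.threshold = float(threshold)

    def __call__(self, infer_result=None):
        # invalid input (missing scores) is never a hard example
        if not (infer_result
                and all(len(x) > 4 for x in infer_result)):
            return False
        average_score = sum(b[4] for b in infer_result) / len(infer_result)
        return average_score < self.threshold


hem = ThresholdFilter(threshold=0.5)
print(hem([[0, 0, 10, 10, 0.3], [0, 0, 5, 5, 0.2]]))  # True: low confidence
print(hem([[0, 0, 10, 10, 0.9]]))                     # False: confident
```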

In this step, you have customized a **hard_example_mining algorithm** named `Threshold`, and the line `ClassFactory.register(ClassType.HEM)` completes the registration.

**(2). Configure CRD yaml**

After registration, you only need to change the name of the HEM algorithm and its parameters in the yaml file; the corresponding class will then be automatically called according to that name.

```yaml
deploySpec:
  hardExampleMining:
    name: "Threshold"
    parameters:
      - key: "threshold"
        value: "0.9"
```
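The name-to-class lookup behind this yaml can be pictured with a small registry. This is a toy sketch of the registration mechanism, not Sedna's actual `ClassFactory`:

```python
class ClassType:
    HEM = "hard_example_mining"


class ClassFactory:
    """Toy registry: the decorator stores classes by (type, alias);
    get_cls retrieves them by the name used in the CRD yaml."""
    _registry = {}

    @classmethod
    def register(cls, class_type, alias=None):
        def wrapper(klass):
            cls._registry[(class_type, alias or klass.__name__)] = klass
            return klass
        return wrapper

    @classmethod
    def get_cls(cls, class_type, name):
        return cls._registry[(class_type, name)]


@ClassFactory.register(ClassType.HEM, alias="Threshold")
class ThresholdFilter:
    def __init__(self, threshold=0.5, **kwargs):
        self.threshold = float(threshold)


# conceptually, what happens with the yaml values above:
hem_cls = ClassFactory.get_cls(ClassType.HEM, "Threshold")
hem = hem_cls(**{"threshold": "0.9"})
print(type(hem).__name__, hem.threshold)  # ThresholdFilter 0.9
```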

## 2.3 Run Customized Example

**(1). Build worker images**

First, you need to modify [lifelong-learning-atcii-classifier.Dockerfile](https://github.com/kubeedge/sedna/blob/c763c1a90e74b4ff1ab0afa06fb976fbb5efa512/examples/lifelong-learning-atcii-classifier.Dockerfile) based on your development.

Then generate images with the script [build_image.sh](https://github.com/kubeedge/sedna/blob/main/examples/build_image.sh).

**(2). Start customized lifelong job**

This process is similar to that described in section `1.4`, but remember to modify the dataset (explained in `1.3`) and `CLOUD_IMAGE` to match your development.

## 2.4 Further Development

In addition to developing on the lifelong learning case, users can also [develop the control plane](https://github.com/kubeedge/sedna/blob/main/docs/contributing/control-plane/development.md) of the Sedna project, as well as [add a new synergy feature](https://github.com/kubeedge/sedna/blob/51219027a0ec915bf3afb266dc5f9a7fb3880074/docs/contributing/control-plane/add-a-new-synergy-feature.md).
