|
- * [Dataset and Model](#dataset-and-model)
-   * [Motivation](#motivation)
-     * [Goals](#goals)
-     * [Non\-goals](#non-goals)
-   * [Proposal](#proposal)
-     * [Use Cases](#use-cases)
-   * [Design Details](#design-details)
-     * [CRD API Group and Version](#crd-api-group-and-version)
-     * [CRDs](#crds)
-     * [CRD type definition](#crd-type-definition)
-     * [CRD samples](#crd-samples)
-   * [Controller Design](#controller-design)
-
- # Dataset and Model
-
- ## Motivation
-
- Currently, the Edge AI features depend on the `dataset` and `model` objects.
-
- This proposal defines `dataset` and `model` as first-class Kubernetes resources.
-
- ### Goals
-
- * Define the metadata of `dataset` and `model` objects.
- * Make these objects usable by the Edge AI features.
-
- ### Non-goals
- * The actual format of the AI `dataset`, such as `imagenet`, `coco` or `tf-record`.
- * The actual format of the AI `model`, such as TensorFlow's `ckpt` or `saved_model`.
- * The actual operations on the AI `dataset`, such as `shuffle` or `crop`.
- * The actual operations on the AI `model`, such as `train` or `inference`.
-
-
- ## Proposal
- We propose using Kubernetes Custom Resource Definitions (CRDs) to describe
- the dataset/model specification/status and a controller to synchronize these updates between edge and cloud.
-
- 
-
- ### Use Cases
-
- * Users can create a dataset resource by providing the dataset `url`, the `format`, and the `nodeName` of the node that owns the dataset.
- * Users can create a model resource by providing the model `url` and `format`.
- * Users can view the information of a dataset/model.
- * Users can delete a dataset/model.
-
-
- ## Design Details
-
- ### CRD API Group and Version
- The `Dataset` and `Model` CRDs will be namespace-scoped.
- The tables below summarize the group, kind and API version details for the CRDs.
-
- * Dataset
-
- | Field | Description |
- |-----------------------|-------------------------|
- |Group | neptune.io |
- |APIVersion | v1alpha1 |
- |Kind | Dataset |
-
- * Model
-
- | Field | Description |
- |-----------------------|-------------------------|
- |Group | neptune.io |
- |APIVersion | v1alpha1 |
- |Kind | Model |
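The values in the tables above combine into the `apiVersion` string used in resource manifests and into the CRD object names, which Kubernetes requires to be `<plural>.<group>`. The following is a minimal sketch of that composition; the constant and function names are hypothetical and not taken from the project source.

```go
package main

import "fmt"

// Values taken from the group/version tables; the identifiers are illustrative.
const (
	group   = "neptune.io"
	version = "v1alpha1"
)

// apiVersion returns the value of the apiVersion field in a Dataset or Model manifest.
func apiVersion() string {
	return group + "/" + version
}

// crdName returns the name of the CRD object for a given plural resource name.
func crdName(plural string) string {
	return plural + "." + group
}

func main() {
	fmt.Println(apiVersion())
	fmt.Println(crdName("datasets"))
	fmt.Println(crdName("models"))
}
```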
-
- ### CRDs
-
- #### `Dataset` CRD
-
- [crd source](/build/crds/neptune/dataset_v1alpha1.yaml)
-
- ```yaml
- apiVersion: apiextensions.k8s.io/v1
- kind: CustomResourceDefinition
- metadata:
-   name: datasets.neptune.io
- spec:
-   group: neptune.io
-   names:
-     kind: Dataset
-     plural: datasets
-   scope: Namespaced
-   versions:
-     - name: v1alpha1
-       subresources:
-         # status enables the status subresource.
-         status: {}
-       served: true
-       storage: true
-       schema:
-         openAPIV3Schema:
-           type: object
-           properties:
-             spec:
-               type: object
-               required:
-                 - url
-                 - format
-               properties:
-                 url:
-                   type: string
-                 format:
-                   type: string
-                 nodeName:
-                   type: string
-             status:
-               type: object
-               properties:
-                 numberOfSamples:
-                   type: integer
-                 updateTime:
-                   type: string
-                   format: date-time
-       additionalPrinterColumns:
-         - name: NumberOfSamples
-           type: integer
-           description: The number of samples in the dataset
-           jsonPath: ".status.numberOfSamples"
-         - name: Node
-           type: string
-           description: The node name of the dataset
-           jsonPath: ".spec.nodeName"
-         - name: spec
-           type: string
-           description: The spec of the dataset
-           jsonPath: ".spec"
- ```
- 1. `format` of dataset
-
- This field determines how the number of samples in the dataset is counted and how the dataset is split.
-
- Currently, the following formats are supported:
-
- - txt: each nonempty line is one sample
-
- #### `Model` CRD
-
- [crd source](/build/crds/neptune/model_v1alpha1.yaml)
- ```yaml
- apiVersion: apiextensions.k8s.io/v1
- kind: CustomResourceDefinition
- metadata:
-   name: models.neptune.io
- spec:
-   group: neptune.io
-   names:
-     kind: Model
-     plural: models
-   scope: Namespaced
-   versions:
-     - name: v1alpha1
-       subresources:
-         # status enables the status subresource.
-         status: {}
-       served: true
-       storage: true
-       schema:
-         openAPIV3Schema:
-           type: object
-           properties:
-             spec:
-               type: object
-               required:
-                 - url
-                 - format
-               properties:
-                 url:
-                   type: string
-                 format:
-                   type: string
-             status:
-               type: object
-               properties:
-                 updateTime:
-                   type: string
-                   format: date-time
-                 metrics:
-                   type: array
-                   items:
-                     type: object
-                     properties:
-                       key:
-                         type: string
-                       value:
-                         type: string
-       additionalPrinterColumns:
-         - name: updateAGE
-           type: date
-           description: The update age
-           jsonPath: ".status.updateTime"
-         - name: metrics
-           type: string
-           description: The metrics
-           jsonPath: ".status.metrics"
- ```
-
- ### CRD type definition
- - `Dataset`
-
- [go source](cloud/pkg/apis/neptune/v1alpha1/dataset_types.go)
-
- ```go
- package v1alpha1
-
- import (
- 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
- )
-
- // +genclient
- // +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
-
- // Dataset describes the data that a dataset resource should have
- type Dataset struct {
- 	metav1.TypeMeta `json:",inline"`
-
- 	metav1.ObjectMeta `json:"metadata,omitempty"`
-
- 	Spec   DatasetSpec   `json:"spec"`
- 	Status DatasetStatus `json:"status"`
- }
-
- // DatasetSpec is a description of a dataset
- type DatasetSpec struct {
- 	URL      string `json:"url"`
- 	Format   string `json:"format"`
- 	NodeName string `json:"nodeName"`
- }
-
- // DatasetStatus represents information about the status of a dataset
- // including the time a dataset updated, and number of samples in a dataset
- type DatasetStatus struct {
- 	UpdateTime      *metav1.Time `json:"updateTime,omitempty" protobuf:"bytes,1,opt,name=updateTime"`
- 	NumberOfSamples int          `json:"numberOfSamples"`
- }
-
- // +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
-
- // DatasetList is a list of Datasets
- type DatasetList struct {
- 	metav1.TypeMeta `json:",inline"`
- 	metav1.ListMeta `json:"metadata"`
-
- 	Items []Dataset `json:"items"`
- }
- ```
-
- - `Model`
-
- [go source](cloud/pkg/apis/neptune/v1alpha1/model_types.go)
- ```go
- package v1alpha1
-
- import (
- 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
- )
-
- // +genclient
- // +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
-
- // Model describes the data that a model resource should have
- type Model struct {
- 	metav1.TypeMeta `json:",inline"`
-
- 	metav1.ObjectMeta `json:"metadata,omitempty"`
-
- 	Spec   ModelSpec   `json:"spec"`
- 	Status ModelStatus `json:"status"`
- }
-
- // ModelSpec is a description of a model
- type ModelSpec struct {
- 	URL    string `json:"url"`
- 	Format string `json:"format"`
- }
-
- // ModelStatus represents information about the status of a model
- // including the time a model updated, and metrics in a model
- type ModelStatus struct {
- 	UpdateTime *metav1.Time `json:"updateTime,omitempty" protobuf:"bytes,1,opt,name=updateTime"`
- 	// Metric is not defined in this snippet; it is assumed to be declared
- 	// elsewhere in the package as the key/value pair from the CRD schema.
- 	Metrics []Metric `json:"metrics,omitempty" protobuf:"bytes,2,rep,name=metrics"`
- }
-
- // +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
-
- // ModelList is a list of Models
- type ModelList struct {
- 	metav1.TypeMeta `json:",inline"`
- 	metav1.ListMeta `json:"metadata"`
-
- 	Items []Model `json:"items"`
- }
- ```
-
- ### CRD samples
- - `Dataset`
-
- ```yaml
- apiVersion: neptune.io/v1alpha1
- kind: Dataset
- metadata:
- name: "dataset-examp"
- spec:
- url: "/code/data"
- format: "txt"
- nodeName: "edge0"
- ```
-
- - `Model`
-
- ```yaml
- apiVersion: neptune.io/v1alpha1
- kind: Model
- metadata:
- name: model-examp
- spec:
- url: "/model/frozen.pb"
- format: pb
- ```
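Both samples set the fields that the CRD schemas mark as required (`url` and `format`). As an illustration, a client could check these fields before submitting a resource. The sketch below uses a local copy of the spec fields and a hypothetical helper `validateDatasetSpec`. It also rejects empty strings, which is stricter than the schema, since `required` only enforces that the keys are present.

```go
package main

import (
	"errors"
	"fmt"
)

// DatasetSpec is a local copy of the Dataset spec fields, for illustration only.
type DatasetSpec struct {
	URL      string
	Format   string
	NodeName string
}

// validateDatasetSpec is a hypothetical client-side check for the fields
// listed under `required` in the Dataset CRD schema.
func validateDatasetSpec(s DatasetSpec) error {
	if s.URL == "" {
		return errors.New("spec.url is required")
	}
	if s.Format == "" {
		return errors.New("spec.format is required")
	}
	// nodeName is not listed as required in the schema, so it is not checked.
	return nil
}

func main() {
	sample := DatasetSpec{URL: "/code/data", Format: "txt", NodeName: "edge0"}
	fmt.Println("sample:", validateDatasetSpec(sample))

	incomplete := DatasetSpec{URL: "/code/data"}
	fmt.Println("incomplete:", validateDatasetSpec(incomplete))
}
```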
-
-
- ## Controller Design
- In the current design, there is a downstream/upstream controller for `dataset`, but none for `model`.<br/>
-
- The dataset controller synchronizes the dataset between the cloud and the edge.
- - downstream: synchronizes the dataset info from the cloud to the edge node.
- - upstream: synchronizes the dataset status, such as the number of samples in the dataset, from the edge node to the cloud.
- <br/>
-
- Here is the flow of the dataset creation:
-
- 
-
- For the model:
- 1. The model's info is synced together with the resources that use the model, such as a federated task.
- 1. The model's status is updated when the corresponding training/inference work has completed.
-
|