* [Prerequisites](#prerequisites)
* [Download project source](#download-source)
* [Create CRDs](#create-crds)
* [Deploy GM](#deploy-gm)
  * [Prepare GM config](#prepare-gm-config)
  * [Build worker base images](#build-worker-base-images)
  * [Run GM as k8s deployment(recommended)](#run-gm-as-k8s-deploymentrecommended)
  * [Run GM as a single process(alternative)](#run-gm-as-a-single-processalternative)
  * [Run GM as docker container(alternative)](#run-gm-as-docker-containeralternative)
* [Deploy LC](#deploy-lc)

## Deploy Neptune

### Prerequisites

- [GIT][git_tool]
- [GO][go_tool] version v1.15+.
- [Kubernetes][kubernetes] 1.16+.
- [KubeEdge][kubeedge] version v1.5+.

GM must be deployed to a node that meets these requirements:
1. It has a public IP address that the edge nodes can reach.
1. It can access the k8s master.

The simplest choice is the node where `cloudcore` of `kubeedge` is deployed.

Run the shell commands below on this node, in **one terminal session**, so that the shell variables set in earlier steps are still available in later ones.
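
Before starting, it can help to confirm that the command-line tools used in this guide are on the `PATH`. The helper below is a minimal sketch; `check_tools` is a hypothetical name, and it only checks that each command exists, not that its version meets the minimums listed above.

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
# Returns 0 when all of them are present, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For this guide a reasonable call would be `check_tools git go kubectl docker make`, since those commands appear in the steps below.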

### Download source

```shell
git clone http://github.com/edgeai-neptune/neptune.git
cd neptune
git checkout master
```

### Create CRDs

```shell
# create these crds including dataset, model, joint-inference
kubectl apply -f build/crds/neptune/
```

### Deploy GM

#### Prepare GM config

Make a copy of `build/gm/gm-config.yaml`:

```yaml
kubeConfig: ""
master: ""
namespace: ""

imageHub:
  "tensorflow:1.15": "docker.io/neptune/tensorflow-base-image-to-filled:1.15"

websocket:
  address: 0.0.0.0
  port: 9000

localController:
  server: http://localhost:9100
```

1. `kubeConfig`: the kubeconfig file used to connect to k8s, default `""`.
1. `master`: the k8s master address, default `""`.
1. `namespace`: the namespace GM watches; `""` means that GM watches all namespaces, default `""`.
1. `imageHub`: the base image mapping for model training/evaluation/inference, whose key is `frameworkType:frameworkVersion`, for example `"tensorflow:1.15"`.
1. `websocket`: because of a current limitation of kubeedge (1.5), GM needs to open a websocket channel for communication between GM and the LCs.
1. `localController`:
   - `server`: the LC address injected into the worker so that it can connect to LC.

#### Build worker base images

For example, to build the worker base image for tensorflow 1.15:

```shell
# this example uses the github container registry.
# replace it with the container registry of your choice.
IMAGE_REPO=ghcr.io/edgeai-neptune/neptune

# build tensorflow image
WORKER_TF1_IMAGE=$IMAGE_REPO/worker-tensorflow:1.15

docker build -f build/worker/base_images/tensorflow/tensorflow-1.15.Dockerfile -t $WORKER_TF1_IMAGE .

# push worker image to registry, login to registry first if needed
docker push $WORKER_TF1_IMAGE
```

There are several ways to run GM; choose one of the methods below:

#### Run GM as k8s deployment(**recommended**):

This method needs no kubeconfig, as described in [accessing the API from a Pod](https://kubernetes.io/docs/tasks/access-application-cluster/access-cluster/#accessing-the-api-from-a-pod).

1\. Create the cluster role so that GM can access/write the CRDs:

```shell
# create the cluster role
kubectl create -f build/gm/rbac/
```

2\. Prepare the config:

```shell
# edit it with another number if you wish
GM_PORT=9000
LC_PORT=9100

# this example uses the github container registry.
# replace it with the container registry of your choice.
IMAGE_REPO=ghcr.io/edgeai-neptune/neptune
IMAGE_TAG=v1alpha1

LC_SERVER="http://localhost:$LC_PORT"
```

```shell
# copy and edit CONFIG_FILE.
CONFIG_FILE=gm-config.yaml
cp build/gm/gm-config.yaml $CONFIG_FILE

# prepare the config with empty kubeconfig and empty master url, meaning k8s is accessed by rest.InClusterConfig().
# this uses the sed command; alternatively you can edit the config file manually.
sed -i 's@kubeConfig:.*@kubeConfig: ""@' $CONFIG_FILE
sed -i 's@master:.*@master: ""@' $CONFIG_FILE

sed -i "s@port:.*@port: $GM_PORT@" $CONFIG_FILE

# setting tensorflow1.15 base image
sed -i 's@\("tensorflow:1.15":\).*@\1 '"$WORKER_TF1_IMAGE@" $CONFIG_FILE

# setting lc server
sed -i "s@http://localhost:9100@$LC_SERVER@" $CONFIG_FILE
```
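
Because the `sed` edits above silently do nothing when a pattern does not match, and write an empty image name if `WORKER_TF1_IMAGE` was not set in this terminal session, it may be worth checking the result before building on it. The following is a minimal sketch; `check_gm_config` is a hypothetical helper, not part of the project.

```shell
# Hypothetical helper: confirm that a GM config file contains the expected
# websocket port, LC server address, and a non-empty tensorflow:1.15 image.
check_gm_config() {
    file=$1 port=$2 lc_server=$3
    grep -q "port: $port\$" "$file" ||
        { echo "websocket port is not $port" >&2; return 1; }
    grep -q "server: $lc_server\$" "$file" ||
        { echo "LC server is not $lc_server" >&2; return 1; }
    grep -Eq '^[[:space:]]*"tensorflow:1.15":[[:space:]]*[^[:space:]]' "$file" ||
        { echo "tensorflow:1.15 base image is empty or missing" >&2; return 1; }
}
```

With the variables from this step, the call would be `check_gm_config "$CONFIG_FILE" "$GM_PORT" "$LC_SERVER"`; it prints nothing and returns 0 when all three settings are in place.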

3\. Build the GM image:

```shell
# build image from source OR use a previously built gm image.
# replace it with the base repo of your choice.
GM_IMAGE=$IMAGE_REPO/gm:$IMAGE_TAG

make gmimage IMAGE_REPO=$IMAGE_REPO IMAGE_TAG=$IMAGE_TAG

# push image to registry, login to registry first if needed
docker push $GM_IMAGE
```

4\. Create gm configmap:

```shell
# create configmap from $CONFIG_FILE
CONFIG_NAME=gm-config   # customize this configmap name
kubectl create -n neptune configmap $CONFIG_NAME --from-file=$CONFIG_FILE
```
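
The `kubectl create -n neptune configmap` command above fails if the `neptune` namespace does not exist. The RBAC manifests applied in step 1 may already create it; in case they do not, a guard such as the following sketch can be run first. `ensure_namespace` is a hypothetical helper name.

```shell
# Hypothetical helper: create the namespace only when it is not there yet.
ensure_namespace() {
    ns=$1
    kubectl get namespace "$ns" >/dev/null 2>&1 || kubectl create namespace "$ns"
}
```

Calling `ensure_namespace neptune` is safe to repeat, because the create command only runs when the lookup fails.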

5\. Deploy GM as deployment:

```shell
# assign gm to a node that the edge nodes can access.
# here we use the node of the current terminal, i.e. the k8s master node.
# remember the IP of this node (GM_IP)
GM_NODE_NAME=$(hostname)

kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: gm
  namespace: neptune
spec:
  selector:
    app: gm
  type: NodePort
  ports:
    - protocol: TCP
      port: $GM_PORT
      targetPort: $GM_PORT
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gm
  labels:
    app: gm
  namespace: neptune
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gm
  template:
    metadata:
      labels:
        app: gm
    spec:
      nodeName: $GM_NODE_NAME
      serviceAccountName: neptune
      containers:
        - name: gm
          image: $GM_IMAGE
          command: ["neptune-gm", "--config", "/config/$CONFIG_FILE", "-v2"]
          volumeMounts:
            - name: gm-config
              mountPath: /config
          resources:
            requests:
              memory: 32Mi
              cpu: 100m
            limits:
              memory: 128Mi
      volumes:
        - name: gm-config
          configMap:
            name: $CONFIG_NAME
EOF
```

6\. Check the GM status:

```shell
kubectl get deploy -n neptune gm
```
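
To block until the deployment is actually ready and then look at the most recent GM log lines, something like the sketch below can be used. `gm_status` is a hypothetical helper; the 120-second default timeout and the 20-line tail are arbitrary choices, not project defaults.

```shell
# Hypothetical helper: wait for the gm deployment to finish rolling out,
# then print the tail of its logs. The optional first argument is the timeout.
gm_status() {
    kubectl -n neptune rollout status deployment/gm --timeout="${1:-120s}" &&
        kubectl -n neptune logs deployment/gm --tail=20
}
```

Run `gm_status` with no arguments, or `gm_status 300s` on a slow registry. If the rollout times out, the logs are not printed; `kubectl -n neptune describe pod -l app=gm` is then the usual next place to look.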

#### Run GM as a single process(alternative)

1\. Configure GM:

```shell
cp build/gm/neptune-gm.yaml gm.yaml
# make sure /root/.kube/config exists
sed -i 's@kubeConfig.*@kubeConfig: /root/.kube/config@' gm.yaml
```

2\. Compile and run GM directly:

```shell
go build cmd/neptune-gm/neptune-gm.go
./neptune-gm --config gm.yaml -v2
```

#### Run GM as docker container(alternative)

1\. Build the GM image:

```shell
GM_IMAGE=$IMAGE_REPO/gm:$IMAGE_TAG
sed -i 's@kubeConfig.*@kubeConfig: /root/.kube/config@' build/gm/neptune-gm.yaml
make gmimage IMAGE_REPO=$IMAGE_REPO IMAGE_TAG=$IMAGE_TAG
```

2\. Run GM as a container:

```shell
docker run --net host -v /root/.kube:/root/.kube $GM_IMAGE
```

### Deploy LC

Prerequisites:
1. GM is running successfully.
2. You know the bind address/port of GM.

Steps:

1\. Build LC image:

```shell
LC_IMAGE=$IMAGE_REPO/lc:$IMAGE_TAG

make lcimage IMAGE_REPO=$IMAGE_REPO IMAGE_TAG=$IMAGE_TAG

# push image to registry, login to registry first if needed
docker push $LC_IMAGE
```

2\. Deploy LC as k8s daemonset:

```shell
gm_node_port=$(kubectl -n neptune get svc gm -ojsonpath='{.spec.ports[0].nodePort}')

# fill in the IP of GM_NODE_NAME that the edge nodes can access,
# such as gm_node_ip=192.168.0.9
# gm_node_ip=<GM_NODE_NAME_IP_ADDRESS>
# here we try to get the node ip with kubectl
gm_node_ip=$(kubectl get node $GM_NODE_NAME -o jsonpath='{ .status.addresses[?(@.type=="ExternalIP")].address }')
gm_node_internal_ip=$(kubectl get node $GM_NODE_NAME -o jsonpath='{ .status.addresses[?(@.type=="InternalIP")].address }')

GM_ADDRESS=${gm_node_ip:-$gm_node_internal_ip}:$gm_node_port

kubectl create -f- <<EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    k8s-app: neptune-lc
  name: lc
  namespace: neptune
spec:
  selector:
    matchLabels:
      k8s-app: lc
  template:
    metadata:
      labels:
        k8s-app: lc
    spec:
      containers:
        - name: lc
          image: $LC_IMAGE
          env:
            - name: GM_ADDRESS
              value: $GM_ADDRESS
            - name: BIND_PORT
              value: "$LC_PORT"
            - name: NODENAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: ROOTFS_MOUNT_DIR
              # the value of ROOTFS_MOUNT_DIR must be the same as the mount path of the volume
              value: /rootfs
          resources:
            requests:
              memory: 32Mi
              cpu: 100m
            limits:
              memory: 128Mi
          volumeMounts:
            - name: localcontroller
              mountPath: /rootfs
      volumes:
        - name: localcontroller
          hostPath:
            path: /
      restartPolicy: Always
      hostNetwork: true
EOF
```
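
The `GM_ADDRESS` assignment above relies on the shell's `${var:-default}` expansion: the external IP is used when the node reports one, and the internal IP is used otherwise. The sketch below isolates that logic so it can be tried without a cluster; `pick_gm_address` is a hypothetical helper, and the port `30900` and address `203.0.113.7` are made-up sample values.

```shell
# Hypothetical helper: prefer the external IP and fall back to the internal IP
# when the external one is empty or unset, then append the node port.
pick_gm_address() {
    external_ip=$1 internal_ip=$2 node_port=$3
    echo "${external_ip:-$internal_ip}:$node_port"
}

pick_gm_address "" 192.168.0.9 30900            # prints 192.168.0.9:30900
pick_gm_address 203.0.113.7 192.168.0.9 30900   # prints 203.0.113.7:30900
```

If both lookups come back empty, the result is just `:<port>`, so it is worth running `echo $GM_ADDRESS` before creating the daemonset and setting `gm_node_ip` by hand when needed.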

3\. Check the LC status:

```shell
kubectl get ds lc -n neptune

kubectl get pod -n neptune
```

[git_tool]:https://git-scm.com/downloads
[go_tool]:https://golang.org/dl/
[kubeedge]:https://github.com/kubeedge/kubeedge
[kubernetes]:https://kubernetes.io/