# Using ReID to Track an Infected COVID-19 Carrier in Pandemic Scenarios
Estimated completion time: ~60-100 mins.
Requirements:
- K8s cluster
- Sedna
- Internet connection to download the container images
- Optional: KubeEdge
- Optional: multi-node cluster
# Introduction
This proposal introduces an edge-cloud distributed system to help identify and track potential carriers of the COVID-19 virus. By detecting proximity or contact risks, the system makes it possible to monitor the possible contamination of bystanders. The goal is to counter the spread of the virus and help against the global pandemic crisis.
The example images below show the ability of our system to re-identify a potential carrier of the virus and detect close contact proximity risk.
![image info](./1.jpg) ![image info](./2.jpg)
# System Architecture and Components
The image below shows the system architecture and its simplified workflow:
![image info](./arch.png)
## Components
**ReID Job**: it performs the re-identification (ReID) of the target person.
- Available for CPU only.
- Folder with specific implementation details: `examples/multiedgetracking/reid`.
- Component specs in `lib/sedna/core/multi_edge_tracking/components/reid.py`.
- Defined by the Dockerfile `multi-edge-tracking-reid.Dockerfile`.
**Feature Extraction Service**: it performs the extraction of the features necessary for the ReID step.
- Available for CPU and GPU.
- Folder with specific implementation details: `examples/multiedgetracking/feature_extraction`.
- Component specs in `lib/sedna/core/multi_edge_tracking/components/feature_extraction.py`.
- Defined by the Dockerfile `multi-edge-tracking-feature-extraction.Dockerfile` or `multi-edge-tracking-gpu-feature-extraction.Dockerfile`.
- It loads the model defined by the CRD in the YAML file `yaml/models/model_m3l.yaml`.
**VideoAnalytics Job**: it performs tracking of objects (pedestrians) in a video.
- Available for CPU and GPU.
- Folder with specific implementation details: `examples/multiedgetracking/detection`.
- AI model code in `examples/multiedgetracking/detection/estimator/bytetracker.py`.
- Component specs in `lib/sedna/core/multi_edge_tracking/components/detection.py`.
- Defined by the Dockerfile `multi-edge-tracking-videoanalytics.Dockerfile` or `multi-edge-tracking-gpu-videoanalytics.Dockerfile`.
- It loads the model defined by the CRD in the YAML file `yaml/models/model_detection.yaml`.
# Build Phase
Go to the `sedna/examples` directory and run `./build_image.sh -r <your-docker-private-repo> multiedgetracking` to build the Docker images. Remember to **push** the images to your own Docker repository!
Run `make crds` in `SEDNA_HOME` and then register the new CRDs in the K8s cluster with `make install crds` or:
- `kubectl create -f sedna/build/crd/sedna.io_featureextractionservices.yaml`
- `kubectl create -f sedna/build/crd/sedna.io_videoanalyticsjobs.yaml`
- `kubectl create -f sedna/build/crd/sedna.io_reidjobs.yaml`
Build the GM image with `make gmimage` and restart the Sedna GM pod (see the sketch below).
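A quick way to verify the registration and restart the GM; a minimal sketch, assuming the GM runs as a Deployment named `gm` in the `sedna` namespace (adjust to your installation):
```
# Verify that the three new CRDs are registered.
kubectl get crds | grep -E 'featureextractionservices|videoanalyticsjobs|reidjobs'

# Restart the GM so it picks up the rebuilt image and the new CRDs.
# The namespace and deployment name below are assumptions; adjust them to your setup.
kubectl -n sedna rollout restart deployment gm
```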
Additionally, this application requires you to:
1. Create an NFS server on the master node with a network-shared folder accessible by the pods through a PVC.
2. Create a PV and PVC on the K8s cluster.
3. Have a basic Kafka deployment running.
4. Have a streaming server running.
We offer two installation methods:
- [Manual Installation](#manual-installation): slower, but highly recommended to understand how the application works.
- [Automated Installation](#automated-installation): quick, but harder to debug in case of misconfigurations.
# **Manual Installation**
This procedure will guide you step-by-step through setting up your cluster and then running the example application for pedestrian ReID in a pandemic scenario. We recommend the manual installation if your cluster setup differs from the usual configuration or is customized.
We also recommend the manual setup if you are familiar with K8s concepts and want to fully understand which components are deployed.
## 1. NFS Server
Using a local NFS makes it easy to share folders between pods and the host. It also makes the use of PVs and PVCs straightforward, which are used in this example to mount volumes into the pods. However, there are other options to achieve the same result which you are free to explore.
1. To set up the NFS, run the following commands on a node of your cluster (for simplicity, we will assume that we selected the **master** node):
```
sudo apt-get update && sudo apt-get install -y nfs-kernel-server
sudo mkdir -p /data/network_shared/reid
sudo mkdir -p /data/network_shared/reid/processed
sudo mkdir -p /data/network_shared/reid/query
sudo mkdir -p /data/network_shared/reid/images
sudo chmod 1777 /data/network_shared/reid
sudo bash -c "echo '/data/network_shared/reid *(rw,sync,no_root_squash,subtree_check)' >> /etc/exports"
sudo exportfs -ra
sudo showmount -e localhost # the output of this command should list the folders exported by the NFS
```
2. If you have other nodes in your cluster, run the following commands on them:
```
sudo apt-get -y install nfs-common nfs-kernel-server
showmount -e nfs_server_node_ip # nfs_server_node_ip is the IP of the node where you ran the commands in step (1.)
sudo mkdir -p /data/network_shared/reid
sudo mount nfs_server_node_ip:/data/network_shared/reid /data/network_shared/reid
```
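Optionally, you can make the client mount persistent across reboots; a minimal sketch, using the same `nfs_server_node_ip` placeholder and paths as above:
```
# Optional: add an /etc/fstab entry so the NFS share is remounted after a reboot.
echo 'nfs_server_node_ip:/data/network_shared/reid /data/network_shared/reid nfs defaults 0 0' | sudo tee -a /etc/fstab
sudo mount -a # remount everything listed in /etc/fstab and verify there are no errors
```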
## 2. PV and PVC
1. Change the server and storage capacity fields in `yaml/pv/reid_volume.yaml` as needed.
2. Run `kubectl create -f yaml/pv/reid_volume.yaml`.
3. Change the storage request field in `yaml/pvc/reid-volume-claim.yaml` as needed.
4. Run `kubectl create -f yaml/pvc/reid-volume-claim.yaml`.
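After creating both objects, you can check that the claim binds to the volume; a minimal sketch (the object names shown by the command depend on the metadata in the two YAML files):
```
kubectl get pv,pvc
# Both objects should reach the "Bound" status within a few seconds.
# If the PVC stays "Pending", re-check the storage request against the PV
# capacity and the NFS server address/path in yaml/pv/reid_volume.yaml.
```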
## 3. Apache Kafka
1. Edit the YAML files under `yaml/kafka` so that the IP/hostname matches that of your master node. For a basic deployment, it's enough to have a single replica of Zookeeper and Kafka both running on the same node.
2. Run these commands:
```
kubectl create -f yaml/kafka/kafkabrk.yaml
kubectl create -f yaml/kafka/kafkasvc.yaml
kubectl create -f yaml/kafka/zoodeploy.yaml
kubectl create -f yaml/kafka/zooservice.yaml
```
3. Check that Zookeeper and the Kafka broker are healthy (check the logs; the broker should report that the admin topic was created successfully), as shown in the sketch after this list.
4. Note down your master node's external IP; you will need it later to update a field in two YAML files.
    - If you are running a single-node deployment, the above step is not required as the default service name should be automatically resolvable by all pods using the cluster DNS (*kafka-service*).
    - This step is also not necessary if you are not running KubeEdge.
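To check the health of the deployment, you can inspect the pods and their logs; a minimal sketch, assuming the pod names contain `kafka` and `zookeeper` (adjust the filters to the names used in your YAML files):
```
kubectl get pods | grep -E 'kafka|zookeeper'
# Inspect the broker logs; they should report a successful connection to
# Zookeeper and the creation of the admin topic.
kubectl logs $(kubectl get pods -o name | grep kafka | head -n 1)
```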
### REST APIs
This application also supports direct binding using REST APIs and edgemesh/K8s services. In case you don't want to use Kafka, you can disable it by setting `kafkaSupport: false` in the `feature-extraction.yaml` and `video-analytics-job.yaml` YAML files and just let the different components communicate using REST APIs. However, we recommend using Kafka at first, as the rest of the tutorial assumes that it's running.
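If you do decide to disable Kafka, a hypothetical one-liner could look like the following; it assumes the two files contain a literal `kafkaSupport: true` field and live under `yaml/` (adjust paths and field name to your copies):
```
# Switch both components to REST-only communication (sketch, adjust as needed).
sed -i 's/kafkaSupport: true/kafkaSupport: false/' yaml/feature-extraction.yaml yaml/video-analytics-job.yaml
```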
## 4. Streaming Server
We use the [EasyDarwin](https://github.com/EasyDarwin/EasyDarwin) streaming server; you can run it either on the master or an edge node. Just remember to note the IP of the node where it's running; you will need it later.
```
wget https://github.com/EasyDarwin/EasyDarwin/releases/download/v8.1.0/EasyDarwin-linux-8.1.0-1901141151.tar.gz -O ss.tar.gz
tar -xzvf ss.tar.gz
cd EasyDarwin-linux-8.1.0-1901141151
sudo ./easydarwin
```
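To confirm that the server started correctly, you can check for its listening sockets; a minimal sketch (the exact ports depend on the EasyDarwin configuration file):
```
# The process should appear among the listening TCP sockets (RTSP is usually on 554).
sudo ss -ltnp | grep easydarwin
```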
# Application Deployment
First, make sure to copy the AI models to the correct path on the nodes **BEFORE** starting the pods (see the sketch after this list). If you use the YAML files provided with this example:
- The node running the VideoAnalytics job should have the YoloX model in `/data/ai_models/object_detection/pedestrians/yolox.pth`.
- The node running the feature extraction service should have the required model in `/data/ai_models/m3l/m3l.pth`.
Do the following:
- Run `kubectl create -f yaml/models/model_m3l.yaml`
- Run `kubectl create -f yaml/models/model_detection.yaml`
- Put some images of the target that will be used for the ReID into the folder `/data/network_shared/reid/query`.
- Start the streaming server (if you didn't do it yet AND are planning to process an RTSP stream).
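The copy steps might look like the following; a minimal sketch, where the local file names (`yolox.pth`, `m3l.pth`, `query_*.jpg`) are placeholders for your own copies of the models and query images:
```
# On the node that will run the VideoAnalytics job:
sudo mkdir -p /data/ai_models/object_detection/pedestrians
sudo cp yolox.pth /data/ai_models/object_detection/pedestrians/yolox.pth

# On the node that will run the feature extraction service:
sudo mkdir -p /data/ai_models/m3l
sudo cp m3l.pth /data/ai_models/m3l/m3l.pth

# Query images of the target go on the NFS share (from any node that mounts it):
cp query_*.jpg /data/network_shared/reid/query/
```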
# Running the application
The provided YAML files are configured to run the feature extraction and ReID pods on the **master** node, while the VideoAnalytics job runs on an **agent** node. This is configured using the *nodeSelector* option, which you can edit in case you want to deploy the pods differently. For example, you can also simply **run everything on the master node**.
Now, let's create the feature extraction service with `kubectl create -f yaml/feature-extraction-service.yaml` and check that it's healthy, as shown in the sketch below.
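A minimal sketch for the health check, assuming the pod name contains `feature-extraction` (adjust the filter to the actual name in your deployment):
```
kubectl get pods | grep feature-extraction
# Follow the logs and make sure the model is loaded without errors.
kubectl logs -f $(kubectl get pods -o name | grep feature-extraction | head -n 1)
```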
From here on, the application workflow is divided into two parts: analysis of the video and ReID.
## Workflow: Part 1
1. Modify the env variables in `yaml/video-analytics-job.yaml`:
    - Make sure that the IP in `video_address` is the same as the streaming server address (if you are using RTSP).
    - This field can map to an RTSP stream, an HTTP resource (CDN), or a file on disk.
    - We recommend setting the FPS parameter to a small value in the range [1,5] when running on CPU.
2. Create the VideoAnalytics job: `kubectl create -f yaml/video-analytics-job.yaml`
3. Send a video to the streaming server using FFMPEG, for example: `ffmpeg -re -i filename.mp4 -vcodec libx264 -f rtsp rtsp://<RTSP_SERVER_IP>/video/0`
4. If everything was set up correctly, the pod will start processing the video and move to the `Succeeded` phase when done.
5. **NOTE**: Keep in mind that, depending on the characteristics of the input video, this step can take a considerable amount of time to complete, especially if you are running on CPU. Moreover, this job will not exit until it receives all the results generated from the feature extraction service. Check the VideoAnalytics job and its logs to monitor the progress (see the sketch after this list).
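A minimal sketch for monitoring the job, assuming the CRD exposes a `videoanalyticsjobs` resource (as the CRD file name suggests) and that the pod name contains `video-analytics`:
```
# Watch the job status until it reaches the Succeeded phase.
kubectl get videoanalyticsjobs -w
# Follow the pod logs to track the frame-processing progress.
kubectl logs -f $(kubectl get pods -o name | grep video-analytics | head -n 1)
```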
## Workflow: Part 2
1. Modify the env variables in `yaml/reid-job.yaml`:
    - Make sure that `query_image` is a **pipe-separated** list of images matching the content of the `/data/query` folder.
2. Create the ReID job: `kubectl create -f yaml/reid-job.yaml`
3. If everything was set up correctly, the pod will start the target search in the frames extracted from the video and move to the `Succeeded` phase when done.
4. Finally, you will find the results in the folder `/data/network_shared/reid/images` (see the sketch after this list).
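Similarly, you can follow the ReID job and then inspect the output folder; a minimal sketch, assuming the CRD exposes a `reidjobs` resource (as the CRD file name suggests):
```
kubectl get reidjobs -w
# Once the job has succeeded, the annotated frames are in the shared folder:
ls -lh /data/network_shared/reid/images
```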
# Cleanup
Don't forget to delete the jobs once they are completed:
- `kubectl delete -f multiedgetracking/yaml/video-analytics-job.yaml`
- `kubectl delete -f multiedgetracking/yaml/reid-job.yaml`
To also delete the feature extraction service:
- `kubectl delete -f multiedgetracking/yaml/feature-extraction.yaml`
# **Automated Installation**
The automated installation procedure runs the majority of the configuration steps for you and prepares the cluster to run the application. If something goes wrong, it prints an error message. There are three scripts available in the `tutorial` folder:
1. The `deploy.sh` script sets up the cluster and bootstraps the required components; it has to be **run at least once** before running `run.sh`. It assumes that:
    - You didn't change anything manually in the provided YAML files except for the ENV variables injected in the application pods (feature extraction, video analytics and reid).
    - You are using Kafka as the data exchange layer. If you disabled Kafka, you can't use the automated deployment script!
    - You input the **correct** values to the `deploy.sh` script before launching it:
        - External IP of the master node.
        - Path to the NFS (depending on your configuration, this can also be the default value).
        - Example: `./deploy.sh -a 10.1.1.1 -p /data/network_shared`
2. The `run.sh` script will run the application on your cluster. It assumes that:
    - The environment variables for the VideoAnalytics and ReID job are configured correctly as explained in the [Running the application](#running-the-application) section above.
3. The `cleanup.sh` script can be used to perform a complete cleanup of the resources created on the cluster, except for the NFS directory, which you have to delete manually.
All the scripts must be launched from the `tutorial` folder and require `sudo`. Also, `deploy.sh` will create a `backup` folder where it stores the original versions of the YAML files in case you want to revert the changes performed by the scripts. A typical end-to-end run is sketched below.
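A minimal sketch of such a run, assuming `run.sh` and `cleanup.sh` take no additional arguments and using example values for the master IP and NFS path:
```
cd tutorial
# Prepare the cluster (the IP and path are examples; use your own values).
sudo ./deploy.sh -a 10.1.1.1 -p /data/network_shared
# Deploy the application pods.
sudo ./run.sh
# Tear everything down once you are done (the NFS directory must be removed manually).
sudo ./cleanup.sh
```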
## What the automated installation won't do for you
1. Put the AI models in the correct directory.
2. Add the query images used by ReID to find a target.
3. Configure hyperparameters for pedestrian detection and ReID.