Deploying a People Counter Application at the Edge Using Intel OpenVINO Toolkit
Using Intel OpenVINO Toolkit to create awesome applications!
The people counter application demonstrates how to create a smart video IoT solution using Intel® hardware and software tools. The application detects people in a designated area and reports the number of people in the current frame, the average duration people spend in the frame, and the total count. This project is part of the Intel® Edge AI for IoT Developers Nanodegree program by Udacity.
Intel OpenVINO Toolkit
Open Visual Inference and Neural Network Optimization (OpenVINO) is a toolkit provided by Intel to carry out faster inference of deep learning models. It has two main parts: the Model Optimizer and the Inference Engine. The toolkit can deploy many types of computer vision and deep learning models at the Edge, relying mainly on Convolutional Neural Networks and computer vision models for carrying out predictions.
OpenVINO also supports various types of devices such as CPUs, GPUs, VPUs (including the Neural Compute Stick) and FPGAs. In this project, the CPU plugin of the OpenVINO toolkit is used to identify people in each frame of the video. The average duration a person spends in the frame is also evaluated and displayed in the output video.
Model Optimizer and Inference Engine
The two main components of the OpenVINO toolkit are the Model Optimizer and the Inference Engine. In short, the Model Optimizer takes a pre-trained model, optimizes it, and converts it into an Intermediate Representation (an .xml and a .bin file). A pre-trained model is one that has already been trained on a similar problem, so the time saved by not building a model from scratch can be spent optimizing it for better results and accuracy. The IR (Intermediate Representation) is then fed into the Inference Engine, a C++ library that provides an API to read the Intermediate Representation and execute the model on different devices, managing the libraries required to run the code properly on each platform.
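For illustration, a minimal sketch of loading an IR and running inference through the Inference Engine Python API (2019 R3 style) is shown below. The model path, the sample image and the single-input assumption are placeholders, not the exact code used in this project:

# Minimal sketch: load an IR (.xml/.bin pair) and run inference on one frame.
# Paths are placeholders; a single-input model (e.g. an SSD) is assumed.
import cv2
from openvino.inference_engine import IECore, IENetwork

ie = IECore()
net = IENetwork(model="frozen_inference_graph.xml",
                weights="frozen_inference_graph.bin")
exec_net = ie.load_network(network=net, device_name="CPU")

input_blob = next(iter(net.inputs))            # name of the input layer
n, c, h, w = net.inputs[input_blob].shape      # expected NCHW input shape

frame = cv2.imread("sample_frame.jpg")         # placeholder test image
blob = cv2.resize(frame, (w, h)).transpose((2, 0, 1)).reshape((n, c, h, w))

result = exec_net.infer(inputs={input_blob: blob})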
How does the application work?
The counter uses the Inference Engine included in the Intel® Distribution of OpenVINO™ Toolkit. The model used must be able to identify people in a video frame. The application counts the number of people in the current frame, the duration that a person is in the frame (time elapsed between entering and exiting the frame) and the total count of people. It then sends the data to a local web server using the Paho MQTT Python package.
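Publishing these statistics with Paho might look roughly like the following sketch; the broker port and the topic names are assumptions based on a typical setup for this kind of project, so match them to your own Mosca configuration and UI:

# Illustrative sketch: sending people-counter stats over MQTT with paho-mqtt.
# Host, port and topic names are assumptions; adjust them to your broker/UI.
import json
import paho.mqtt.client as mqtt

client = mqtt.Client()
client.connect("localhost", port=3001, keepalive=60)   # assumed Mosca MQTT port

# Per-frame count and running total, plus a person's duration once they leave.
client.publish("person", json.dumps({"count": 1, "total": 5}))
client.publish("person/duration", json.dumps({"duration": 12.4}))

client.disconnect()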
The architecture diagram of the application, including the Main script, Mosca server, UI and FFmpeg server, is shown below:
Application Requirements
Hardware Requirements
- 6th to 10th generation Intel® Core™ processor with Iris® Pro graphics or Intel® HD Graphics.
- OR use of Intel® Neural Compute Stick 2 (NCS2)
Software Requirements
- Intel® Distribution of OpenVINO™ toolkit 2019 R3 release
- Node v6.17.1
- Npm v3.10.10
- CMake
- MQTT Mosca server
- Python 3.5 or 3.6
Setup
There are three components that need to be running in separate terminals for this application to work:
To set up these servers (each started later in its own terminal window), the following commands should be executed from the main directory:
- For MQTT/Mosca server:
cd webservice/server
npm install
- For Web server:
cd ../ui
npm install
- For FFmpeg Server:
sudo apt install ffmpeg
Explaining Model Selection
The TensorFlow Object Detection Model Zoo contains many models pre-trained on the COCO dataset. For this project, several classes of models from the zoo were tested. ssd_inception_v2_coco and faster_rcnn_inception_v2_coco performed well compared to the rest, and faster_rcnn_inception_v2_coco was ultimately chosen because it detects people quickly with fewer errors. Intel OpenVINO already contains extensions for the custom layers used by models from the TensorFlow Object Detection Model Zoo, as sketched below.
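As a rough illustration of how that CPU extension can be attached and unsupported layers checked (2019 R3 Python API style; treat the exact calls and paths as assumptions):

# Sketch: add the CPU extension and report layers the CPU plugin cannot run.
# The extension path matches the -l argument used later; API is 2019 R3 style.
from openvino.inference_engine import IECore, IENetwork

CPU_EXTENSION = "/opt/intel/openvino/deployment_tools/inference_engine/lib/intel64/libcpu_extension_sse4.so"

ie = IECore()
ie.add_extension(extension_path=CPU_EXTENSION, device_name="CPU")

net = IENetwork(model="frozen_inference_graph.xml",
                weights="frozen_inference_graph.bin")

supported = ie.query_network(network=net, device_name="CPU")
unsupported = [layer for layer in net.layers if layer not in supported]
if unsupported:
    print("Layers not supported on CPU:", unsupported)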
Downloading the model from the TensorFlow Object Detection Model Zoo with the following command:
wget http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_v2_coco_2018_01_28.tar.gz
Extracting the tar.gz file by the following command:
tar -xvf faster_rcnn_inception_v2_coco_2018_01_28.tar.gz
Changing the directory to the extracted folder of the downloaded model:
cd faster_rcnn_inception_v2_coco_2018_01_28
As part of the Udacity acceptance criteria for the Nanodegree project, the model cannot be one of the existing pre-trained models provided by Intel, so the TensorFlow model is converted to the Intermediate Representation (IR), i.e. the OpenVINO IR format. The command used is given below:
python /opt/intel/openvino/deployment_tools/model_optimizer/mo.py --input_model frozen_inference_graph.pb --tensorflow_object_detection_api_pipeline_config pipeline.config --reverse_input_channels --tensorflow_use_custom_operations_config /opt/intel/openvino/deployment_tools/model_optimizer/extensions/front/tf/faster_rcnn_support.json
Comparing Model Performance
Model-1: ssd_inception_v2_coco_2018_01_28
Converted the model to the Intermediate Representation using the following command. However, this model lacked accuracy, as it did not detect people in the video reliably. Lowering the probability threshold from 0.6 to 0.4 to catch more detections did not noticeably improve the results.
python /opt/intel/openvino/deployment_tools/model_optimizer/mo.py --input_model ssd_inception_v2_coco_2018_01_28/frozen_inference_graph.pb --tensorflow_object_detection_api_pipeline_config ssd_inception_v2_coco_2018_01_28/pipeline.config --reverse_input_channels --tensorflow_use_custom_operations_config /opt/intel/openvino/deployment_tools/model_optimizer/extensions/front/tf/ssd_v2_support.json
Model-2: faster_rcnn_inception_v2_coco_2018_01_28
Converted the model to the Intermediate Representation using the following command. Model-2, faster_rcnn_inception_v2_coco, performed really well in the output video. With a probability threshold of 0.4, it works better than all the previous approaches.
python /opt/intel/openvino/deployment_tools/model_optimizer/mo.py --input_model faster_rcnn_inception_v2_coco_2018_01_28/frozen_inference_graph.pb --tensorflow_object_detection_api_pipeline_config faster_rcnn_inception_v2_coco_2018_01_28/pipeline.config --reverse_input_channels --tensorflow_use_custom_operations_config /opt/intel/openvino/deployment_tools/model_optimizer/extensions/front/tf/faster_rcnn_support.json
Comparison of the two models
Comparing the two models, ssd_inception_v2_coco and faster_rcnn_inception_v2_coco, in terms of latency and memory yielded several insights. Both latency (in microseconds) and memory footprint (in MB) decrease when the OpenVINO IR is used instead of the plain TensorFlow model, which is exactly the saving that matters for OpenVINO applications and Edge computing.
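The comparison was based on simple wall-clock timing; a rough sketch of how per-frame latency of the loaded IR model could be measured is shown below (the helper name and the single-input assumption are illustrative, not the exact benchmarking code used here):

# Sketch: average per-frame inference latency (in microseconds) of an IR model
# already loaded into an ExecutableNetwork (see the earlier loading sketch).
import time
import numpy as np

def average_latency_us(exec_net, input_blob, input_shape, runs=100):
    dummy = np.zeros(input_shape, dtype=np.float32)   # synthetic input frame
    start = time.perf_counter()
    for _ in range(runs):
        exec_net.infer(inputs={input_blob: dummy})
    return (time.perf_counter() - start) * 1e6 / runs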
Model Use Cases
This application can keep a check on the number of people in a particular area and is helpful wherever there is a restriction on the number of people allowed in that area. With some updates, it could also prove useful in the current COVID-19 scenario, i.e. keeping a check on the number of people in the frame.
Running the Main Application
After converting the downloaded model to the OpenVINO IR format, all three servers can be started in separate terminals, i.e.
- MQTT Mosca server
- Node.js* Web server
- FFmpeg server
Setting up the environment
Configuring the environment to use the Intel® Distribution of OpenVINO™ toolkit, once per terminal session, by running the following command:
source /opt/intel/openvino/bin/setupvars.sh -pyver 3.5
Further, from the main directory:
Step 1 — Starting the Mosca server
cd webservice/server/node-server
node ./server.js
The following message is displayed if successful:
Mosca server started.
Step 2 — Starting the GUI
Opening a new terminal and executing the commands below:
cd webservice/ui
npm run dev
The following message is displayed if successful:
webpack: Compiled successfully
Step 3 — FFmpeg Server
Opening a new terminal and executing the command below:
sudo ffserver -f ./ffmpeg/server.conf
Step 4 — Running the code
Opening a new terminal and executing the command below:
python main.py -i resources/Pedestrian_Detect_2_1_1.mp4 -m faster_rcnn_inception_v2_coco_2018_01_28/frozen_inference_graph.xml -l /opt/intel/openvino/deployment_tools/inference_engine/lib/intel64/libcpu_extension_sse4.so -d CPU -pt 0.4 | ffmpeg -v warning -f rawvideo -pixel_format bgr24 -video_size 768x432 -framerate 24 -i - http://0.0.0.0:3004/fac.ffm
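Inside main.py, the detections returned by the network are filtered by the probability threshold (-pt) to count people and draw boxes. A simplified sketch of that step is shown below, assuming the standard [1, 1, N, 7] detection output produced by Object Detection API models; the function and variable names are illustrative:

# Sketch: count people in one frame from a [1, 1, N, 7] detection output, where
# each row is [image_id, label, confidence, x_min, y_min, x_max, y_max].
import cv2

def count_people(detections, frame, prob_threshold=0.4):
    height, width = frame.shape[:2]
    count = 0
    for _, label, conf, x_min, y_min, x_max, y_max in detections[0][0]:
        if conf >= prob_threshold and int(label) == 1:   # class 1 = person (COCO)
            count += 1
            p1 = (int(x_min * width), int(y_min * height))
            p2 = (int(x_max * width), int(y_max * height))
            cv2.rectangle(frame, p1, p2, (0, 255, 0), 2)
    return count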
Implementation on Local Machine
The project was implemented on Udacity’s workspace. To run the same project on a local machine, some changes have to be made in the “constants.js” file located at the following path from the root directory of the project:
webservice/ui/src/constants/constants.js
In the “constants.js” file, “CAMERA_FEED_SERVER” and “MQTT_SERVER” are configured according to Udacity’s workspace. The values of these two properties should be changed to the following for the application to work properly:
CAMERA_FEED_SERVER: "http://localhost:3004"
...
MQTT_SERVER: "ws://localhost:3002"
References
This project borrows the boilerplate code from the Intel OpenVINO fundamentals project starter repository from Udacity on GitHub, which can be found at the link below.
To explore other exciting projects by Intel, check out the Intel IoT Developer Kit on GitHub which includes IoT Libraries & Code Samples from Intel.
The project can be found in my GitHub repository with detailed instructions. main.py and inference.py contain detailed code about the functioning of the application.
And with that, we have come to the end of this article. Bundle of thanks for reading it!