# Object detection with video decoding sample

This sample shows how to run video decoding and object detection using a pre-trained YoloV2 Caffe model. It demonstrates the use of the `mv_compile` utility to perform video decoding and inference.

The sample has two .cpp files, `mvobjdetect.cpp` and `visualize.cpp`. In addition, it needs the header files, .cpp files, and inference deployment library generated by the `mv_compile` utility to completely build and execute the application.
## Prerequisites

- MIVisionX
  - MIVisionX installs the model compiler at `/opt/rocm/libexec/mivisionx`
  - `mv_compile` installs at `/opt/rocm/bin` and `mvdeploy_api.h` installs at `/opt/rocm/include/mivisionx`
- Install the MIVisionX Model Compiler Prerequisites
- Add MIVisionX libraries & executables to PATH

```
export PATH=$PATH:/opt/rocm/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/rocm/lib
```
## Sample - Detection Using Pre-Trained CAFFE Model

### Step 1. Download the pre-trained YoloV2 Caffe model - yoloV2Tiny20.caffemodel

[!TIP]

```
wget https://github.com/kiritigowda/YoloV2NCS/raw/master/models/caffemodels/yoloV2Tiny20.caffemodel
```
### Step 2. Compile the model for the OpenVX backend using the mv_compile utility

The `mv_compile` utility generates the deployment library, header files, and .cpp files required to run inference for the specified model.

- Usage:

```
mv_compile --model <model_name: name of the trained model with path>                  [required]
           --install_folder <install_folder: the location for the compiled model>     [required]
           --input_dims <input_dims: n,c,h,w - batch size, channels, height, width>   [required]
           --backend <backend: name of the backend for compilation>                   [optional - default: OpenVX_Rocm_GPU]
           --fuse_cba <fuse_cba: enable/disable Convolution_bias_activation fuse mode (0/1)> [optional - default: 0]
           --quant_mode <quant_mode: fp32/fp16 quantization mode; if enabled the model and weights are converted> [optional - default: fp32]
```

[!TIP]

```
mv_compile --model yoloV2Tiny20.caffemodel --install_folder mvdeploy --input_dims 1,3,416,416
```
[!NOTE]

- The install folder will contain `libmv_deploy.so` (under ./lib), and `weights.bin` and the `mvtestdeploy` sample app (under ./bin)
- The `mv_extras` folder contains extra post-processing helper functions
- Open `mvdeploy_api.h` to go through the API functions supported for inference deployment
### Step 3. Make sure the mvtestdeploy utility runs

`mvtestdeploy` is a pre-generated application built in Step 2 which shows how to deploy inference for an input image file.

- Usage:

```
./bin/mvtestdeploy <input data file: image/video>                               [required]
                   <output file - (.bin)>                                       [required]
                   --install_folder <folder name or '.' for the current folder> [required]
                   --t <N: number of iterations>                                [required]
```

[!TIP]

```
cd mvdeploy
./bin/mvtestdeploy /opt/rocm/share/mivisionx/samples/mv_objdetect/data/images/img_04.JPG output.bin --install_folder . --t 100
```

[!NOTE] This runs inference for an input file and generates output for N iterations.
### Step 4. Build the mvobjdetect example

- `mvobjdetect` is built on top of the files generated in Step 2. It shows how to add preprocessing OpenVX nodes for video decoding and image_to_tensor conversion.
- Go through the `mvobjdetect.cpp` file.
- This exercise uses one or more video streams for input.
- The second part of the tutorial shows how to run it with multiple video files.
- Copy all files from the cloned sample folder (`mvobjdetect.cpp`, `visualize.cpp`, `visualize.h` and `CMakeLists.txt`) into the mvdeploy folder. This brings all the files into a single folder to build and run the sample.

[!TIP]

```
cp /opt/rocm/share/mivisionx/samples/mv_objdetect/mvobjdetect.cpp .
cp /opt/rocm/share/mivisionx/samples/mv_objdetect/visualize.cpp .
cp /opt/rocm/share/mivisionx/samples/mv_objdetect/visualize.h .
cp /opt/rocm/share/mivisionx/samples/mv_objdetect/CMakeLists.txt .
```
### Step 5. cmake and make mvobjdetect

[!TIP]

```
mkdir mv_build
cd mv_build
cmake ../
make -j
```

[!NOTE] If a build directory exists from a previous build, name the new build directory differently (e.g., mv_build).
### Step 6. Run object detection with video/image

- Usage:

```
Usage: mvobjdetect
  <input-data-file: .jpg, .png, .mp4, .m4v>  : filename(s) to initialize the input tensor [required]
  <output-data-file/->                       : for video, all frames are output to a single file, or '-' for no output [required]
  --install_folder <folder>                  : the location for the compiled module [required]
  --bb <channels threshold_c threshold_nms>  : bounding box detection parameters [required]
  --frames <#num/eof/loop>                   : number of frames to run inference on [optional: default till eof; loop: loop the video]
  --backend <backend>                        : name of the backend for compilation [optional: default OpenVX_Rocm_GPU]
  --argmax                                   : give argmax output in vec<label,prob> [optional: default no argmax]
  --t <N>                                    : number of iterations, to run for performance [optional: default 1]
  --hwdec <1/0>                              : use hwaccel for decoding [optional: default cpu decoding; 1: hwdec, 0: cpu dec]
  --label <file>                             : label file [optional: default yolo_v2 20 classes]
  --v                                        : if specified, visualize the result on the input image [optional: default no visualization]
```
[!TIP]

```
cd ..
./mv_build/mvobjdetect /opt/rocm/share/mivisionx/samples/mv_objdetect/data/images/img_04.JPG --install_folder . --bb 20 0.2 0.4 --v
```

[!NOTE]

- Make sure the input image and video files exist and are specified with their full paths
- For video:

```
./mv_build/mvobjdetect <PATH TO VIDEO FILE> --install_folder . --bb 20 0.2 0.4 --v
```

- For hardware decode, use `--hwdec 1`
### Step 7. Run object detection with multiple video streams (e.g., batch 4, 8, and 16)

- Go through Steps 2 to 6, this time compiling the model for a batch of 4 using `--input_dims 4,3,416,416` in Step 2.
- This sample can also do batch 8 and batch 16 decoding:
  - For batch 8: use `--input_dims 8,3,416,416` and create a `Videos_8.txt` file with 8 input streams
  - For batch 16: use `--input_dims 16,3,416,416` and create a `Videos_16.txt` file with 16 input streams

[!TIP] For batch size 4:

```
mkdir mvcompile-4 && cd mvcompile-4
wget https://github.com/kiritigowda/YoloV2NCS/raw/master/models/caffemodels/yoloV2Tiny20.caffemodel
mv_compile --model yoloV2Tiny20.caffemodel --install_folder mvdeploy-4 --input_dims 4,3,416,416
cd mvdeploy-4/
cp /opt/rocm/share/mivisionx/samples/mv_objdetect/mvobjdetect.cpp .
cp /opt/rocm/share/mivisionx/samples/mv_objdetect/visualize.cpp .
cp /opt/rocm/share/mivisionx/samples/mv_objdetect/visualize.h .
cp /opt/rocm/share/mivisionx/samples/mv_objdetect/CMakeLists.txt .
mkdir mv_build && cd mv_build
cmake ../ && make -j8
cd ../
./mv_build/mvobjdetect <Videos_4.txt> --install_folder . --bb 20 0.2 0.4 --v
```
[!IMPORTANT]

- `Videos_4.txt` has the names of the input video files with their full paths.
- Use the `../data/Videos/Videos_4.txt` file; modify it to specify your input files.
- The suffix on each line (`:0` / `:1`) chooses software or hardware mode for decoding.
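The list-file format described above can be sketched as follows; the video paths below are hypothetical placeholders, so substitute your own files:

```shell
# Hypothetical example: create a Videos_4.txt listing four input streams.
# The paths are placeholders - replace them with your own video files.
# The trailing :0 selects software decode, :1 selects hardware decode.
cat > Videos_4.txt << 'EOF'
/home/user/videos/stream1.mp4:0
/home/user/videos/stream2.mp4:0
/home/user/videos/stream3.mp4:1
/home/user/videos/stream4.mp4:1
EOF
```

The file is then passed as the first argument to `mvobjdetect` in place of an image filename.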
### Step 8. Sample output for multiple video object detection
## License

This project is licensed under the MIT License - see the LICENSE.md file for details.

## Author

Rajy Rawther - mivisionx.support@amd.com