OpenCV in C++ — Object-Detection Tutorial

1 · What Is Object Detection?

Object detection identifies and localises one or more classes of objects in an image, usually by returning bounding boxes (bb) and class scores. In OpenCV we can:

Use handcrafted features (Haar, HOG, LBP).
Leverage traditional ML (cv::CascadeClassifier, SVM).
Run deep neural nets via cv::dnn.
Subtract background for motion-based region proposals.

Below, sections progress from simple background subtraction to state-of-the-art YOLO.

2 · Motion-Based Detection with Background Subtraction

2.1 Methods `cv::createBackgroundSubtractorMOG2` / `KNN`

cv::Ptr<cv::BackgroundSubtractor> backSub =
    cv::createBackgroundSubtractorMOG2();   // or ...KNN()

cv::VideoCapture cap(0);
cv::Mat frame, mask, fg;
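while(cap.read(frame)){
    backSub->apply(frame, mask);           // learn + segment
    cv::threshold(mask, mask, 200, 255, cv::THRESH_BINARY);  // shadows are marked as 127 by default; keep only sure foreground
    cv::erode(mask, mask, cv::Mat(), { -1,-1 }, 1);   // remove speckle noise
    cv::dilate(mask, mask, cv::Mat(), { -1,-1 }, 2);  // grow the surviving blobs back

    fg = cv::Mat::zeros(frame.size(), frame.type());  // clear pixels left over from earlier frames
    frame.copyTo(fg, mask);                // show foreground pixels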
    cv::imshow("FG Mask", mask);
    cv::imshow("Objects", fg);
    if(cv::waitKey(1)==27) break;
}
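
To make the "learn + segment" step less of a black box, here is a minimal sketch of a much simpler technique than MOG2: a per-pixel running average with a fixed threshold. It is not how MOG2 or KNN work internally (those keep richer per-pixel models); the class name, the flat grayscale buffer layout, and both parameters are assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an OpenCV class): running-average background
// model for 8-bit grayscale frames stored as flat vectors.
class RunningAverageBackground {
public:
    RunningAverageBackground(double learningRate, double threshold)
        : alpha_(learningRate), threshold_(threshold) {}

    // Returns a 0/255 foreground mask, then blends the frame into the model.
    std::vector<std::uint8_t> apply(const std::vector<std::uint8_t>& frame) {
        if (model_.size() != frame.size()) {
            // First frame (or a size change): adopt it as the background.
            model_.assign(frame.begin(), frame.end());
        }
        std::vector<std::uint8_t> mask(frame.size(), 0);
        for (std::size_t i = 0; i < frame.size(); ++i) {
            const double diff = std::abs(static_cast<double>(frame[i]) - model_[i]);
            if (diff > threshold_) mask[i] = 255;                        // segment
            model_[i] = (1.0 - alpha_) * model_[i] + alpha_ * frame[i];  // learn
        }
        return mask;
    }

private:
    double alpha_;               // how quickly the background adapts (0..1)
    double threshold_;           // minimum intensity change counted as foreground
    std::vector<double> model_;  // current background estimate per pixel
};
```

A small learning rate keeps slow-moving objects in the foreground for longer, at the cost of adapting slowly to lighting changes.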

2.2 Contour Extraction

std::vector<std::vector<cv::Point>> contours;
cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
for(const auto &c : contours){
    if(cv::contourArea(c) < 500) continue; // ignore small blobs
    cv::Rect bb = cv::boundingRect(c);
    cv::rectangle(frame, bb, {0,255,0}, 2);
}
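
The filtering above relies on two geometric quantities: the enclosed area of a contour and its axis-aligned bounding box. The following plain C++ sketch shows one way to compute both, using the shoelace formula for the area. The `Pt` and `Box` types and the function names are hypothetical stand-ins, and the inclusive `+ 1` in the box size is assumed to match the convention `cv::boundingRect` uses for integer points.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt  { int x; int y; };
struct Box { int x; int y; int width; int height; };

// Polygon area via the shoelace formula (absolute value, so the
// orientation of the contour does not matter).
double polygonArea(const std::vector<Pt>& contour) {
    double twiceArea = 0.0;
    for (std::size_t i = 0; i < contour.size(); ++i) {
        const Pt& a = contour[i];
        const Pt& b = contour[(i + 1) % contour.size()];
        twiceArea += static_cast<double>(a.x) * b.y - static_cast<double>(b.x) * a.y;
    }
    return std::abs(twiceArea) / 2.0;
}

// Axis-aligned bounding box; both extreme pixels are included, hence +1.
Box boundingBox(const std::vector<Pt>& contour) {
    if (contour.empty()) return Box{0, 0, 0, 0};
    int minX = contour[0].x, maxX = contour[0].x;
    int minY = contour[0].y, maxY = contour[0].y;
    for (const Pt& p : contour) {
        minX = std::min(minX, p.x); maxX = std::max(maxX, p.x);
        minY = std::min(minY, p.y); maxY = std::max(maxY, p.y);
    }
    return Box{minX, minY, maxX - minX + 1, maxY - minY + 1};
}

// Keeps the bounding boxes of contours whose area reaches minArea.
std::vector<Box> boxesOfLargeContours(const std::vector<std::vector<Pt>>& contours,
                                      double minArea) {
    std::vector<Box> boxes;
    for (const auto& c : contours)
        if (polygonArea(c) >= minArea) boxes.push_back(boundingBox(c));
    return boxes;
}
```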

3.4 · Deep-Dive into `cv::CascadeClassifier` Methods

Below is a concise reference for the most-used public methods of cv::CascadeClassifier. Each entry lists the signature, purpose, and a small code snippet to cement the idea. (All snippets assume you have already included <opencv2/objdetect.hpp>.)

1. `bool load(const std::string &xmlPath)`

Instantiates the cascade from a trained XML. Returns true on success. If you ship your model inside the bundle (iOS) or assets/ (Android), extract its absolute path first.

cv::CascadeClassifier cascade;
if(!cascade.load("haarcascade_frontalface_default.xml")){
    throw std::runtime_error("XML not found!");
}

2. `bool empty() const`

Quick guard to verify the classifier is ready:

if(cascade.empty()) { /* handle error */ }

3. `detectMultiScale` — Core Detection Call

Signature:
`void detectMultiScale( InputArray img, std::vector<Rect>& objects, double scaleFactor = 1.1, int minNeighbors = 3, int flags = 0, Size minSize = Size(), Size maxSize = Size() )`

Param	Meaning (defaults)
`img`	8-bit grayscale frame.
`scaleFactor`	How aggressively to build the image pyramid (1.05 ⇒ finer search; 1.4 ⇒ faster). Default 1.1.
`minNeighbors`	A candidate is kept only if at least N overlapping detections cluster around it (higher ⇒ fewer false positives). Default 3.
`flags`	Legacy; only used by old-format cascades. Supply `0` or `cv::CASCADE_SCALE_IMAGE`.
`minSize` / `maxSize`	Prune windows by absolute size. An empty `Size()` means no limit.
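
To see why a smaller scaleFactor costs more time, it helps to count the pyramid levels the search visits. The sketch below is a plain C++ approximation of that loop, not OpenCV's actual implementation; `Sz`, `pyramidScales`, and the rounding choice are assumptions made for illustration.

```cpp
#include <cmath>
#include <vector>

struct Sz { int width; int height; };

// Hypothetical helper: lists the window scale factors a pyramid search
// would visit. A zero-sized minSize/maxSize means "no limit".
std::vector<double> pyramidScales(Sz image, Sz window, double scaleFactor,
                                  Sz minSize = {0, 0}, Sz maxSize = {0, 0}) {
    std::vector<double> scales;
    if (scaleFactor <= 1.0 || window.width <= 0 || window.height <= 0) return scales;
    for (double f = 1.0; ; f *= scaleFactor) {
        const int w = static_cast<int>(std::lround(window.width * f));
        const int h = static_cast<int>(std::lround(window.height * f));
        if (w > image.width || h > image.height) break;         // window no longer fits
        if (maxSize.width > 0 && maxSize.height > 0 &&
            (w > maxSize.width || h > maxSize.height)) break;   // above the size cap
        if (w < minSize.width || h < minSize.height) continue;  // too small, try next level
        scales.push_back(f);
    }
    return scales;
}
```

Under these assumptions, a 24×24 window on a 240×240 image is evaluated at roughly 25 scales with scaleFactor 1.1 but only about 7 with 1.4, which is where most of the speed difference comes from.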

Example — adjustable sensitivity slider

double sf   = cv::getTrackbarPos("Scale",  win)/100.0 + 1.05;
int    neigh = cv::getTrackbarPos("Nbs",    win);
cascade.detectMultiScale(gray, faces, sf, neigh);

3-bis. Extended overload (confidence scores)

void detectMultiScale(
    InputArray img,
    std::vector<Rect>& objects,
    std::vector<int>& rejectLevels,
    std::vector<double>& levelWeights,
    double scaleFactor = 1.1,
    int minNeighbors   = 3,
    int flags          = 0,
    Size minSize       = Size(),
    Size maxSize       = Size(),
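    bool outputRejectLevels = false );

Handy when you need a score per box (e.g. to colour boxes by confidence). Pass outputRejectLevels = true explicitly: it defaults to false, and the scores are only reported when it is set.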

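std::vector<cv::Rect> boxes;
std::vector<int>      lvl;   // cascade stage reached by each box
std::vector<double>   w;     // stage score per box (not normalised to [0,1])
cascade.detectMultiScale(gray, boxes, lvl, w, 1.1, 3, 0, cv::Size(), cv::Size(), true);
for(size_t i=0;i<boxes.size();++i){
    // 0.7 is an arbitrary cut-off; tune it for your cascade
    cv::Scalar c = w[i] > .7 ? cv::Scalar(0,255,0) : cv::Scalar(0,165,255);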
    cv::rectangle(frame, boxes[i], c, 2);
}
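
Once every box has a weight, a common next step is to rank detections and drop the weak ones. The sketch below does that on the weight vector alone, in plain C++; `rankByWeight` and its cut-off parameter are hypothetical, and because the weights are not normalised, a sensible cut-off has to be found empirically.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the indices of detections ordered from highest to lowest weight,
// dropping any whose weight is below minWeight.
std::vector<std::size_t> rankByWeight(const std::vector<double>& weights, double minWeight) {
    std::vector<std::size_t> order(weights.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    // Stable sort keeps the original order among equal weights.
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) { return weights[a] > weights[b]; });
    order.erase(std::remove_if(order.begin(), order.end(),
                               [&](std::size_t i) { return weights[i] < minWeight; }),
                order.end());
    return order;
}
```

The returned indices can be used to look up the matching entries in the boxes vector, since both vectors share the same ordering.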

4. `bool isOldFormatCascade() const`

Tests whether the loaded XML uses the legacy cascade format (pre-OpenCV 2). Rarely needed, but useful when migrating older cascade files.

5. `bool read(const FileNode& node)`

Deserialises from an .yml/.json node (when you embed the model inside a larger OpenCV cv::FileStorage). Example:

cv::FileStorage fs("models.yml", cv::FileStorage::READ);
cascade.read(fs["face_cascade"]);

6. Mask-Aware Detection

When you wish to ignore parts of the frame (e.g. UI overlays), supply a custom MaskGenerator.

struct BlackCorners : cv::BaseCascadeClassifier::MaskGenerator{
    cv::Mat generateMask(const cv::Mat& src) override{
        cv::Mat mask(src.size(), CV_8UC1, cv::Scalar(255));
        cv::rectangle(mask, {0,0,src.cols,40}, cv::Scalar(0), cv::FILLED);
        return mask;
    }
};
cascade.setMaskGenerator(cv::makePtr<BlackCorners>());

During detectMultiScale, windows fully inside black areas will be skipped, reducing false detections in HUDs or top-bar overlays.

7. Where Do These XMLs Come From?

OpenCV’s cascade XMLs are trained with opencv_traincascade. You feed it thousands of positive patches plus negative images. The tool performs AdaBoost-based feature selection, producing a multi-stage classifier that the above methods execute in real time on the CPU. Note that the tool is not built as part of OpenCV 4.x, so training is usually done with a 3.4 release.
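
The reason a multi-stage classifier is fast is early rejection: most windows fail one of the first cheap stages and never reach the later ones. The following plain C++ sketch illustrates that control flow with threshold-based weak classifiers. The `Weak` and `Stage` structs and the feature-vector abstraction are assumptions for illustration and do not mirror OpenCV's internal data structures.

```cpp
#include <cstddef>
#include <vector>

// A weak classifier looks at one feature value and casts a vote.
struct Weak {
    std::size_t featureIndex;  // which feature value to inspect
    double threshold;          // decision threshold on that feature
    double voteBelow;          // vote when feature < threshold
    double voteAbove;          // vote otherwise
};

struct Stage {
    std::vector<Weak> weaks;
    double stageThreshold;     // window is rejected if the summed votes fall below this
};

// Returns how many stages the window passed; the window is accepted
// only when the result equals stages.size().
std::size_t stagesPassed(const std::vector<Stage>& stages,
                         const std::vector<double>& features) {
    std::size_t passed = 0;
    for (const Stage& stage : stages) {
        double sum = 0.0;
        for (const Weak& w : stage.weaks) {
            if (w.featureIndex >= features.size()) return passed;  // malformed input
            sum += features[w.featureIndex] < w.threshold ? w.voteBelow : w.voteAbove;
        }
        if (sum < stage.stageThreshold) return passed;             // early rejection
        ++passed;
    }
    return passed;
}
```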

4 · HOG-Descriptor with SVM

4.1 Using the Built-in Pedestrian Detector

cv::HOGDescriptor hog;
hog.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector());

std::vector<cv::Rect> persons;
hog.detectMultiScale(frame, persons, 0, cv::Size(8,8),
                     cv::Size(32,32), 1.05, 2);
for(auto &bb : persons)
    cv::rectangle(frame, bb, {0,0,255}, 2);

4.2 Understanding Parameters

Param	Meaning
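`hitThreshold`	Minimum SVM decision value for a window to count as a detection. Lower → more detections.
`winStride`	Sliding-window step size in pixels.
`padding`	Extra pixels added around the image so that objects near the borders can still be detected.
`scale`	Scale factor between pyramid levels.
`groupThreshold`	Controls merging of overlapping detections (like `minNeighbors`); 0 disables grouping.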
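
The cost of a single pyramid level is driven mainly by winStride and padding, because together they fix how many windows are classified. The sketch below counts those windows in plain C++. `Sz` and `slidingWindowCount` are hypothetical names, and the formula assumes padding is added on both sides of each axis.

```cpp
struct Sz { int width; int height; };

// Number of detection windows evaluated on one pyramid level.
// Assumes the image is padded by `padding` on every side.
long long slidingWindowCount(Sz image, Sz window, Sz stride, Sz padding) {
    if (stride.width <= 0 || stride.height <= 0) return 0;
    const int paddedW = image.width  + 2 * padding.width;
    const int paddedH = image.height + 2 * padding.height;
    if (paddedW < window.width || paddedH < window.height) return 0;
    const long long nx = (paddedW - window.width)  / stride.width  + 1;
    const long long ny = (paddedH - window.height) / stride.height + 1;
    return nx * ny;
}
```

For a 640×480 frame with the default 64×128 people window and 32-pixel padding, a stride of 8 gives 4293 windows, while a stride of 4 gives 16905, roughly four times the work.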

5 · Modern Deep-Learning Detectors via `cv::dnn`

5.1 Loading a MobileNet-SSD (Caffe) Model

auto net = cv::dnn::readNetFromCaffe("deploy.prototxt",
                                     "mobilenet_iter_73000.caffemodel");
net.setPreferableBackend(cv::dnn::DNN_BACKEND_OPENCV);  // or CUDA
net.setPreferableTarget(cv::dnn::DNN_TARGET_CPU);       // or DNN_TARGET_CUDA

5.2 Pre-Processing & Forward Pass

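// Resize to 300x300, subtract the mean, then scale to roughly [-1, 1].
// swapRB = false: this Caffe model is assumed to take BGR input (OpenCV's own channel order).
cv::Mat blob = cv::dnn::blobFromImage(frame, 1.0/127.5,
                     cv::Size(300,300), cv::Scalar(127.5,127.5,127.5), false, false);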
net.setInput(blob);
cv::Mat out = net.forward();   // shape: [1,1,N,7]
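
The pre-processing call performs two things besides resizing: it applies (pixel - mean) × scale to every value and rearranges the interleaved image into one plane per channel. The plain C++ sketch below shows that arithmetic on a raw buffer. `toPlanarBlob` is a hypothetical helper, resizing is left out, and a single mean value is used for all channels to keep it short.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts an interleaved 8-bit HWC image into a planar CHW float tensor,
// applying (pixel - mean) * scale. Returns an empty vector on a size mismatch.
std::vector<float> toPlanarBlob(const std::vector<std::uint8_t>& hwc,
                                int height, int width, int channels,
                                float scale, float mean, bool swapFirstAndLast) {
    if (height <= 0 || width <= 0 || channels <= 0) return {};
    const std::size_t plane = static_cast<std::size_t>(height) * width;
    if (hwc.size() != plane * channels) return {};
    std::vector<float> blob(plane * channels, 0.0f);
    for (std::size_t p = 0; p < plane; ++p) {
        for (int c = 0; c < channels; ++c) {
            int dst = c;
            if (swapFirstAndLast) {             // e.g. BGR <-> RGB
                if (c == 0) dst = channels - 1;
                else if (c == channels - 1) dst = 0;
            }
            const float v = static_cast<float>(hwc[p * channels + c]);
            blob[static_cast<std::size_t>(dst) * plane + p] = (v - mean) * scale;
        }
    }
    return blob;
}
```

With scale = 1/127.5 and mean = 127.5, pixel value 0 maps to -1 and 255 maps to +1, which is the input range this kind of model is usually trained on.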

5.3 Parsing Detections

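// Each detection is a row of 7 floats: [imageId, classId, confidence, x1, y1, x2, y2],
// with coordinates normalised to [0,1].
const float *data = out.ptr<float>();
for(size_t i=0; i+7<=out.total(); i+=7){
    float conf = data[i+2];
    if(conf < 0.5f) continue;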
    int x1 = (int)(data[i+3]*frame.cols);
    int y1 = (int)(data[i+4]*frame.rows);
    int x2 = (int)(data[i+5]*frame.cols);
    int y2 = (int)(data[i+6]*frame.rows);
    cv::rectangle(frame, {x1,y1,x2-x1,y2-y1}, {0,255,0}, 2);
}
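
The loop above can be made more robust by also reading the class id and clamping boxes to the frame, since the network may report coordinates slightly outside [0,1]. The sketch below does this on a raw float buffer in plain C++. `Detection` and `parseSsdOutput` are hypothetical names, and the 7-float row layout is the one described for this SSD output.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Detection { int classId; float confidence; int x; int y; int width; int height; };

// Parses rows of [imageId, classId, confidence, x1, y1, x2, y2] with
// normalised coordinates into pixel boxes clamped to the frame.
std::vector<Detection> parseSsdOutput(const float* data, std::size_t count,
                                      int frameW, int frameH, float minConfidence) {
    std::vector<Detection> result;
    if (data == nullptr) return result;
    auto toPixel = [](float v, int size) {
        const long p = std::lround(v * static_cast<float>(size));
        return static_cast<int>(std::min<long>(std::max<long>(p, 0), size));
    };
    for (std::size_t i = 0; i + 7 <= count; i += 7) {
        const float* row = data + i;
        if (row[2] < minConfidence) continue;       // weak detection
        const int x1 = toPixel(row[3], frameW);
        const int y1 = toPixel(row[4], frameH);
        const int x2 = toPixel(row[5], frameW);
        const int y2 = toPixel(row[6], frameH);
        if (x2 <= x1 || y2 <= y1) continue;         // degenerate box after clamping
        result.push_back(Detection{static_cast<int>(row[1]), row[2],
                                   x1, y1, x2 - x1, y2 - y1});
    }
    return result;
}
```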

5.4 YOLOv5 / YOLOv8 via ONNX

Convert YOLO model to ONNX → load with readNetFromONNX. Outputs require Non-Max Suppression:

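// boxes (std::vector<cv::Rect>) and scores (std::vector<float>) are decoded from the network output beforehand
std::vector<int> idx;
cv::dnn::NMSBoxes(boxes, scores, /*scoreThresh=*/0.25f, /*nmsThresh=*/0.45f, idx);
for(int i : idx) draw_box(boxes[i]);   // draw_box: your own drawing helper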
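
For readers who want to see what Non-Max Suppression does with the two thresholds, here is a minimal greedy version in plain C++. It is a sketch of the general technique rather than OpenCV's implementation; `Box`, `iou`, and `greedyNms` are names chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Box { float x, y, w, h; };

// Intersection over union of two axis-aligned boxes.
float iou(const Box& a, const Box& b) {
    const float x1 = std::max(a.x, b.x);
    const float y1 = std::max(a.y, b.y);
    const float x2 = std::min(a.x + a.w, b.x + b.w);
    const float y2 = std::min(a.y + a.h, b.y + b.h);
    const float inter = std::max(0.0f, x2 - x1) * std::max(0.0f, y2 - y1);
    const float uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy NMS: visit boxes from best to worst score and keep a box only
// if it does not overlap an already kept box by more than iouThresh.
std::vector<int> greedyNms(const std::vector<Box>& boxes, const std::vector<float>& scores,
                           float scoreThresh, float iouThresh) {
    std::vector<int> order;
    for (std::size_t i = 0; i < boxes.size() && i < scores.size(); ++i)
        if (scores[i] >= scoreThresh) order.push_back(static_cast<int>(i));
    std::stable_sort(order.begin(), order.end(),
                     [&](int a, int b) { return scores[a] > scores[b]; });
    std::vector<int> kept;
    for (int candidate : order) {
        bool suppressed = false;
        for (int k : kept)
            if (iou(boxes[candidate], boxes[k]) > iouThresh) { suppressed = true; break; }
        if (!suppressed) kept.push_back(candidate);
    }
    return kept;
}
```

Raising the overlap threshold keeps more neighbouring boxes (useful for crowded scenes); lowering it merges more aggressively.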

6 · Tips for Training a Custom Model

Gather ≥ 2000 labelled images per class (diverse angles & lighting).
Use labelImg or Roboflow for annotation.
Augment heavily (flip, blur, HSV shift) to improve generalisation.
For classical cascades: use the opencv_traincascade CLI.
For DL: fine-tune YOLOv5/8 or SSD in PyTorch, then export to ONNX.
Quantise (INT8) & prune for edge deployment.
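
One detail that is easy to get wrong when augmenting is that geometric transforms must be applied to the annotations as well as the pixels. The plain C++ sketch below mirrors a bounding box for a horizontal flip and converts a pixel box to the normalised centre format used by YOLO-style label files. The types and function names are hypothetical.

```cpp
struct Box { int x; int y; int width; int height; };

// Normalised YOLO-style label: centre and size as fractions of the image.
struct YoloLabel { double cx; double cy; double w; double h; };

// Mirrors a box for a horizontally flipped image of the given width.
Box flipHorizontally(const Box& b, int imageWidth) {
    return Box{imageWidth - b.x - b.width, b.y, b.width, b.height};
}

// Converts a pixel box to the normalised centre format.
YoloLabel toYoloLabel(const Box& b, int imageWidth, int imageHeight) {
    return YoloLabel{
        (b.x + b.width / 2.0) / imageWidth,
        (b.y + b.height / 2.0) / imageHeight,
        static_cast<double>(b.width) / imageWidth,
        static_cast<double>(b.height) / imageHeight};
}
```

Flipping twice must return the original box, which makes a convenient sanity check for any augmentation pipeline.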

7 · Performance Guidelines

Backend & Target: Select DNN_BACKEND_CUDA + DNN_TARGET_CUDA_FP16 when a GPU is available (requires an OpenCV build with CUDA and cuDNN enabled).
Batching: Run multiple frames per forward pass on the GPU if latency allows.
TensorRT: cv::dnn has no TensorRT backend; on NVIDIA SoCs, use the CUDA backend above or run the exported ONNX model through TensorRT directly.
Threading: Run detection in a worker thread to avoid blocking the UI.
Zero Copy: On embedded systems, keep data on the GPU between stages when possible.
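
For the threading point, one common pattern is a single-slot mailbox between the capture code and the detection worker: the producer always overwrites the slot, so the worker processes the most recent frame and stale ones are dropped. The sketch below is one possible plain C++ version. `LatestSlot` is a hypothetical class, and the template parameter stands in for whatever frame type you use.

```cpp
#include <condition_variable>
#include <mutex>
#include <utility>

// Single-slot mailbox: put() overwrites any unprocessed value,
// take() blocks until a value arrives or the slot is closed.
template <typename T>
class LatestSlot {
public:
    void put(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            value_ = std::move(value);   // drop the stale value, if any
            hasValue_ = true;
        }
        ready_.notify_one();
    }

    // Wakes up waiting consumers so that they can exit.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Returns false once the slot is closed and empty.
    bool take(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return hasValue_ || closed_; });
        if (!hasValue_) return false;
        out = std::move(value_);
        hasValue_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    T value_{};
    bool hasValue_ = false;
    bool closed_ = false;
};
```

A worker thread would loop on take() and run the detector on each frame it receives, while the capture or UI thread calls put() per frame and close() on shutdown.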

8 · Embedding C++ OpenCV Object-Detection in Mobile Apps (Swift & Android / .NET MAUI)

8.1 Bridging C++ → Swift

Add OpenCV framework
⟶ File ▸ Add Packages… → search opencv2.framework (or compile & drag-in).

Create an Objective-C++ wrapper

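// DetectorWrapper.mm  (⚠️ .mm, so the file compiles as Objective-C++)
#include <opencv2/opencv.hpp>            // include OpenCV before UIKit to avoid macro clashes
#import <opencv2/imgcodecs/ios.h>        // UIImageToMat / MatToUIImage
#import <UIKit/UIKit.h>
extern "C" UIImage * detectObjects(UIImage *imgIOS){
    cv::Mat rgba, gray;
    UIImageToMat(imgIOS, rgba);                     // colour images arrive as 4-channel RGBA
    cv::cvtColor(rgba, gray, cv::COLOR_RGBA2GRAY);  // HOG accepts 8-bit 1- or 3-channel input only
    static cv::HOGDescriptor hog = []{              // HOGDescriptor has no constructor taking the detector vector
        cv::HOGDescriptor h;
        h.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector());
        return h;
    }();
    std::vector<cv::Rect> people;  hog.detectMultiScale(gray, people);
    for(const auto &bb: people) cv::rectangle(rgba, bb, cv::Scalar(0,255,0,255), 2);
    return MatToUIImage(rgba);                      // helper in OpenCV
}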

Expose to Swift via a bridging header

// DetectorWrapper.h   (added to Bridging Header)
#import <UIKit/UIKit.h>
UIImage * _Nullable detectObjects(UIImage * _Nonnull imgIOS);

Call from SwiftUI/View Controller
```
let processed = detectObjects(uiImage)
```

Notes 🛈 : keep UI work on the main thread and heavy OpenCV processing on a background DispatchQueue to avoid UI stutter.

8.2 Android Studio & .NET MAUI

8.2.1 Compile OpenCV as a Shared Library (.so)

# CMakeLists.txt (excerpt)
add_library( cvdetect SHARED Detector.cpp )
find_package( OpenCV REQUIRED )
target_link_libraries( cvdetect ${OpenCV_LIBS} )

8.2.2 JNI Wrapper

// DetectorJNI.cpp
#include <jni.h>
#include <opencv2/opencv.hpp>

extern "C"
JNIEXPORT jintArray JNICALL
Java_com_example_ObjDet_detect(JNIEnv *env,jobject, jlong addr){
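    // addr comes from Mat.getNativeObjAddr(); HOG expects an 8-bit image with 1 or 3 channels
    cv::Mat &frame = *reinterpret_cast<cv::Mat*>(addr);
    // HOGDescriptor has no constructor taking the detector vector, so set it once here
    static cv::HOGDescriptor hog = []{
        cv::HOGDescriptor h;
        h.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector());
        return h;
    }();
    std::vector<cv::Rect> boxes; hog.detectMultiScale(frame, boxes);

    jintArray out = env->NewIntArray(static_cast<jsize>(boxes.size()*4));
    if(out == nullptr) return nullptr;           // allocation failed, a Java exception is pending
    jint *buf = env->GetIntArrayElements(out,nullptr);
    if(buf == nullptr) return nullptr;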
    for(size_t i=0;i<boxes.size();++i){
        auto bb=boxes[i]; int j=i*4;
        buf[j]=bb.x; buf[j+1]=bb.y; buf[j+2]=bb.width; buf[j+3]=bb.height;
    }
    env->ReleaseIntArrayElements(out,buf,0);
    return out;
}
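
The flat [x, y, w, h, x, y, w, h, …] layout used above is also convenient for any other caller that can only exchange plain integer buffers. The sketch below isolates that packing step in plain C++, with a capacity limit so a caller-provided buffer is never overrun. `Box`, `packBoxes`, and `unpackBoxes` are hypothetical helpers, not part of OpenCV or JNI.

```cpp
#include <algorithm>
#include <vector>

struct Box { int x; int y; int width; int height; };

// Writes up to maxBoxes boxes into `out` as x, y, w, h quadruples.
// Returns the number of boxes written.
int packBoxes(const std::vector<Box>& boxes, int* out, int maxBoxes) {
    if (out == nullptr || maxBoxes <= 0) return 0;
    const int n = std::min(static_cast<int>(boxes.size()), maxBoxes);
    for (int i = 0; i < n; ++i) {
        out[4 * i]     = boxes[i].x;
        out[4 * i + 1] = boxes[i].y;
        out[4 * i + 2] = boxes[i].width;
        out[4 * i + 3] = boxes[i].height;
    }
    return n;
}

// Rebuilds boxes from a flat buffer holding `count` quadruples.
std::vector<Box> unpackBoxes(const int* flat, int count) {
    std::vector<Box> boxes;
    if (flat == nullptr || count <= 0) return boxes;
    boxes.reserve(count);
    for (int i = 0; i < count; ++i)
        boxes.push_back(Box{flat[4 * i], flat[4 * i + 1], flat[4 * i + 2], flat[4 * i + 3]});
    return boxes;
}
```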

8.2.3 Calling from .NET MAUI (C#)

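using System.Runtime.InteropServices;

public partial class Detector {
    // P/Invoke cannot supply the JNIEnv*/jobject arguments that a JNI symbol expects,
    // so this binds to a plain C export instead. cvdetect_detect is a hypothetical
    // entry point you would add to the library: it fills `boxes` with x,y,w,h
    // quadruples and returns the number of boxes written.
    [DllImport("cvdetect", EntryPoint="cvdetect_detect")]
    private static extern int Detect(IntPtr matAddr, [Out] int[] boxes, int maxBoxes);

    public static Rect[] Run(SKBitmap bitmap){
        // Convert SKBitmap → cv::Mat via OpenCV for Unity or AOT-friendly helpers
        IntPtr matAddr = /* ... */;
        const int maxBoxes = 64;
        var buf = new int[4 * maxBoxes];
        int n = Detect(matAddr, buf, maxBoxes);
        // marshal the flat int buffer → Rect[]
        var rects = new Rect[n];
        for (int i = 0; i < n; i++)
            rects[i] = new Rect(buf[4*i], buf[4*i+1], buf[4*i+2], buf[4*i+3]);
        return rects;
    }
}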

Packaging: copy armeabi-v7a, arm64-v8a, & x86_64 → Platforms/Android.
Permissions: add <uses-permission android:name="android.permission.CAMERA" /> (and optionally <uses-feature android:name="android.hardware.camera" />) in AndroidManifest.
Build Type: ensure Release uses -O3 -s flags for native code.

With these wrappers, your cross-platform MAUI UI remains in C# while the heavy lifting executes inside an optimised C++ OpenCV library.

OpenCV in C++ — Hands-On Guide to Object Detection

1 · What Is Object Detection?

2 · Motion-Based Detection with Background Subtraction

2.1 Methods `cv::createBackgroundSubtractorMOG2` / `KNN`

2.2 Contour Extraction

3.4 · Deep-Dive into `cv::CascadeClassifier` Methods

1. `bool load(const std::string &xmlPath)`

2. `bool empty() const`

3. `detectMultiScale` — Core Detection Call

3-bis. Extended overload (confidence scores)

4. `bool isOldFormatCascade() const`

5. `bool read(const FileNode& node)`

6. Mask-Aware Detection

7. Where Do These XMLs Come From?

4 · HOG-Descriptor with SVM

4.1 Using the Built-in Pedestrian Detector

4.2 Understanding Parameters

5 · Modern Deep-Learning Detectors via `cv::dnn`

5.1 Loading a MobileNet-SSD (Caffe) Model

5.2 Pre-Processing & Forward Pass

5.3 Parsing Detections

5.4 YOLOv5 / YOLOv8 via ONNX

6 · Tips for Training a Custom Model

7 · Performance Guidelines

8 · Embedding C++ OpenCV Object-Detection in Mobile Apps (Swift & Android / .NET MAUI)

8.1 Bridging C++ → Swift

8.2 Android Studio & .NET MAUI

8.2.1 Compile OpenCV as a Shared Library (.so)

8.2.2 JNI Wrapper

8.2.3 Calling from .NET MAUI (C#)