OpenCV in C++ — Hands-On Guide to Object Detection

This follow-up expands upon the foundational tutorial by diving into object-detection techniques. We walk through classical, machine-learning, and deep-learning approaches entirely in C++.

1 · What Is Object Detection?

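Object detection identifies and localises one or more classes of objects in an image, usually by returning bounding boxes (bb) and class scores. In OpenCV this can be done with motion-based background subtraction, cascade classifiers, HOG descriptors with an SVM, or deep networks run through cv::dnn.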

Below, sections progress from simple background subtraction to state-of-the-art YOLO.

2 · Motion-Based Detection with Background Subtraction

2.1 Methods cv::createBackgroundSubtractorMOG2 / KNN

cv::Ptr<cv::BackgroundSubtractor> backSub =
    cv::createBackgroundSubtractorMOG2();   // or ...KNN()

cv::VideoCapture cap(0);
cv::Mat frame, mask, fg;
while(cap.read(frame)){
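    backSub->apply(frame, mask);           // learn + segment
    cv::erode(mask, mask, cv::Mat(), {-1,-1}, 1);   // drop isolated noise pixels
    cv::dilate(mask, mask, cv::Mat(), {-1,-1}, 2);  // grow the surviving blobs back

    fg = cv::Mat::zeros(frame.size(), frame.type()); // clear pixels left from the previous frame
    frame.copyTo(fg, mask);                // show foreground pixels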
    cv::imshow("FG Mask", mask);
    cv::imshow("Objects", fg);
    if(cv::waitKey(1)==27) break;
}
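
To make the "learn + segment" step less of a black box, here is a minimal sketch of a much simpler background model: one running average per pixel. It is not the MOG2 algorithm (MOG2 keeps a mixture of Gaussians per pixel), and the names RunningAverageBg, alpha and threshold are made up for illustration. The frame is assumed to be a flat 8-bit grayscale buffer.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for a background subtractor: one running mean per pixel.
// This is NOT MOG2; it only illustrates segmenting and learning in one call.
class RunningAverageBg {
public:
    RunningAverageBg(double alpha, double threshold)
        : alpha_(alpha), threshold_(threshold) {
        assert(alpha > 0.0 && alpha <= 1.0);
    }

    // Segments `gray` against the current model, then updates the model.
    // Returns a mask with 255 for foreground and 0 for background.
    std::vector<std::uint8_t> apply(const std::vector<std::uint8_t>& gray) {
        if (model_.empty())                        // the first frame seeds the model
            model_.assign(gray.begin(), gray.end());
        assert(gray.size() == model_.size());

        std::vector<std::uint8_t> mask(gray.size(), 0);
        for (std::size_t i = 0; i < gray.size(); ++i) {
            const double diff = std::abs(gray[i] - model_[i]);
            if (diff > threshold_) mask[i] = 255;  // pixel differs from the background
            model_[i] = (1.0 - alpha_) * model_[i] + alpha_ * gray[i];  // learn
        }
        return mask;
    }

private:
    double alpha_;               // learning rate: higher adapts faster
    double threshold_;           // minimum absolute difference for foreground
    std::vector<double> model_;  // per-pixel background estimate
};
```

A larger alpha absorbs stationary objects into the background sooner, which is the same trade-off the learning rate of the real subtractors controls.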

2.2 Contour Extraction

std::vector<std::vector<cv::Point>> contours;
cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
for(const auto &c : contours){
    if(cv::contourArea(c) < 500) continue; // ignore small blobs
    cv::Rect bb = cv::boundingRect(c);
    cv::rectangle(frame, bb, {0,255,0}, 2);
}
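
The area filter and the bounding box are simple geometry. The sketch below reproduces both on plain point lists so the logic can be checked without OpenCV. Pt, IntBox and the function names are hypothetical; the area uses the shoelace formula, and the box uses an inclusive pixel convention (max - min + 1), which is an assumption about how integer contours are usually boxed.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { int x; int y; };
struct IntBox { int x; int y; int width; int height; };

// Polygon area via the shoelace formula (absolute value, so orientation is ignored).
double polygonArea(const std::vector<Pt>& c) {
    if (c.size() < 3) return 0.0;
    double twice = 0.0;
    for (std::size_t i = 0; i < c.size(); ++i) {
        const Pt& a = c[i];
        const Pt& b = c[(i + 1) % c.size()];
        twice += static_cast<double>(a.x) * b.y - static_cast<double>(b.x) * a.y;
    }
    return std::abs(twice) / 2.0;
}

// Axis-aligned box around the points; width and height count both end pixels.
IntBox boundingBox(const std::vector<Pt>& c) {
    assert(!c.empty());
    int minX = c[0].x, maxX = c[0].x, minY = c[0].y, maxY = c[0].y;
    for (const Pt& p : c) {
        minX = std::min(minX, p.x);  maxX = std::max(maxX, p.x);
        minY = std::min(minY, p.y);  maxY = std::max(maxY, p.y);
    }
    return {minX, minY, maxX - minX + 1, maxY - minY + 1};
}

// Keeps one box per contour whose area reaches minArea (small blobs are dropped).
std::vector<IntBox> filterByArea(const std::vector<std::vector<Pt>>& contours,
                                 double minArea) {
    std::vector<IntBox> boxes;
    for (const std::vector<Pt>& c : contours)
        if (polygonArea(c) >= minArea) boxes.push_back(boundingBox(c));
    return boxes;
}
```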

3.4 · Deep-Dive into cv::CascadeClassifier Methods

Below is a concise reference for the most-used public methods of cv::CascadeClassifier. Each entry lists the signature, purpose, and a small code snippet to cement the idea. (All snippets assume you have already included <opencv2/objdetect.hpp>.)

1. bool load(const std::string &xmlPath)

Instantiates the cascade from a trained XML. Returns true on success. If you ship your model inside the bundle (iOS) or assets/ (Android), extract its absolute path first.

cv::CascadeClassifier cascade;
if(!cascade.load("haarcascade_frontalface_default.xml")){
    throw std::runtime_error("XML not found!");
}

2. bool empty() const

Quick guard to verify the classifier is ready:

if(cascade.empty()) { /* handle error */ }

3. detectMultiScale — Core Detection Call

Signature:
void detectMultiScale( InputArray img, std::vector<Rect>& objects, double scaleFactor = 1.1, int minNeighbors = 3, int flags = 0, Size minSize = Size(), Size maxSize = Size() )

Meaning (defaults):
  • img: 8-bit (CV_8U) frame, normally grayscale.
  • objects: output vector, one rectangle per detection.
  • scaleFactor: how aggressively to build the image pyramid (1.05 ⇒ finer search; 1.4 ⇒ faster).
  • minNeighbors: a candidate is kept only if at least N overlapping candidates support it (higher ⇒ fewer false positives).
  • flags: legacy; only old-format cascades use it. Pass 0 or cv::CASCADE_SCALE_IMAGE.
  • minSize / maxSize: prune windows by absolute size.
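
To see why scaleFactor trades speed for thoroughness, the sketch below lists the window sizes a multi-scale search would visit along one axis. It is an approximation of the idea, not OpenCV's internal loop; pyramidWindowSizes and its parameters are hypothetical names, and a square detection window is assumed.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Lists the window side lengths visited when the base window is grown by
// scaleFactor at each level until it no longer fits the image or maxSize.
// minSize / maxSize of 0 mean "no limit", mirroring the Size() defaults.
std::vector<int> pyramidWindowSizes(int baseWindow, int imageSide, double scaleFactor,
                                    int minSize = 0, int maxSize = 0) {
    assert(baseWindow > 0 && scaleFactor > 1.0);
    if (maxSize <= 0) maxSize = imageSide;

    std::vector<int> sizes;
    for (double f = 1.0; ; f *= scaleFactor) {
        const int w = static_cast<int>(std::lround(baseWindow * f));
        if (w > maxSize || w > imageSide) break;  // window no longer fits
        if (w < minSize) continue;                // too small, skip this level
        sizes.push_back(w);
    }
    return sizes;
}
```

With a 24-pixel base window on a 480-pixel image, a factor of 1.1 yields several times more levels than 1.4, and every extra level is another full sliding-window pass.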

Example — adjustable sensitivity slider

double sf   = cv::getTrackbarPos("Scale",  win)/100.0 + 1.05;
int    neigh = cv::getTrackbarPos("Nbs",    win);
cascade.detectMultiScale(gray, faces, sf, neigh);

3-bis. Extended overload (confidence scores)

void detectMultiScale(
    InputArray img,
    std::vector<Rect>& objects,
    std::vector<int>& rejectLevels,
    std::vector<double>& levelWeights,
    double scaleFactor = 1.1,
    int minNeighbors   = 3,
    int flags          = 0,
    Size minSize       = Size(),
    Size maxSize       = Size(),
    bool outputRejectLevels = false );

Handy when you need a score per box (e.g. to draw a coloured heatmap):

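std::vector<cv::Rect> boxes;
std::vector<int>  lvl;
std::vector<double> w;
// outputRejectLevels must be passed as true, otherwise lvl and w stay empty
cascade.detectMultiScale(gray, boxes, lvl, w, 1.1, 3, 0, cv::Size(), cv::Size(), true);
for(size_t i=0;i<boxes.size();++i){
    // w[i] is a raw stage score, not a probability; tune the 0.7 cut-off per cascade
    cv::Scalar c = w[i] > .7 ? cv::Scalar(0,255,0) : cv::Scalar(0,165,255);
    cv::rectangle(frame, boxes[i], c, 2);
}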

4. bool isOldFormatCascade() const

Tests if the loaded XML stems from the legacy Viola-Jones format (pre-OpenCV 2). Rarely needed—useful when migrating historical datasets.

5. bool read(const FileNode& node)

Deserialises the cascade from a .yml/.json node, which is useful when the model is embedded inside a larger cv::FileStorage file. Example:

cv::FileStorage fs("models.yml", cv::FileStorage::READ);
cascade.read(fs["face_cascade"]);

6. Mask-Aware Detection

When you wish to ignore parts of the frame (e.g. UI overlays), supply a custom MaskGenerator.

struct BlackCorners : cv::BaseCascadeClassifier::MaskGenerator{
    cv::Mat generateMask(const cv::Mat& src) override{
        cv::Mat mask(src.size(), CV_8UC1, cv::Scalar(255));
        cv::rectangle(mask, {0,0,src.cols,40}, cv::Scalar(0), cv::FILLED);
        return mask;
    }
};
cascade.setMaskGenerator(cv::makePtr<BlackCorners>());

During detectMultiScale, windows fully inside black areas will be skipped, reducing false detections in HUDs or top-bar overlays.

7. Where Do These XMLs Come From?

OpenCV’s cascade XMLs are trained with opencv_traincascade. You feed it thousands of positive patches + negatives. The tool performs Ada-Boosted feature selection, producing a multi-stage classifier that the above methods execute in real-time on CPU.
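
The speed of a multi-stage classifier comes from early rejection: most windows fail one of the first stages and never reach the later ones. The following sketch shows that control flow with decision stumps. It assumes the feature values are already computed (real cascades derive them from integral images), and Stump, Stage and stagesPassed are illustrative names rather than OpenCV types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One boosted weak classifier: votes `below` or `atOrAbove` depending on
// how a single feature value compares with its threshold.
struct Stump {
    std::size_t feature;
    double threshold;
    double below;
    double atOrAbove;
};

// A stage accepts a window when the summed votes reach the stage threshold.
struct Stage {
    std::vector<Stump> stumps;
    double threshold;
};

// Returns how many stages the window passed before being rejected.
// A result equal to cascade.size() means the window is accepted.
std::size_t stagesPassed(const std::vector<Stage>& cascade,
                         const std::vector<double>& features) {
    for (std::size_t s = 0; s < cascade.size(); ++s) {
        double sum = 0.0;
        for (const Stump& st : cascade[s].stumps) {
            assert(st.feature < features.size());
            sum += features[st.feature] < st.threshold ? st.below : st.atOrAbove;
        }
        if (sum < cascade[s].threshold) return s;  // early reject, skip later stages
    }
    return cascade.size();
}

bool accepts(const std::vector<Stage>& cascade, const std::vector<double>& features) {
    return stagesPassed(cascade, features) == cascade.size();
}
```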

4 · HOG-Descriptor with SVM

4.1 Using the Built-in Pedestrian Detector

cv::HOGDescriptor hog;
hog.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector());

std::vector<cv::Rect> persons;
hog.detectMultiScale(frame, persons, 0, cv::Size(8,8),
                     cv::Size(32,32), 1.05, 2);
for(auto &bb : persons)
    cv::rectangle(frame, bb, {0,0,255}, 2);

4.2 Understanding Parameters

Param: Meaning
  • hitThreshold: Decision margin of the SVM. Lower → more detections.
  • winStride: Sliding-window step size.
  • padding: Extra border added around the image so windows can overlap its edges.
  • scale: Pyramid scaling between levels.
  • groupThreshold: Minimum number of overlapping detections needed to keep a merged box (like minNeighbors).
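
A rough way to judge the cost of winStride and padding is to count how many windows are evaluated at a single pyramid level. The sketch below is an approximation under stated assumptions: padding is applied on both sides, the default people detector's 64×128 window is used in the usage note, and the helper names are hypothetical.

```cpp
#include <cassert>

// Number of window positions along one axis for a padded image.
int windowsAlongAxis(int imageLen, int padding, int windowLen, int stride) {
    assert(stride > 0 && windowLen > 0 && padding >= 0);
    const int padded = imageLen + 2 * padding;   // padding assumed on both sides
    if (padded < windowLen) return 0;
    return (padded - windowLen) / stride + 1;
}

// Total windows at one pyramid level (other levels are not counted here).
long long windowCount(int imgW, int imgH, int padding,
                      int winW, int winH, int stride) {
    return static_cast<long long>(windowsAlongAxis(imgW, padding, winW, stride)) *
           windowsAlongAxis(imgH, padding, winH, stride);
}
```

For a 640×480 frame without padding, a stride of 8 gives 3285 windows, while a stride of 16 gives 851, so doubling the stride cuts the work to roughly a quarter.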

5 · Modern Deep-Learning Detectors via cv::dnn

5.1 Loading a MobileNet-SSD (Caffe) Model

auto net = cv::dnn::readNetFromCaffe("deploy.prototxt",
                                     "mobilenet_iter_73000.caffemodel");
net.setPreferableBackend(cv::dnn::DNN_BACKEND_OPENCV);  // or CUDA
net.setPreferableTarget(cv::dnn::DNN_TARGET_CPU);       // or DNN_TARGET_CUDA

5.2 Pre-Processing & Forward Pass

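// scale 1/127.5 with mean 127.5 maps 8-bit pixels to roughly [-1, 1]
cv::Mat blob = cv::dnn::blobFromImage(frame, 1.0/127.5,
                     cv::Size(300,300), cv::Scalar(127.5,127.5,127.5),
                     /*swapRB=*/false, /*crop=*/false); // Caffe MobileNet-SSD is usually trained on BGR
net.setInput(blob);
cv::Mat out = net.forward();   // shape: [1,1,N,7]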
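
The arithmetic behind the blob is worth seeing once. The sketch below converts an interleaved 8-bit image into a planar float buffer with (pixel - mean) * scale and an optional red/blue swap. It skips resizing and cropping, assumes exactly three channels, and toBlob is a hypothetical helper, not part of cv::dnn.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts interleaved HWC 8-bit pixels into a planar CHW float buffer.
// Each value becomes (pixel - mean[c]) * scale, where c is the OUTPUT channel.
// With swapRB the first and third input channels are exchanged first.
std::vector<float> toBlob(const std::vector<std::uint8_t>& hwc, int height, int width,
                          double scale, const std::array<double, 3>& mean, bool swapRB) {
    const std::size_t plane = static_cast<std::size_t>(height) * width;
    assert(hwc.size() == plane * 3);

    std::vector<float> blob(plane * 3);
    for (std::size_t p = 0; p < plane; ++p) {
        for (int c = 0; c < 3; ++c) {
            const int src = swapRB ? 2 - c : c;           // input channel to read
            const double v = hwc[p * 3 + src];
            blob[c * plane + p] = static_cast<float>((v - mean[c]) * scale);
        }
    }
    return blob;
}
```

With scale 1/127.5 and mean 127.5, a pixel value of 0 lands at about -1 and 255 at about +1, which is the input range this family of models expects.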

5.3 Parsing Detections

// Each row: [imageId, classId, confidence, x1, y1, x2, y2], coordinates normalised to [0,1]
const float *data = out.ptr<float>();
for(size_t i=0; i+7<=out.total(); i+=7){
    float conf = data[i+2];
    if(conf < 0.5f) continue;              // drop weak detections
    int x1 = (int)(data[i+3]*frame.cols);
    int y1 = (int)(data[i+4]*frame.rows);
    int x2 = (int)(data[i+5]*frame.cols);
    int y2 = (int)(data[i+6]*frame.rows);
    cv::rectangle(frame, {x1,y1,x2-x1,y2-y1}, {0,255,0}, 2);
}
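
When the boxes are needed for more than drawing, it helps to parse the rows into a small struct and clamp coordinates that fall slightly outside the image. This is a sketch on a raw float array, assuming the 7-value row layout described above; Detection and parseSsdRows are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Detection {
    int classId;
    float score;
    int x, y, width, height;   // pixel coordinates in the original image
};

// Parses `rows` rows of 7 floats: [imageId, classId, score, x1, y1, x2, y2],
// with box corners normalised to [0,1]. Corners are clamped to the image.
std::vector<Detection> parseSsdRows(const float* data, std::size_t rows,
                                    int imgW, int imgH, float minScore) {
    assert(data != nullptr || rows == 0);
    auto unit = [](float v) { return std::min(1.0f, std::max(0.0f, v)); };

    std::vector<Detection> result;
    for (std::size_t r = 0; r < rows; ++r) {
        const float* d = data + r * 7;
        const float score = d[2];
        if (score < minScore) continue;            // weak detection
        const int x1 = static_cast<int>(unit(d[3]) * imgW);
        const int y1 = static_cast<int>(unit(d[4]) * imgH);
        const int x2 = static_cast<int>(unit(d[5]) * imgW);
        const int y2 = static_cast<int>(unit(d[6]) * imgH);
        if (x2 <= x1 || y2 <= y1) continue;        // degenerate box
        result.push_back({static_cast<int>(d[1]), score, x1, y1, x2 - x1, y2 - y1});
    }
    return result;
}
```

With a cv::Mat output of shape [1,1,N,7], the call would look like parseSsdRows(out.ptr<float>(), out.total() / 7, frame.cols, frame.rows, 0.5f).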

5.4 YOLOv5 / YOLOv8 via ONNX

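Export the YOLO model to ONNX and load it with cv::dnn::readNetFromONNX. The raw output contains many overlapping candidates for the same object, so decode it into boxes and scores first and then apply Non-Max Suppression: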

std::vector<int> idx;
cv::dnn::NMSBoxes(boxes, scores, /*scoreThresh=*/0.25, /*nmsThresh=*/0.45, idx);
for(int i : idx) draw_box(boxes[i]);
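
For intuition about what the suppression step does, here is a minimal greedy NMS on plain structs. It is a simplified sketch of the general technique and does not claim to match every detail of cv::dnn::NMSBoxes; NmsBox, iou and greedyNms are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct NmsBox { float x, y, w, h; };

// Intersection over union of two axis-aligned boxes.
float iou(const NmsBox& a, const NmsBox& b) {
    const float ix = std::max(0.0f, std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x));
    const float iy = std::max(0.0f, std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y));
    const float inter = ix * iy;
    const float uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy NMS: visit boxes from highest to lowest score and keep a box only
// if it does not overlap an already kept box by more than iouThresh.
// Returns indices into `boxes`, ordered by descending score.
std::vector<int> greedyNms(const std::vector<NmsBox>& boxes,
                           const std::vector<float>& scores,
                           float scoreThresh, float iouThresh) {
    assert(boxes.size() == scores.size());
    std::vector<int> order;
    for (std::size_t i = 0; i < scores.size(); ++i)
        if (scores[i] > scoreThresh) order.push_back(static_cast<int>(i));
    std::stable_sort(order.begin(), order.end(),
                     [&](int a, int b) { return scores[a] > scores[b]; });

    std::vector<int> keep;
    for (int cand : order) {
        bool suppressed = false;
        for (int k : keep) {
            if (iou(boxes[cand], boxes[k]) > iouThresh) { suppressed = true; break; }
        }
        if (!suppressed) keep.push_back(cand);
    }
    return keep;
}
```

Lowering the IoU threshold removes more near-duplicates but can also merge objects that genuinely stand close together.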

6 · Tips for Training a Custom Model

7 · Performance Guidelines

  1. Backend & Target: Select DNN_BACKEND_CUDA + DNN_TARGET_CUDA_FP16 when a GPU is available.
  2. Batching: Run multiple frames per forward on the GPU if latency allows.
  3. TensorRT: For NVIDIA SoCs, compile OpenCV with TensorRT for massive acceleration.
  4. Threading: Encapsulate detection in worker threads to avoid UI blocking.
  5. Zero Copy: On embedded systems, keep data on GPU when possible.
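
The threading advice can be sketched with a single-slot mailbox: the capture or UI thread overwrites the slot with the newest frame and a worker thread processes whatever is current, so slow detection drops frames instead of building a backlog. LatestSlot, drain and demo are hypothetical names, and an int stands in for a frame so the sketch has no OpenCV dependency.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Holds at most one pending item; pushing again overwrites the old one.
template <typename T>
class LatestSlot {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(m_);
            value_ = std::move(value);
            has_ = true;
        }
        ready_.notify_one();
    }
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        ready_.notify_all();
    }
    // Blocks until an item is available. Returns false once the slot is
    // closed and no pending item is left.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(m_);
        ready_.wait(lock, [this] { return has_ || closed_; });
        if (!has_) return false;
        out = std::move(value_);
        has_ = false;
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable ready_;
    T value_{};
    bool has_ = false;
    bool closed_ = false;
};

// Worker loop: runs `detect` on every item the slot yields until it closes.
template <typename T, typename Fn>
void drain(LatestSlot<T>& slot, Fn detect) {
    T item;
    while (slot.pop(item)) detect(item);
}

// Demo: the caller pushes frame ids while a worker records the ones it
// actually processed. Skipped ids are expected when the producer is faster.
std::vector<int> demo(int frameCount) {
    LatestSlot<int> slot;
    std::vector<int> processed;
    std::thread worker([&] { drain(slot, [&](int id) { processed.push_back(id); }); });
    for (int id = 0; id < frameCount; ++id) slot.push(id);
    slot.close();
    worker.join();
    return processed;
}
```

The design choice here is to favour latency over completeness: the worker always sees the most recent frame, which is usually what a live preview wants.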

8 · Embedding C++ OpenCV Object-Detection in Mobile Apps (Swift & Android / .NET MAUI)

8.1 Bridging C++ → Swift

  1. Add OpenCV framework
    File ▸ Add Packages… → search opencv2.framework (or compile & drag-in).
  2. Create an Objective-C++ wrapper
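    // DetectorWrapper.mm  (⚠️ .mm, so the file compiles as Objective-C++)
    #include <opencv2/opencv.hpp>              // include before UIKit headers
    #include <opencv2/imgcodecs/ios.h>         // UIImageToMat / MatToUIImage
    #import <UIKit/UIKit.h>
    extern "C" UIImage * detectObjects(UIImage *imgIOS){
        cv::Mat rgba, bgr;
        UIImageToMat(imgIOS, rgba);                  // typically 4-channel RGBA
        cv::cvtColor(rgba, bgr, cv::COLOR_RGBA2BGR); // HOG expects 1 or 3 channels
        // HOGDescriptor has no constructor taking SVM weights, so set them once.
        static cv::HOGDescriptor hog = []{
            cv::HOGDescriptor h;
            h.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector());
            return h;
        }();
        std::vector<cv::Rect> people;  hog.detectMultiScale(bgr, people);
        for(auto &bb: people) cv::rectangle(rgba, bb, cv::Scalar(0,255,0,255), 2); // opaque green
        return MatToUIImage(rgba);                   // helper in OpenCV
    }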
  3. Expose to Swift via a bridging header
    // DetectorWrapper.h   (added to Bridging Header)
    #import <UIKit/UIKit.h>
    UIImage * _Nullable detectObjects(UIImage * _Nonnull imgIOS);
  4. Call from SwiftUI/View Controller
    let processed = detectObjects(uiImage)

Notes 🛈 : keep UI work on the main thread and heavy OpenCV processing on a background DispatchQueue to avoid UI stutter.

8.2 Android Studio & .NET MAUI

8.2.1 Compile OpenCV as a Shared Library (.so)

# CMakeLists.txt (excerpt)
add_library( cvdetect SHARED Detector.cpp )
find_package( OpenCV REQUIRED )
target_link_libraries( cvdetect ${OpenCV_LIBS} )

8.2.2 JNI Wrapper

// DetectorJNI.cpp
#include <jni.h>
#include <opencv2/opencv.hpp>

extern "C"
JNIEXPORT jintArray JNICALL
Java_com_example_ObjDet_detect(JNIEnv *env,jobject, jlong addr){
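    cv::Mat &frame = *reinterpret_cast<cv::Mat*>(addr);   // must be 8-bit with 1 or 3 channels
    // HOGDescriptor has no constructor taking SVM weights, so set them once.
    static cv::HOGDescriptor hog = []{
        cv::HOGDescriptor h;
        h.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector());
        return h;
    }();
    std::vector<cv::Rect> boxes; hog.detectMultiScale(frame, boxes);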

    jintArray out = env->NewIntArray(boxes.size()*4);
    jint *buf = env->GetIntArrayElements(out,nullptr);
    for(size_t i=0;i<boxes.size();++i){
        auto bb=boxes[i]; int j=i*4;
        buf[j]=bb.x; buf[j+1]=bb.y; buf[j+2]=bb.width; buf[j+3]=bb.height;
    }
    env->ReleaseIntArrayElements(out,buf,0);
    return out;
}

8.2.3 Calling from .NET MAUI (C#)

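using System.Runtime.InteropServices;

public partial class Detector {
    // P/Invoke binds to plain C symbols. The JNI function above expects
    // (JNIEnv*, jobject, jlong) and cannot be called this way, so this sketch
    // assumes the library also exports a hypothetical plain-C entry point:
    //   extern "C" int cvdetect_detect(void* mat, int* outXYWH, int maxBoxes);
    [DllImport("cvdetect", EntryPoint="cvdetect_detect")]
    private static extern int Detect(IntPtr matAddr, [Out] int[] xywh, int maxBoxes);

    public static Rect[] Run(SKBitmap bitmap){
        // Convert SKBitmap → cv::Mat via OpenCV for Unity or AOT-friendly helpers
        IntPtr matAddr = /* ... */;
        const int maxBoxes = 64;
        var buf = new int[4 * maxBoxes];
        int n = Detect(matAddr, buf, maxBoxes);    // number of boxes written
        var rects = new Rect[n];
        for (int i = 0; i < n; ++i)
            rects[i] = new Rect(buf[4*i], buf[4*i+1], buf[4*i+2], buf[4*i+3]);
        return rects;
    }
}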
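
On the native side, a function that P/Invoke can bind needs C linkage and should return boxes through a caller-owned buffer rather than a JNI array. The sketch below shows only that packing step, with a stand-in CBox struct so it compiles without OpenCV. The names packBoxes and cvdetect_copy_boxes are hypothetical; a real entry point would first run the detector on the cv::Mat and then pack the resulting rectangles in the same way.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for cv::Rect so the sketch has no OpenCV dependency.
struct CBox { int x; int y; int width; int height; };

// Writes boxes as [x, y, w, h, x, y, w, h, ...] into a caller-owned buffer
// with room for maxBoxes boxes. Returns the number of boxes written.
int packBoxes(const std::vector<CBox>& boxes, int* outXYWH, int maxBoxes) {
    assert(outXYWH != nullptr && maxBoxes >= 0);
    const int n = std::min(static_cast<int>(boxes.size()), maxBoxes);
    for (int i = 0; i < n; ++i) {
        outXYWH[4 * i + 0] = boxes[i].x;
        outXYWH[4 * i + 1] = boxes[i].y;
        outXYWH[4 * i + 2] = boxes[i].width;
        outXYWH[4 * i + 3] = boxes[i].height;
    }
    return n;
}

// Hypothetical C-ABI wrapper: no C++ types cross the boundary, and invalid
// arguments are reported as -1 instead of throwing across the FFI.
extern "C" int cvdetect_copy_boxes(const CBox* boxes, int count,
                                   int* outXYWH, int maxBoxes) {
    if (boxes == nullptr || outXYWH == nullptr || count < 0 || maxBoxes < 0)
        return -1;
    return packBoxes(std::vector<CBox>(boxes, boxes + count), outXYWH, maxBoxes);
}
```

Letting the caller own the buffer keeps allocation and freeing on one side of the boundary, which avoids mismatched allocators between the managed and native runtimes.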

With these wrappers, your cross-platform MAUI UI remains in C# while the heavy lifting executes inside an optimised C++ OpenCV library.