Object detection is the computer vision task of identifying and localising all instances of specified object classes in an image or video frame. The output is typically a set of bounding boxes — each with a class label and a confidence score. Detection combines **classification** (what is this?) with **localisation** (where is it?).
## The detection taxonomy
**Two-stage detectors** first generate region proposals (candidate bounding boxes likely to contain objects), then classify each proposal. The two stages are architecturally separate, and these detectors are generally more accurate, at the cost of speed.
- R-CNN (2014) → Fast R-CNN (2015) → Faster R-CNN (2015, with RPN): the foundational two-stage family.
**Single-stage detectors** predict class labels and bounding boxes in a single forward pass. They are faster, but historically less accurate, particularly on small objects.
- YOLO (You Only Look Once, 2016) and successive generations (YOLOv5, YOLOv8, YOLOv10): the dominant real-time detection family.
- SSD (Single Shot Detector): multi-scale detection from multiple feature map layers.
- RetinaNet: introduced focal loss to address class imbalance in dense detection.
**Transformer-based detectors** apply attention across image patches:
- DETR (2020): end-to-end detection without NMS (non-maximum suppression); reframes detection as a set prediction problem.
- Grounding DINO: open-vocabulary detection from natural language prompts — "find all yellow hard hats in the image."
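Dense detectors such as YOLO and SSD emit many overlapping boxes per object and rely on NMS to keep only the best-scoring one per object, which is the post-processing step DETR's set-prediction formulation removes. A minimal greedy NMS sketch in plain Python (illustrative, not production code; real pipelines use vectorised implementations):

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: repeatedly keep the highest-scoring box and
    discard any remaining box that overlaps it above the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```

The IoU threshold trades duplicate suppression against recall: too low and adjacent distinct objects get merged, too high and duplicates survive.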
## Evaluation metrics
**mAP** (mean Average Precision) is the standard detection metric. For each class, Average Precision is the area under the precision-recall curve of the ranked detections; mAP averages this across all classes, typically reported either at a single IoU threshold of 0.5 (mAP@50) or averaged over thresholds from 0.50 to 0.95 (the COCO convention). Higher mAP means fewer missed objects and fewer false alarms. For latency-sensitive applications, frames per second (FPS) at a given accuracy level is the relevant metric.
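The per-class AP computation can be sketched as follows. Detections are ranked by confidence; each is a true positive if it matches a ground-truth box above the IoU threshold, and precision-recall points accumulate as the ranking is traversed. This sketch uses the raw step-function area (COCO additionally interpolates the curve at 101 recall points, which this toy version omits):

```python
def average_precision(scores, matches, num_gt):
    """AP for one class: area under the precision-recall curve.

    scores  - confidence score of each detection
    matches - True where the detection matched a ground-truth box
              (IoU above the evaluation threshold)
    num_gt  - total number of ground-truth objects for this class
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    ap, prev_recall = 0.0, 0.0
    for i in order:
        if matches[i]:
            tp += 1
        else:
            fp += 1
        recall = tp / num_gt
        precision = tp / (tp + fp)
        ap += precision * (recall - prev_recall)  # rectangle under the PR curve
        prev_recall = recall
    return ap
```

mAP is then the mean of this value over classes (and, for COCO-style reporting, over IoU thresholds).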
## Applications in APAC enterprise contexts
Object detection is a production technology across AIMenta's target sectors:
- **Manufacturing**: real-time defect detection on production lines; presence/absence checks for assembly components; PPE compliance monitoring (hard hat, vest, glove detection).
- **Logistics**: package identification at high-speed conveyor belts; vehicle damage detection at fleet inspection; inventory counting in warehouse aisles via camera.
- **Retail**: shelf stock monitoring (out-of-stock detection); customer flow analysis; loss prevention (abandoned item detection at self-checkout).
- **Construction and facilities**: safety compliance monitoring; progress documentation via periodic aerial imagery.
## Deployment considerations
- **Real-time requirements**: YOLOv8n (nano) achieves roughly 37 mAP on COCO while running at 200+ FPS on a consumer GPU. For 30 FPS inline production monitoring, use a quantised YOLOv8 variant on an embedded accelerator (NVIDIA Jetson, Hailo-8).
- **Custom classes**: pre-trained models cover the 80 COCO classes (person, car, bottle, etc.). For industrial defect types or specific product SKUs, fine-tune on labelled examples. 200-500 labelled images per class is often sufficient with transfer learning.
- **Video vs image**: tracking algorithms (SORT, ByteTrack, DeepSORT) extend frame-level detection to persistent object IDs across video — required for counting, dwell-time analysis, and trajectory monitoring.
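The core idea behind trackers like SORT and ByteTrack is a frame-to-frame assignment step: match this frame's detections to existing tracks by box overlap, spawn new IDs for unmatched detections. A toy version of that matching step (no Kalman-filter motion model, no re-identification embeddings; the class name and thresholds are illustrative, and real trackers also keep lost tracks alive for several frames rather than dropping them immediately):

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

class SimpleTracker:
    """Greedy IoU tracker: persistent object IDs across video frames."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}   # track_id -> last seen box
        self.next_id = 0

    def update(self, detections):
        """Match this frame's boxes to tracks; return {track_id: box}."""
        assigned = {}
        unmatched = set(self.tracks)
        for box in detections:
            best_id, best_iou = None, self.iou_threshold
            for tid in unmatched:
                overlap = iou(self.tracks[tid], box)
                if overlap > best_iou:
                    best_id, best_iou = tid, overlap
            if best_id is None:            # no track overlaps: new object
                best_id = self.next_id
                self.next_id += 1
            else:
                unmatched.discard(best_id)
            self.tracks[best_id] = box
            assigned[best_id] = box
        for tid in unmatched:              # drop tracks with no detection
            del self.tracks[tid]
        return assigned
```

With stable IDs in hand, counting is the number of IDs ever issued, and dwell time is how many consecutive frames an ID survives.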