Lesson 30  Object Detection Basics

Object detection is a technology for finding the rectangular area that surrounds an object in an image. The detected area is called a bounding box. A variety of methods have been proposed, and most of them follow the steps below.

1.  Object Area Proposal

The system proposes multiple bounding boxes as candidate object areas.

For example, the system places a bounding box at a starting position and then moves it by a fixed step to create multiple candidates. This algorithm is called the sliding window method. It is easy to understand, but its calculation load is high for large images, because the number of candidate boxes grows with the image size.

Other methods have been proposed to solve that issue. Please refer to other books or documents for the details.
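The sliding window method described above can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the function name `sliding_windows`, the `(x, y, width, height)` box format, and the `stride` parameter are choices made for this example, not part of any particular library.

```python
def sliding_windows(image_width, image_height, box_width, box_height, stride):
    """Return candidate boxes as (x, y, width, height) tuples.

    The box starts at the top-left corner and is moved by `stride`
    pixels horizontally and vertically while it still fits in the image.
    """
    boxes = []
    for y in range(0, image_height - box_height + 1, stride):
        for x in range(0, image_width - box_width + 1, stride):
            boxes.append((x, y, box_width, box_height))
    return boxes


# A tiny 8x6 image with a 4x4 box moved 2 pixels at a time.
small = sliding_windows(8, 6, 4, 4, 2)

# A 1920x1080 image with a 64x64 box moved 16 pixels at a time.
large = sliding_windows(1920, 1080, 64, 64, 16)

print(len(small), len(large))
```

With these assumed settings the tiny image yields 6 candidate boxes, while the 1920x1080 image yields 7488 for a single box size. Each candidate must be classified in the next step, which is where the high calculation load comes from.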

2.  Classification

The system classifies the image region extracted from each proposed bounding box. If you are trying to detect a train, the algorithm is expected to calculate the probability that the bounding box contains a train.
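The classification step can be pictured as "crop each box, then score the crop". The sketch below shows only that flow. The scoring function `toy_train_probability` is a placeholder invented for this example (it simply measures the fraction of non-zero pixels) and is not a real classifier; in practice a trained model would take its place. The image is assumed to be a list of pixel rows and boxes are assumed to be `(x, y, width, height)` tuples.

```python
def crop(image, box):
    """Extract the region of `image` (a list of pixel rows) inside `box`."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def toy_train_probability(patch):
    """Placeholder scorer, NOT a real classifier.

    Returns the fraction of non-zero pixels in the patch, standing in
    for the probability a trained model would output.
    """
    pixels = [p for row in patch for p in row]
    return sum(1 for p in pixels if p > 0) / len(pixels)


# A 6x4 toy image in which the "object" is the block of 1s.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

# Three hypothetical proposals from the previous step.
boxes = [(0, 0, 3, 2), (1, 1, 3, 2), (3, 2, 3, 2)]

# One (box, probability) pair per proposal.
scores = [(box, toy_train_probability(crop(image, box))) for box in boxes]

for box, probability in scores:
    print(box, round(probability, 2))
```

The box that covers the object exactly receives the highest score, and boxes that only partly overlap it receive lower ones. This is why several nearby boxes can all report a fairly high probability for the same object.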

3.  Non-maximal Suppression

Multiple bounding boxes may be detected for one specific object. To keep only one box, the algorithm compares the probabilities of those boxes and keeps only the result with the maximum value. This process is called non-maximal suppression (also written non-maximum suppression, or NMS).

Suppose you have three bounding boxes as follows.

The maximum probability is 90%. Therefore, we keep only the box with the value of 90% and discard the others.

In the end, you have one bounding box whose probability is sufficiently high.
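The selection above can be sketched as follows. Two things in this example are assumptions rather than part of the lesson: the box coordinates and the two lower probabilities (75% and 60%) are made-up values, and the decision that boxes belong to the same object is made with an overlap measure called intersection over union (IoU) and an assumed threshold of 0.5, which is a common convention.

```python
def iou(a, b):
    """Intersection over union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def non_maximum_suppression(detections, iou_threshold=0.5):
    """Keep the highest-probability box of each group of overlapping boxes.

    `detections` is a list of (box, probability) pairs.
    """
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)  # highest remaining probability
        kept.append(best)
        # Drop every box that overlaps the kept box too much.
        remaining = [d for d in remaining
                     if iou(best[0], d[0]) < iou_threshold]
    return kept


# Three overlapping boxes on the same object (illustrative values).
detections = [
    ((10, 10, 100, 60), 0.90),
    ((14, 12, 100, 60), 0.75),
    ((8, 6, 100, 60), 0.60),
]

kept = non_maximum_suppression(detections)
print(kept)
```

Because the three boxes overlap heavily, only the 90% box survives. Boxes that do not overlap it, for example a detection of a second object elsewhere in the image, would be kept as separate results.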

Recently, deep learning is often adopted for the whole object detection process. However, it is still worthwhile to know the basic method above in order to understand the details of the calculation.
