While studying machine learning and deep learning, I frequently came across the term annotation. To better understand its meaning, I decided to organize my thoughts on the topic.


1) What Is Annotation?

By definition, annotation means "adding notes or comments to explain something." In Korean, it is defined as:

"Providing an easy explanation for a word or sentence, or such a written note."

In programming and data contexts, annotation carries a similar meaning. A simple way to understand it is:

  • In programming, annotation refers to adding comments to code. For example, in languages like C, comments are written after // to explain what the code does. (See Figure 1)
  • In data-related fields, annotation refers to labeling data to describe it. Labeling, as shown in Figure 2, involves adding metadata to objects recognized in an image. For instance, in an object detection task, each bounding box is labeled with a category such as Car or Person.
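To make the data-side meaning concrete, here is a minimal sketch of what a labeled image record might look like. The file name, field names, and coordinates are all hypothetical, chosen only for illustration; real annotation tools each define their own format.

```python
from collections import Counter

# Hypothetical annotation record for one image.
# "box" holds two diagonal corners as (x1, y1, x2, y2) in pixels.
annotation = {
    "image": "street_001.jpg",
    "objects": [
        {"label": "Car", "box": (34, 120, 210, 260)},
        {"label": "Person", "box": (250, 90, 300, 240)},
        {"label": "Car", "box": (320, 130, 470, 250)},
    ],
}


def count_labels(record):
    """Count how many annotated objects carry each class label."""
    return Counter(obj["label"] for obj in record["objects"])


print(count_labels(annotation))
```

The labels and boxes are the metadata; the image itself is left unchanged.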

The Purpose of Annotation

Annotation helps others, whether programmers or machine learning models, understand code or data more easily.


2) Types of Data Annotation

From a data perspective, annotation can be classified by how objects are marked in an image. The idea is similar to cutting an object out of an image (also known as "object masking"), and the types below differ in how closely they follow the object's outline.

① Bounding Box

The bounding box is the most common annotation method in object detection. A tight rectangular box is drawn around an object, and a class label is assigned.

  • It is easy to implement since only two diagonally opposite corner points need to be marked.
  • However, it lacks detail, as it does not perfectly capture the object's shape.
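As a rough sketch of the "two diagonal points" idea, the helper below turns two clicked corners into a normalized box. The function names and the (x_min, y_min, width, height) layout are my own assumptions, not a specific tool's format.

```python
def box_from_corners(p1, p2):
    """Normalize two diagonal corner points into (x_min, y_min, width, height).

    The corners may be given in any order, e.g. bottom-right first.
    """
    (x1, y1), (x2, y2) = p1, p2
    return (min(x1, x2), min(y1, y2), abs(x2 - x1), abs(y2 - y1))


def box_area(box):
    """Area of a box given as (x_min, y_min, width, height)."""
    _, _, width, height = box
    return width * height


# Hypothetical labeled box: the annotator dragged from bottom-right to top-left.
car_box = {"label": "Car", "box": box_from_corners((210, 260), (34, 120))}
print(car_box["box"], box_area(car_box["box"]))
```

Normalizing the corner order means it does not matter in which direction the annotator drags the box.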

② Polygon

A polygon annotation uses multiple points to outline the exact shape of an object.

  • This method provides high accuracy and detail.
  • However, it requires more effort, making it labor-intensive.
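The extra detail of a polygon can be illustrated by comparing its area with the area of the box that encloses it. This is only a sketch: the shoelace formula is a standard way to compute polygon area, and the function names and the triangle below are made up for the example.

```python
def polygon_area(points):
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def enclosing_box(points):
    """Smallest axis-aligned box (x_min, y_min, width, height) around the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))


# Hypothetical triangular object.
outline = [(0, 0), (4, 0), (0, 3)]
_, _, w, h = enclosing_box(outline)
print(polygon_area(outline), w * h)
```

Here the polygon covers only half of its enclosing box, which is the background area a bounding box label would wrongly include.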

③ Point

In point annotation, objects are marked using single points.

  • It is very easy to implement but limited in capturing object features.
  • This method works well when identifying distinctive key features or counting objects in an image.
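For the counting use case, a point annotation can be as small as one (x, y) pair per object. The sketch below counts how many annotated points fall inside a region of interest; the function name and the region layout (x_min, y_min, x_max, y_max) are assumptions for illustration.

```python
def count_in_region(points, region):
    """Count points that lie inside an axis-aligned region (borders included)."""
    x_min, y_min, x_max, y_max = region
    return sum(
        1 for x, y in points
        if x_min <= x <= x_max and y_min <= y <= y_max
    )


# Hypothetical point annotations, one per person in a crowd image.
people = [(1, 1), (5, 5), (2, 3), (9, 2)]
print(count_in_region(people, (0, 0, 3, 3)))
```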

④ Keypoint

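Keypoint annotation is used when the shape or structure of an object needs to be detected, rather than only its location.

  • It combines ideas from polygon and point annotation: a fixed set of points is placed on the object to describe its outline or structure.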
  • Keypoint metadata includes the number of points and their order, ensuring consistency across similar objects.
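The point about the number and order of keypoints can be sketched as follows. I assume a flat list of (x, y, v) triples, where v is a visibility flag, similar in spirit to COCO-style keypoint data; the keypoint names and the helper function are hypothetical.

```python
# Hypothetical keypoint schema: the order is fixed for every annotated object.
KEYPOINT_NAMES = ["nose", "left_eye", "right_eye", "left_shoulder", "right_shoulder"]


def name_keypoints(flat, names=KEYPOINT_NAMES):
    """Map a flat [x, y, v, x, y, v, ...] list onto the named keypoints.

    Raises ValueError if the number of values does not match the schema.
    """
    if len(flat) != 3 * len(names):
        raise ValueError(f"expected {3 * len(names)} values, got {len(flat)}")
    return {name: tuple(flat[3 * i:3 * i + 3]) for i, name in enumerate(names)}


# Assumed convention: v = 0 means not labeled, v > 0 means labeled.
person = name_keypoints([10, 20, 2, 8, 18, 2, 12, 18, 2, 0, 0, 0, 15, 40, 1])
print(person["right_eye"])
```

Because every object uses the same names in the same order, the third triple always means "right_eye", which is what keeps the annotations consistent across similar objects.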

⑤ Polyline

A polyline is created by connecting multiple points with line segments. Unlike a polygon, the shape is left open.

  • This method is useful for detecting continuous structures like roads, lane markings, or other boundary lines.
  • It is commonly used in autonomous driving and ADAS (advanced driver-assistance systems) to detect lanes.
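A polyline annotation is simply an ordered list of points, as the sketch below shows by computing the length of a lane marking. The function name and coordinates are made up for illustration; `math.dist` requires Python 3.8 or later.

```python
import math


def polyline_length(points):
    """Total length of an open polyline given as ordered (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


# Hypothetical lane marking in pixel coordinates.
lane = {"label": "lane_marking", "points": [(0, 0), (3, 4), (3, 10)]}
print(polyline_length(lane["points"]))
```

The last point is not joined back to the first, which is what distinguishes a polyline from a polygon.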

⑥ Cuboid

A cuboid annotation extends bounding boxes to 3D.

  • Unlike the 2D bounding box, cuboids provide depth information.
  • This method is widely used in autonomous driving to track vehicles in a 3D space.
  • More detailed metadata, such as depth, can help a neural network perform better.
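One common way to describe a cuboid is by its center, its size, and a rotation about the vertical axis. The sketch below assumes that parameterization (center, size, yaw) and derives the eight corners from it; the names and units are illustrative and not taken from any particular dataset.

```python
import math


def cuboid_corners(center, size, yaw=0.0):
    """Return the 8 corners of a cuboid rotated by `yaw` radians about the z-axis.

    center: (cx, cy, cz); size: (length, width, height).
    """
    cx, cy, cz = center
    length, width, height = size
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-length / 2, length / 2):
        for dy in (-width / 2, width / 2):
            for dz in (-height / 2, height / 2):
                # Rotate the offset in the x-y plane, then shift to the center.
                x = cx + dx * c - dy * s
                y = cy + dx * s + dy * c
                corners.append((x, y, cz + dz))
    return corners


# Hypothetical vehicle annotation, units assumed to be meters.
car = {"label": "Car", "center": (10.0, 2.0, 0.8), "size": (4.0, 2.0, 1.6), "yaw": 0.0}
corners = cuboid_corners(car["center"], car["size"], car["yaw"])
print(len(corners))
```

Compared with a 2D box, the extra height and z values are where the depth information lives.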

Conclusion

Understanding these annotation types will help you choose the right labeling method based on the needs of your machine learning model. Selecting the appropriate annotation style is essential for building high-quality datasets.
