Image and Video Annotation: More Than Meets the Eye

In addition to annotating images and video for computer vision and other applications, we annotate types of data that aren’t captured by the human eye — like lidar, radar, infrared and x-ray data — for use in diagnostics, maintenance and more.

RGB images
Lidar and Radar
Point Clouds

Data Annotation Services for Image and Video

With our flexible suite of data labeling tools, we support an ever-growing range of methods for image and video annotation. Don’t see a service you need, or have an edge case? Contact us – we love a challenge.

2D and 3D Bounding Boxes

Annotators draw 2D or 3D boxes around objects of interest and label them, for example by type of object or attributes. For 3D boxes, also known as cuboids, annotators need to mark anchor points at the edges of objects to delineate length, width and depth.
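As a rough illustration of what a box annotation might contain, here is a minimal sketch of a 2D box and a cuboid record. The class names, field names and coordinate conventions are hypothetical and chosen only for this example; they do not describe any particular tool's export format.

```python
from dataclasses import dataclass, field


@dataclass
class Box2D:
    """Axis-aligned 2D box in pixel coordinates (hypothetical schema)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    attributes: dict = field(default_factory=dict)

    def area(self) -> float:
        # Clamp at zero so an inverted box never reports a negative area.
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


@dataclass
class Cuboid:
    """3D box given by a center point plus length, width and depth (hypothetical schema)."""
    label: str
    center: tuple  # (x, y, z), assumed to be in meters
    length: float
    width: float
    depth: float
    attributes: dict = field(default_factory=dict)

    def volume(self) -> float:
        return self.length * self.width * self.depth


# Example records: the label says what the object is, attributes add detail.
car = Box2D("car", 40, 60, 200, 180, {"occluded": False})
van = Cuboid("van", (12.0, -3.5, 0.9), length=5.2, width=2.0, depth=1.8)
```

Keeping the label separate from free-form attributes lets the same record carry both the object type and details such as occlusion.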


Polygons

To precisely identify irregularly shaped objects of interest, annotators can use different types of polygons, which they then label by type and attribute. Machine learning models can assist this process by recognizing the object and recommending corner points for the polygon.
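To show why a polygon can fit an irregular object more tightly than a box, the sketch below computes a polygon's area with the shoelace formula and compares it with the area of its enclosing box. The record layout and the sample corner points are invented for illustration.

```python
def polygon_area(points):
    """Area of a simple (non-self-intersecting) polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def enclosing_box(points):
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) containing all points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical polygon annotation: corner points in pixel coordinates.
annotation = {
    "label": "traffic_sign",
    "attributes": {"damaged": False},
    "points": [(10, 10), (50, 10), (60, 40), (30, 60), (5, 40)],
}

area = polygon_area(annotation["points"])
x_min, y_min, x_max, y_max = enclosing_box(annotation["points"])
box_area = (x_max - x_min) * (y_max - y_min)
print(f"polygon covers {area / box_area:.0%} of its enclosing box")
```

The remaining share of the enclosing box is background that a box annotation would have included in the object.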

Lines and Splines

Annotators draw straight or curved lines that mark boundaries of an object, for example lanes on a road. This type of annotation helps train AI models for autonomous vehicles to detect lanes while driving.
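One way a curved line can be stored is as a handful of control points that are expanded into a smooth curve when needed. The sketch below uses a uniform Catmull-Rom spline for this; that choice is an assumption made for the example, not a statement about which spline type any annotation tool uses, and the lane coordinates are invented.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Point on the uniform Catmull-Rom segment between p1 and p2, for t in [0, 1]."""
    def axis(a0, a1, a2, a3):
        return 0.5 * (
            2 * a1
            + (-a0 + a2) * t
            + (2 * a0 - 5 * a1 + 4 * a2 - a3) * t ** 2
            + (-a0 + 3 * a1 - 3 * a2 + a3) * t ** 3
        )
    return (axis(p0[0], p1[0], p2[0], p3[0]), axis(p0[1], p1[1], p2[1], p3[1]))


def sample_spline(control_points, samples_per_segment=10):
    """Expand control points into a dense polyline that passes through every one."""
    # Repeat the end points so the curve starts and ends on the first and last point.
    pts = [control_points[0]] + list(control_points) + [control_points[-1]]
    curve = []
    for i in range(len(pts) - 3):
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            curve.append(catmull_rom(pts[i], pts[i + 1], pts[i + 2], pts[i + 3], t))
    curve.append((float(control_points[-1][0]), float(control_points[-1][1])))
    return curve


# Hypothetical lane marking: four control points in pixel coordinates.
lane = [(0, 0), (10, 2), (20, 8), (30, 18)]
curve = sample_spline(lane)
```

A curve of this kind passes through every control point, so the annotator's placed points stay on the final line.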

Landmark Annotation

For facial traits, gestures and postures, annotators draw points on specific features or “landmarks” on the face or body. Computer vision algorithms can use this data to interpret emotions and gestures, recognize people through facial recognition or assess posture changes in sports.
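Landmarks are often easier to compare across images when they are expressed relative to the face or body region they belong to. The sketch below is a minimal example of that normalization; the landmark names, the box and the coordinates are hypothetical.

```python
def normalize_landmarks(landmarks, box):
    """Map pixel landmarks into [0, 1] coordinates relative to a bounding box."""
    x_min, y_min, x_max, y_max = box
    width = x_max - x_min
    height = y_max - y_min
    return {
        name: ((x - x_min) / width, (y - y_min) / height)
        for name, (x, y) in landmarks.items()
    }


# Hypothetical face annotation: a box plus named points in pixel coordinates.
face_box = (100, 100, 300, 350)
face_landmarks = {
    "left_eye": (150, 175),
    "right_eye": (250, 175),
    "nose_tip": (200, 225),
    "mouth_center": (200, 300),
}

normalized = normalize_landmarks(face_landmarks, face_box)
```

After normalization, the same facial layout yields similar coordinates whether the face fills the frame or only a small part of it.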

Optical Character Recognition

To process text from still images or video, we extract the text from images or video frames and convert it into machine-encoded text. These data sets can provide the basis for training systems that interpret traffic signs, license plates, serial numbers, documents and IDs.

Image Classification

For image classification, annotators assign a single label to an entire image to categorize it by type. A machine learning system can use this data, for example, to learn to differentiate skin conditions such as melanoma, melanocytic nevus, actinic keratosis and benign keratosis.

Semantic Segmentation

While bounding boxes and polygons annotate only certain objects of interest, semantic segmentation annotates every pixel in the image. This segments the image into categories that are identified by a color code. For example, vehicles are colored blue and pedestrians are colored red.
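The color coding can be pictured as a lookup from a per-pixel class ID to a color. The sketch below is a minimal illustration using plain Python lists; the class IDs, the tiny 3 x 4 mask and the palette are assumptions made for the example, with vehicles blue and pedestrians red as in the text.

```python
from collections import Counter

# Hypothetical class IDs and RGB colors.
CLASS_NAMES = {0: "background", 1: "vehicle", 2: "pedestrian"}
PALETTE = {0: (0, 0, 0), 1: (0, 0, 255), 2: (255, 0, 0)}


def colorize(mask):
    """Turn a 2D grid of class IDs into a 2D grid of RGB tuples."""
    return [[PALETTE[class_id] for class_id in row] for row in mask]


def pixel_counts(mask):
    """Count how many pixels were assigned to each class name."""
    counts = Counter(class_id for row in mask for class_id in row)
    return {CLASS_NAMES[class_id]: n for class_id, n in counts.items()}


# Every pixel carries exactly one class ID; none are left unlabeled.
mask = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [0, 2, 0, 0],
]

colored = colorize(mask)
counts = pixel_counts(mask)
```

Because every pixel gets a class, the per-class counts always add up to the full image size, which makes a simple completeness check.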

Video Tracking

Video tracking requires frame-by-frame annotation to help computer vision systems detect, identify and track objects and living beings accurately. After identifying objects of interest with 2D or 3D bounding boxes or semantic segmentation, annotators can label them by type or attribute.
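A frame-by-frame annotation can be thought of as a list of frames, each holding the objects visible in it, with a shared ID tying the same object together across frames. The sketch below groups such records into per-object trajectories; the `track_id` field, the record layout and the coordinates are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-frame annotations; boxes are (x_min, y_min, x_max, y_max) in pixels.
frames = [
    {"frame": 0, "objects": [
        {"track_id": 1, "label": "pedestrian", "box": (100, 50, 140, 150)},
        {"track_id": 2, "label": "car", "box": (300, 80, 420, 160)},
    ]},
    {"frame": 1, "objects": [
        {"track_id": 1, "label": "pedestrian", "box": (104, 50, 144, 150)},
        {"track_id": 2, "label": "car", "box": (290, 80, 410, 160)},
    ]},
    {"frame": 2, "objects": [
        {"track_id": 1, "label": "pedestrian", "box": (108, 51, 148, 151)},
    ]},
]


def build_tracks(frames):
    """Group box centers by track ID, ordered by frame number."""
    tracks = defaultdict(list)
    for frame in sorted(frames, key=lambda f: f["frame"]):
        for obj in frame["objects"]:
            x_min, y_min, x_max, y_max = obj["box"]
            center = ((x_min + x_max) / 2, (y_min + y_max) / 2)
            tracks[obj["track_id"]].append((frame["frame"], center))
    return dict(tracks)


tracks = build_tracks(frames)
```

Here the car (track 2) has no entry for frame 2, which is how an object leaving the scene would show up in this layout.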

Recommended content


What is Image Annotation?

Image annotation is the process of associating the whole image or parts of an image with a predefined set of labels. Image annotation is frequently used for image classification, object detection, and image segmentation for machine learning and computer vision models.


Pixel-Perfect Annotation for Recognizing Products in Images

A robotics client struggled to get high-quality image data labeled within a 1-pixel tolerance. Sigma.AI’s human-in-the-loop, tech-assisted teams delivered.


Data Preparation 101

Data preparation is an essential first step in any machine learning workflow. This is the process of converting data from a structured or unstructured format into a form that machine learning algorithms can use. It matters because it improves the quality of the data and makes it more consistent.

Let’s Work Together to Build Smarter AI

Whether you need help sourcing and annotating training data at scale, or you need a full-fledged annotation strategy to serve your AI training needs, we can help. Get in touch for more information or to set up your proof-of-concept.