Pixel-perfect annotation for recognizing products in images

A client in the robotics industry struggled to crack the quality issue in a particularly tricky training data project. They needed to find a team that could precisely label products within a 1-pixel tolerance in a complicated image dataset where objects were often crowded together, overlapping, or soft enough to change shape. And they were under considerable time pressure. Sigma.AI’s combination of expert on-staff annotators and experience optimizing tools and workflows allowed the team to scale up quickly and left the client 100% satisfied with pixel-perfect labeled data.

1

Pixel maximum allowed error on annotation of highly complex images

16

Unique product attributes to annotate and categorize

6480

Hours of QA, annotator and project manager time saved through workflow automation

Challenge

  • Curate dataset to select images that most efficiently train the machine learning model
  • Deliver a constant stream of annotated images with outlines no more than 1 pixel off
  • Precisely delineate products that are frequently crowded together and overlapping, often with dozens of objects per image
  • Categorize products by 16 unique attributes

Solution

  • Developed an algorithm to pre-filter the dataset to select images that would contribute the most new information to the machine learning model
  • Consulted with the client on guideline adaptations to provide a consistent, easy-to-use basis for annotator work
  • Quickly sourced, trained and scaled a curated image annotation team
  • Cut turnaround times while maintaining pixel-perfect annotation through custom automation tools and workflow optimizations

Project story

Recognizing objects is a complex process. As humans, we can rely on years of casual practice and implicit understanding when we pick up, say, a mug versus a glass from a table. While we might not remember when we learned the difference, it takes explicit training for an AI to know how to make the distinction — and it needs to happen in a much shorter period of time. 

A client in the robotics industry was developing an AI algorithm for a robotic arm to identify a variety of products in different kinds of packaging, labeling them by size and shape, packaging material, certain packaging features like lids or QR codes, and position. An additional challenge: the products needed to be identifiable even when they were overlapping, only partly visible, or inside soft packaging that could change shape. The annotations had to be pixel-perfect: the client allowed no more than 1 pixel of deviation from an object’s actual shape. Sigma needed to deliver impeccably high quality, and the client needed it delivered fast.

Custom pre-filtering algorithm to choose relevant images automatically

Before labeling could begin, Sigma needed to support the client in filtering their dataset to select the top 30% of images that would be most effective at training the algorithm. This meant identifying which images contained the most new information that would help the AI algorithm decide on differences in packaging, position, size and shape, and learn to identify the same object if it’s in different positions or when its shape changes due to soft packaging. For example, an image of four bottles neatly in a row and five bottles neatly in a row won’t offer enough variation — but if the picture of five bottles has a few fallen over or out of line, this helps the AI learn to identify different positions of the same object.

If the comparison and filtering were done manually, not only would it take an enormous amount of time — it would be impossible to compare every piece of data with every other piece of data, because humans are only able to compare a subset of the data at once. 

The engineering team at Sigma set to work building a custom algorithm to pre-filter the needed 30% of the dataset according to the client’s specifications. They automated the image selection process using signal processing technology, which allowed them to compare all images simultaneously and choose the most “interesting” ones. Signal processing transforms the image so that data comparison is less difficult, reducing the image to the most relevant information for selection. This could involve, for example, removing the object from a complex background so that the object is more easily identifiable. 
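The case study does not describe the actual algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: each image is reduced to a small feature vector (here, hypothetical block-average intensities of a grayscale grid), and a greedy farthest-point pass then keeps the images that differ most from those already selected. The function names, the choice of features, and the selection strategy are all illustrative assumptions, not Sigma's method.

```python
import math

def block_features(image, blocks=2):
    """Reduce a grayscale image (list of rows) to blocks*blocks mean intensities.

    Hypothetical stand-in for a real signal-processing transform.
    """
    height, width = len(image), len(image[0])
    feats = []
    for by in range(blocks):
        for bx in range(blocks):
            rows = range(by * height // blocks, (by + 1) * height // blocks)
            cols = range(bx * width // blocks, (bx + 1) * width // blocks)
            values = [image[y][x] for y in rows for x in cols]
            feats.append(sum(values) / len(values))
    return feats

def select_diverse(features, fraction=0.3):
    """Greedy farthest-point sampling over feature vectors.

    Returns indices of roughly `fraction` of the images, each chosen
    because it is the farthest from everything selected so far.
    """
    count = max(1, round(len(features) * fraction))
    chosen = [0]  # arbitrary seed: start from the first image
    # Distance from every image to its nearest already-chosen image.
    nearest = [math.dist(f, features[0]) for f in features]
    while len(chosen) < count:
        candidate = max(range(len(features)), key=lambda i: nearest[i])
        if nearest[candidate] == 0:
            break  # every remaining image duplicates a chosen one
        chosen.append(candidate)
        nearest = [min(d, math.dist(f, features[candidate]))
                   for d, f in zip(nearest, features)]
    return chosen
```

In this sketch, two near-identical shots of bottles in a neat row end up close together in feature space, so only one of them is likely to be kept, while a shot with fallen bottles sits farther away and is selected early.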

Freeing up annotators to focus on precision work

When annotators label images, they delineate the outline of an object with points and line segments that together form a polygon. Many of the objects were soft or had curved edges, which often required hours of concentrated focus while the annotator built the polygon from minuscule line segments. In other cases, only a small portion of a package was visible because of overlap, and the annotator had to estimate the size, material and position of the rest of the object from context clues, a judgment that only a human annotator can make.
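To make the 1-pixel tolerance concrete, here is a small sketch of how a polygon could be checked against a reference outline. It assumes a set of trusted reference points on the object's true boundary is available (for example, from a reviewer's outline); that input, the function names, and the one-directional distance measure are assumptions for illustration, not a description of Sigma's QA tooling.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance in pixels from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def max_deviation(polygon, reference_points):
    """Largest distance from any reference point to the polygon outline."""
    edges = list(zip(polygon, polygon[1:] + polygon[:1]))  # closed outline
    return max(
        min(point_segment_distance(p, a, b) for a, b in edges)
        for p in reference_points
    )

def within_tolerance(polygon, reference_points, tolerance_px=1.0):
    """True if the outline stays within tolerance_px of every reference point."""
    return max_deviation(polygon, reference_points) <= tolerance_px
```

The check runs in one direction only (reference points to polygon); a stricter version would also measure from the polygon back to the reference outline.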

To meet the client’s required rate of 220 images per day, at the level of accuracy and refinement they needed, the annotation team needed to reach productivity and scale quickly. They first needed to source a group of annotators with the right skills and mindset to do such incredibly detailed labeling. As Sigma maintains a pool of 25,000+ trained and vetted annotators, it was possible to find the right people and build the team on short notice. 

Project managers and the engineering team then needed to find ways to shave off time and complexity on several fronts, freeing up annotators to do precision work without getting caught up in process details.

Guidelines adapted to annotator needs

The client knew what results they needed to train the algorithm they had built, and had drafted precise guidelines based on its requirements. But many of the guidelines needed refinement before annotators could use them effectively. Sigma’s team drew on decades of experience working with annotation teams to help the client adapt the guidelines so that they clearly described the reasoning processes used to label objects. They also replaced specialized technical language with terms that annotators typically use, so that the annotators could implement the guidelines without confusion or delays.

Increasing speed through workflow automations

Drawing on many years of experience with labeling projects that combined fast turnaround times and extremely large scale, the project managers knew that work would be much simpler for their annotators in a customized tool that reduced task complexity and integrated all processes into one streamlined interface. The engineering team customized an open-source tool so that annotators could reference the guidelines and training materials while they annotated. They added purpose-built keyboard shortcuts and the exact labels annotators would need so that they could move faster.

The engineering team also designed the interface so that client inputs would be part of one seamless workflow. First, the client could access the tool to upload packets of image data directly. Even more time savings came from integrating the review and feedback process into the tool: the client received a notification when results were delivered, and could then view the labeled data and leave feedback for annotators directly.

These workflow automations led to a total of 6480 working hours saved between quality assurance, project management and annotation — the equivalent of nine months of work, over the course of a year-long project. 

Every pixel in place

The client was excited about the quality and precision of the annotation, which exceeded that of the competition: they had no further corrections at all to the annotated data after it was delivered. The pace at which Sigma was able to build, train, and scale the team up to production capacity was also a major factor. For the next iteration of the project, they requested not only that Sigma continue to work with them, but also that it double the team’s capacity and test a video pilot.
