Working on a running system is an agile process by necessity: when customers are actively using a system, it's not always possible to stop everything to make major changes. This was the case for a global consumer hardware company working on its search algorithm. Because language and cultural context are alive and constantly changing, the company needed to make continuous improvements to its model. Opting for smaller, incremental changes, it required a team of data annotators who could work closely with its in-house engineers and react quickly as they iterated on the algorithm in real time.
Annotator team flexibility supports iterative model changes
The annotators’ objective: support the engineering team’s efforts to improve the search algorithm by evaluating the relevance of search queries and results. This involved segmenting, classifying and assessing various aspects of queries and their results, often switching between different annotation tasks as the engineers tweaked the model. One day they might be identifying named entities like film titles, people or brands in search queries, another day categorizing search queries into topics and sub-topics, and the next day testing searches to see whether the results were relevant to the query.
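To make the variety of tasks concrete, here is a minimal sketch of how such annotation records might be structured. The task names, labels and fields are hypothetical illustrations, not the client's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class TaskType(Enum):
    NAMED_ENTITIES = "named_entities"    # film titles, people, brands in a query
    TOPIC_CLASSIFICATION = "topics"      # topic and sub-topic of a query
    RELEVANCE_ASSESSMENT = "relevance"   # does a result answer the query?


@dataclass
class EntitySpan:
    start: int   # character offset where the entity begins
    end: int     # character offset where the entity ends
    label: str   # e.g. "FILM_TITLE", "PERSON", "BRAND"


@dataclass
class Annotation:
    query: str
    locale: str                          # e.g. "de-DE", "es-ES", "fr-FR"
    task: TaskType
    entities: List[EntitySpan] = field(default_factory=list)  # for NER tasks
    topic: Optional[str] = None          # for topic classification tasks
    subtopic: Optional[str] = None
    relevance: Optional[str] = None      # for relevance assessment tasks


# The same query can be annotated under several tasks on different days.
record = Annotation(
    query="classic bond film soundtracks",
    locale="fr-FR",
    task=TaskType.NAMED_ENTITIES,
    entities=[EntitySpan(start=8, end=12, label="FILM_FRANCHISE")],
)
```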
Sigma sourced a core team of 128 annotators, all native speakers of German, Spanish and French, who were ready to jump from one task to the next at a moment’s notice. This level of flexibility, depth of understanding of the project requirements, and shared mindset about collaboration and customer orientation can only be achieved with long-term, vetted and trained annotators, an approach Sigma relies on exclusively.
Precise annotation guidelines create consistency
When a search engine seems to know exactly what you want, it’s likely that a human annotator helped train it to understand the nuance and context of your request. But not all questions have a single, clear answer, and broad answers can apply to many questions. So when evaluating search queries and results, annotators also need clear, precise guidelines for making decisions.
For example, what does a query like “bold accessories” mean? Does “bold” refer to color, shape, or size? Bold compared to what standard? What kind of accessories are implied, and how are they worn? These questions are highly cultural and imprecise, so answering them requires not only a human annotator applying their own judgment and context, but also precise guidelines to ensure consistency.
The toughest challenge in this client’s case was aligning the annotators’ decision-making when they needed to identify and classify the topics and sub-topics of queries. Sigma’s solution was to develop a step-by-step reasoning framework in which annotators consistently arrive at the same conclusion by following the same logical steps. The teams applied the same reasoning framework to assessing the relevance of search results.
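As an illustration of the idea, the sketch below encodes a relevance rubric as an ordered sequence of yes/no questions, so that annotators who answer the questions the same way always land on the same label. The questions, labels and scale are invented for the example, not Sigma’s actual framework.

```python
from typing import List, Tuple

# Each step pairs a question the annotator answers with the label assigned on "yes".
# Steps are evaluated strictly in order, which is what keeps decisions consistent
# across annotators.
RELEVANCE_STEPS: List[Tuple[str, str]] = [
    ("Does the result directly satisfy the query's main intent?", "highly relevant"),
    ("Does it cover the topic but miss a key constraint (brand, locale, date)?", "partially relevant"),
    ("Is it on a related topic the user might still find useful?", "marginally relevant"),
]
DEFAULT_LABEL = "not relevant"


def classify(answers: List[bool]) -> str:
    """Return the label of the first question answered 'yes', else the default."""
    for (_question, label), answer in zip(RELEVANCE_STEPS, answers):
        if answer:
            return label
    return DEFAULT_LABEL


# Example: the result covers the topic but is for the wrong brand.
print(classify([False, True, False]))  # -> "partially relevant"
```

The same fixed-order structure can be reused for topic and sub-topic decisions, which is what makes this style of framework portable across tasks.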
Constant feedback keeps annotators fresh (and quality up)
Having such detailed guidelines, and keeping annotators on their toes while switching tasks daily, made the process of training annotators especially demanding. All annotators needed to learn the guidelines and processes for each of the different tasks, and were required to deliver labeled data consistently below a stringent maximum error rate. This was no small feat, given the complexity of interpreting queries and results, especially when tasks changed from day to day.
Meeting these quality standards was only possible through rigorous training and Sigma’s unique approach to constant feedback loops. After initial training, Sigma tested annotators on the guidelines before they started a new task. By constantly monitoring results, Sigma could see when annotators became less attentive, as people naturally do after repeating tasks over long periods. When this happened, they were temporarily removed from the task and refreshed on the guidelines before returning to annotation. In this way, Sigma was able not only to flag issues as they arose, but also to consistently deliver annotated data of the highest quality by building continuous quality improvement into the process.
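A simple version of that feedback loop can be expressed in code. The sketch below, with invented thresholds and field names, flags annotators whose spot-check error rate drifts above an agreed maximum so they can be pulled for a guideline refresher.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

MAX_ERROR_RATE = 0.05     # hypothetical maximum error rate from the guidelines
MIN_REVIEWED_ITEMS = 50   # don't flag anyone on too small a sample


def needs_refresher(reviews: Iterable[Tuple[str, bool]]) -> List[str]:
    """reviews: (annotator_id, is_error) pairs from QA spot checks."""
    errors, totals = defaultdict(int), defaultdict(int)
    for annotator_id, is_error in reviews:
        totals[annotator_id] += 1
        errors[annotator_id] += int(is_error)

    return [
        annotator_id
        for annotator_id, total in totals.items()
        if total >= MIN_REVIEWED_ITEMS and errors[annotator_id] / total > MAX_ERROR_RATE
    ]
```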