How a big tech company is cracking the nuances of human voice with Sigma’s skills-based approach

For a complex and detailed audio description project from a leading global technology company, Sigma’s specialized skills in sourcing, vetting, and training annotators were essential. Faced with a tight two-week window, Sigma successfully built a team and completed the initial phase, which has since scaled to over 20,000 annotations.

  • 3 skills tests created to source annotators
  • 12 annotators sourced and trained in just two weeks
  • 20,000+ high-quality annotations delivered to date

Challenge

  • Sourcing annotators with a specialized skill set for capturing subtle vocal nuances (tone, intonation, intention, emotion).
  • Meeting a demanding two-week timeframe for recruitment, training, and project initiation.
  • Ensuring culturally accurate descriptions by sourcing US-native annotators to mitigate potential biases.

Solution

  • Developed and implemented three specialized skills tests (creativity, paraphrasing, voice characteristics) to evaluate annotators.
  • Successfully sourced and trained a team of 12 highly skilled annotators within a tight two-week timeframe.
  • Delivered over 20,000 high-quality, detailed, and nuanced audio annotations to date.
  • Leveraged existing expertise and established data annotation processes to source, vet, and train candidates.

Project story

For a generative AI project, the client required detailed descriptions of audio recordings that emphasized the unique vocal characteristics of each speaker, including tone, intonation, intention, and emotion. Unlike traditional transcription projects, the focus shifted from content to capturing the subtle nuances of the human voice. The goal was to create precise, varied, and vocabulary-rich descriptions of each individual’s distinctive speech pattern.
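For illustration only, here is a minimal sketch of what a single description record in a project like this might look like. The field names and values below are hypothetical assumptions for the sake of the example, not the client’s actual schema.

```python
# Hypothetical example of one audio-description record.
# Field names and values are illustrative assumptions, not the client's schema.
annotation = {
    "audio_id": "clip_0001",  # assumed identifier for the source recording
    "description": (
        "A warm, unhurried baritone; sentences trail upward at the end, "
        "suggesting gentle curiosity rather than hesitation."
    ),
    "tone": "warm",
    "intonation": "rising at sentence endings",
    "intention": "reassuring the listener",
    "emotion": "calm curiosity",
}

# A reviewer (or project manager) could scan the free-text description
# alongside the structured fields when giving feedback.
print(annotation["description"])
```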

The client presented Sigma with an ambitious two-week deadline that covered recruiting a team of highly skilled candidates, training them, and executing the first phase of the annotation project.

The main challenge was identifying the precise skills required for the project and sourcing the most appropriate individuals for the task through a rigorous recruitment and evaluation process.  

Evaluating gen AI annotation skills with Sigma’s proven method

Sigma’s expertise in scaling human data annotation proved crucial for the success of this project.  

The first step involved defining the specific skills required for annotators:

  • Creativity: Possessing a rich vocabulary and demonstrating the ability to use diverse adjectives to accurately describe vocal nuances. 
  • Paraphrasing expertise: Crafting sentences with diverse structures and vocabulary choices.
  • Sensitivity to vocal nuances: Capturing subtle nuances in voice, intonation, and vocal characteristics.
  • Cultural background: Requiring US-native annotators to guarantee culturally accurate descriptions and mitigate potential biases.

To evaluate candidates, Sigma leveraged two skills tests that had been developed, validated, and implemented in previous projects as part of the company’s upskilling initiatives to prepare for gen AI challenges. One test assessed candidates’ creativity, while the other evaluated their paraphrasing expertise.

However, the unique characteristics of this new project required Sigma to develop an additional test to evaluate an annotator’s capacity to identify multiple nuances in the human voice. This “voice characteristics test” identified the most suitable candidates based on their ability to recognize tone and intonation and to convey emotion through language.

Candidates who performed strongly on the skills tests then proceeded to interviews designed to assess crucial soft skills, particularly communication, which was essential for a project run entirely remotely.

Once contracts were signed, a team of twelve annotators was introduced to the project and provided with detailed annotation guidelines that included examples of the tasks to be performed. However, given the project’s emphasis on creativity and the generation of new information, the guidelines were intentionally concise to encourage independent thought and to let annotators use their intuition and natural language experience to describe vocal characteristics.

To ensure high-quality annotations, the project manager conducted a rigorous review after each round of annotations and provided feedback to each of the team members. 

Delivering over 20K high-quality annotations (and counting!)

  • Sigma was able to source, vet, and train a team of twelve highly skilled annotators in just a few days.
  • After a successful two-week initial phase, the project is still in progress, and Sigma’s annotation team has now completed over 20,000 descriptions.

Sigma’s existing experience and processes were instrumental in rapidly identifying the most suitable candidates for the project, evaluating their skill sets, providing specific training, and quickly scaling up the project. 

Generative AI projects demand innovation, creativity, and an ability to embrace novel challenges. This project was successful because of Sigma’s proven expertise in similar projects, the right mix of skilled people, well-defined processes, and strong ethical values. Working together, we were able to deliver on time and on budget while significantly accelerating a unique new project.
