How a big tech company is cracking the nuances of human voice with Sigma’s skills-based approach

For a complex and detailed audio description project from a leading global technology company, Sigma’s specialized skills in sourcing, vetting, and training annotators were essential. Faced with a tight two-week window, Sigma successfully built a team and completed the initial phase, which has since scaled to over 20,000 annotations.

  • 3 skills tests created to source annotators
  • 12 annotators sourced and trained in just two weeks
  • 20,000+ high-quality annotations delivered to date

Challenge

  • Sourcing annotators with a specialized skill set for capturing subtle vocal nuances (tone, intonation, intention, emotion).
  • Meeting a demanding two-week timeframe for recruitment, training, and project initiation.
  • Ensuring culturally accurate descriptions by sourcing US-native annotators to mitigate potential biases.

Solution

  • Developed and implemented three specialized skills tests (creativity, paraphrasing, voice characteristics) to evaluate annotators.
  • Successfully sourced and trained a team of 12 highly skilled annotators within a tight two-week timeframe.
  • Delivered over 20,000 high-quality, detailed, and nuanced audio annotations to date.
  • Leveraged existing expertise and established data annotation processes to source, vet, and train candidates.

Project story

For a generative AI project, the client required detailed descriptions of audio recordings that emphasized the unique vocal characteristics of each speaker, including tone, intonation, intention, and emotion. Unlike traditional transcription projects, the focus shifted from content to capturing the subtle nuances of the human voice. The goal was to create precise, varied, and vocabulary-rich descriptions of each individual’s distinctive speech pattern.
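For illustration only, here is a minimal sketch of what a single description record in a project like this might look like. The field names and values below are hypothetical assumptions for the sake of the example, not the client’s actual schema.

```python
# Hypothetical example of one audio-description record.
# Field names and values are illustrative assumptions, not the client's schema.
annotation = {
    "audio_id": "clip_0001",  # assumed identifier for the source recording
    "description": (
        "A warm, unhurried baritone; sentences trail upward at the end, "
        "suggesting gentle curiosity rather than hesitation."
    ),
    "tone": "warm",
    "intonation": "rising at sentence endings",
    "intention": "reassuring the listener",
    "emotion": "calm curiosity",
}

# A reviewer (or project manager) could scan the free-text description
# alongside the structured fields when giving feedback.
print(annotation["description"])
```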

The client presented Sigma with an ambitious two-week deadline that covered recruiting a team of highly skilled candidates, training them, and executing the first phase of the annotation project.

The main challenge was identifying the precise skills required for the project and sourcing the most appropriate individuals for the task through a rigorous recruitment and evaluation process.  

Evaluating gen AI annotation skills with Sigma’s proven method

Sigma’s expertise in scaling human data annotation proved crucial for the success of this project.  

The first step involved defining the specific skills required for annotators:

  • Creativity: Possessing a rich vocabulary and demonstrating the ability to use diverse adjectives to accurately describe vocal nuances. 
  • Paraphrasing expertise: Crafting sentences with diverse structures and vocabulary choices.
  • Sensitivity to vocal nuances: Capturing subtle nuances in voice, intonation, and vocal characteristics.
  • Cultural background: Requiring US-native annotators to guarantee culturally accurate descriptions and mitigate potential biases.

To evaluate candidates, Sigma leveraged two skills tests that had been developed, validated, and implemented in previous projects as part of the company’s upskilling initiatives to prepare for gen AI challenges. One test assessed candidates’ creativity, while the other evaluated their paraphrasing expertise.

However, the unique characteristics of this new project required Sigma to develop an additional test to evaluate an annotator’s capacity to identify multiple nuances in the human voice. This “voice characteristics test” identified the most suitable candidates based on their ability to recognize tone and intonation and to convey emotion through language.

Candidates who performed strongly on the skills tests then proceeded to interviews designed to assess crucial soft skills, particularly communication, which was essential for a project run entirely remotely.

Once contracts were signed, a team of twelve annotators was introduced to the project and provided with detailed annotation guidelines that included examples of the tasks to be performed. However, given the project’s emphasis on creativity and the generation of new information, the guidelines were intentionally concise to encourage independent thought and to let annotators use their intuition and natural language experience to describe vocal characteristics.

To ensure high-quality annotations, the project manager conducted a rigorous review after each round of annotations and provided feedback to each of the team members. 

Delivering over 20K high-quality annotations (and counting!)

  • Sigma was able to source, vet, and train a team of twelve highly skilled annotators in just a few days.
  • After a successful two-week initial phase, the project is still in progress, and Sigma’s annotation team has now completed over 20,000 descriptions.

Sigma’s existing experience and processes were instrumental in rapidly identifying the most suitable candidates for the project, evaluating their skill sets, providing specific training, and quickly scaling up the project. 

Generative AI projects demand innovation, creativity, and an ability to embrace novel challenges. This project was successful because of Sigma’s proven expertise in similar projects, the right mix of skilled people, well-defined processes, and strong ethical values. Working together, we were able to deliver on time and on budget while significantly accelerating a unique new project.
