Collecting and facilitating natural conversations in specific dialects

Recording two people casually playing a game might not sound like a challenge, but when a Big Five global technology service provider needed over 1000 conversations between unique pairs of speakers of very specific regional dialects, delivered in just two months, figuring out how to facilitate the recordings became the name of the game. Sigma.AI pivoted quickly and built a custom, automated game coordination tool to match pairs efficiently within that short timeframe.

1160

Recorded conversations of unique pairs of regional dialect speakers

25%

Project management time saved through custom tooling

2

Months turnaround time for dataset delivery

Challenge

  • Record 1000+ natural, fluid conversations between two participants
  • Source participants who are native speakers of highly specific regional dialects and who currently live in that region and have done so for at least 10 years
  • Recruit a large number of participants of diverse ages and genders
  • Client provided a set of games as prompts, e.g. describing an image or a word
  • Pairs of participants could not repeat — each of the conversations needed to be unique and between different people

Solution

  • Tapped into existing pool of 25,000+ annotators and linguists to quickly source participants for specific regional dialects
  • Trained participants for the project, including through video tutorials integrated into the user interface
  • Advised client on optimal distribution of not only gender and age but also vocal qualities and speed of speech within dataset
  • Unified several user interfaces into one to help foster more natural conversations
  • Optimized on-the-ground participant workflows
  • Created a custom participant-matching tool to maximize the number of unique pairs that could be scheduled based on participants’ availability

Project story

When a major global technology player was working on a natural language understanding project, they needed recordings of real, fluid conversations: more than 1000 of them, delivered within just two months. Their concept was to give speakers games to play as prompts, in which participants would have to describe an image or a word to their conversation partner. This way, the conversation would have a natural flow and wouldn’t be limited by a script.

Finding the optimal balance of diverse speakers

The first challenge was sourcing a large number of speakers of very specific dialects: not only native speakers, but native speakers who had lived for at least 10 years in a region where that dialect is spoken. The recordings also had to cover players of a wide range of ages and genders.

Sigma was able to tap into their existing pool of over 25,000 annotators and linguists to quickly source speakers of the requested dialects with the required local experience. But a few questions about the dataset remained: with deadlines so tight, how can the dataset include the right balance of ages and genders needed to train the algorithm for all possible circumstances? Sigma consulted with the client on the optimal gender and age balance within the dataset, maximizing the diversity of the speakers to train the AI with the broadest dataset possible within real-world practical limitations. Additionally, thanks to their deep experience in speech and language AI, they could advise the client on including factors such as diverse vocal pitch and timbre as well as different speaking speeds in the dataset. 

Project and product teams collaborate to deliver at scale

When Sigma tackled this project, they knew optimizing tools and processes within the project workflow would be the key to delivering at scale, and on time. Creating fluid workflows took concerted effort from project managers and product development. Sigma’s approach of customizing tools, teams and processes to unique project requirements meant they were prepared to adapt all aspects of the project to satisfy the client’s needs. 

Project managers, in constant feedback loops with the client and participants, were able to identify potential hurdles often before they happened, raising them to the product team where appropriate. When project managers saw that speakers would have to switch between three different user interfaces to play the game, record themselves, and review participation rules and training, the product development team worked fast to integrate everything into one seamless user interface. Project managers created video tutorials to train participants and uploaded them directly into the interface. These process and technical adaptations made participation much easier for the speakers, leading to quicker recording set-up and more natural conversations between participants. 

Automated pair matching saves 25% project management time

A bigger challenge emerged early in the data collection process — how do you coordinate gameplay between a large number of participants with different availability and schedules when each conversation needs to be between new, unique pairs? And how do you motivate participants to work on the project when each conversation is only 45 minutes? 

The most essential piece of the optimization puzzle was automating the process of matching unique pairs of players together. The product teams created a custom interface to match participants’ availability and automatically send them calendar invitations, optimizing the most labor-intensive part of the project management process and saving 25% of the total project management time. 

This also solved the question of motivation: participants could set their own schedules, lose no time on overhead and coordination, and be sure that their next conversation would happen as soon as possible. It also greatly improved the speakers’ satisfaction with the project and their continued participation, because their game partners were more likely to show up consistently and on time thanks to the efficient, automated meeting planner.
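
The case study does not describe how the matching tool works internally, so the following is only a minimal sketch of the general idea: pair participants who share a free time slot while never repeating a pair. The function name `schedule_sessions`, the participant names, the slot labels, and the greedy strategy are all assumptions made for illustration, and the calendar invitation step is left out entirely.

```python
from itertools import combinations


def schedule_sessions(availability, already_paired=frozenset()):
    """Greedily book sessions so that no pair of participants repeats.

    availability: dict mapping a participant id to a set of free slot labels.
    already_paired: frozensets of two ids that have already recorded together.
    Returns a list of (slot, participant_a, participant_b) tuples.
    (Hypothetical helper; not the tool described in the case study.)
    """
    used_pairs = set(already_paired)
    busy = set()  # (participant, slot) combinations that are already booked
    sessions = []
    # Visit every possible pair once, in a deterministic order.
    for a, b in combinations(sorted(availability), 2):
        pair = frozenset((a, b))
        if pair in used_pairs:
            continue  # this pair has already had a conversation
        # Book the earliest shared slot in which both people are still free.
        for slot in sorted(availability[a] & availability[b]):
            if (a, slot) not in busy and (b, slot) not in busy:
                sessions.append((slot, a, b))
                busy.update({(a, slot), (b, slot)})
                used_pairs.add(pair)
                break
    return sessions


# Illustrative data only: three participants with overlapping free slots.
availability = {
    "ana": {"mon-10", "tue-14"},
    "ben": {"mon-10", "wed-09"},
    "cruz": {"tue-14", "wed-09"},
}
sessions = schedule_sessions(availability)
for slot, a, b in sessions:
    print(f"{slot}: {a} + {b}")
```

A greedy pass like this is simple to reason about but does not guarantee the maximum possible number of sessions; a production scheduler would more likely treat the problem as a matching or constraint optimization and would also handle cancellations and rescheduling.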
