Our client, a global leader in AI development, faced a major challenge: ensuring their AI models delivered answers that were not only accurate but also backed by credible sources and culturally relevant.
The outputs of generative AI and large language models are complex: often subjective, open to multiple interpretations, and deeply influenced by cultural nuance. Addressing these challenges requires expert human-in-the-loop validation, coordinated by a partner who can source and scale quickly.
Sigma was uniquely equipped to find and train skilled annotators and had proven processes in place to ensure every output was factual and backed by credible sources.
Sigma’s proactive approach to annotation
At Sigma, we constantly strengthen our capabilities and expertise to deliver high-quality data with both speed and precision. Instead of building from scratch, we leverage our deep expertise and lessons learned from our in-house initiatives to provide rapid solutions to our clients’ most challenging requests.
In this case, we were able to jump-start the project as a result of our internal efforts:
- We had already designed and executed three specialized AI evaluation projects, refining our methodologies for complex tasks such as the following (a simple illustrative schema appears after this list):
  - Attribution: Assessing whether AI answers were fully supported by the provided reference texts.
  - Accuracy scoring: Evaluating AI sentences against reference passages for precision and cultural relevance.
  - Factual rewriting: Reviewing AI-generated sentences against reference passages and rewriting them to ensure factual accuracy, clarity, and grammatical correctness.
- We also developed six customized gen AI skill tests to source and vet annotators with the right abilities for any given assignment, covering skills such as language logic, research, and rephrasing.
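To make the three task types concrete, here is a minimal sketch of how a single annotation record could be structured. The schema, field names, and rating scale are illustrative assumptions for this article, not Sigma's or the client's actual tooling.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AttributionLabel(Enum):
    """Whether an AI answer is supported by the provided reference texts."""
    FULLY_SUPPORTED = "fully_supported"
    PARTIALLY_SUPPORTED = "partially_supported"
    NOT_SUPPORTED = "not_supported"


@dataclass
class EvaluationRecord:
    """One annotator judgment covering the three task types above (illustrative only)."""
    ai_sentence: str                          # model output under review
    reference_passage: str                    # source text the answer should rest on
    attribution: AttributionLabel             # attribution task
    accuracy_score: int                       # accuracy scoring, e.g. 1 (poor) to 5 (precise)
    culturally_relevant: bool                 # cultural-relevance judgment
    rewritten_sentence: Optional[str] = None  # factual rewriting, when the original needs a fix
```

Keeping the attribution label, accuracy score, and rewrite tied to the same reference passage in one record makes it easier to check consistency across annotators.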
Implementing a human-in-the-loop validation process
To ensure high-quality, trustworthy AI outputs, we implemented a human-in-the-loop evaluation process tailored to the project’s unique needs.
This multi-step approach included:
- Targeted skill testing: Using our customized gen AI skill tests, we identified annotators with the precise capabilities the project required.
- Refining annotation guidelines: We enhanced the original instructions by incorporating case-specific guidance and examples. This helped annotators capture subtle nuances in AI-generated responses, prevent AI bias, and maintain consistency across all evaluations.
- Research, validate, and rewrite: Our annotators followed a structured workflow: they generated research questions, conducted fact-checking against reliable sources, rated the factual accuracy of the original answer, and rewrote answers that fell short (a simplified sketch of this workflow appears below).
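As a rough illustration of that workflow, the research, validation, and rewrite steps could be captured as below. All names, the rating scale, and the rewrite threshold are hypothetical; they stand in for whatever platform and rubric a given project uses.

```python
from dataclasses import dataclass, field
from typing import List, Optional

REWRITE_THRESHOLD = 4  # hypothetical: answers rated below this bar get rewritten


@dataclass
class FactCheck:
    """One research question plus the evidence an annotator gathered for it."""
    question: str   # research question derived from a claim in the AI answer
    source: str     # reliable source consulted, e.g. a URL or publication
    finding: str    # what the source says about the claim


@dataclass
class ValidationResult:
    """Outcome of the research, validate, and rewrite workflow for one answer."""
    original_answer: str
    fact_checks: List[FactCheck] = field(default_factory=list)
    factual_accuracy_rating: Optional[int] = None  # e.g. 1 (unsupported) to 5 (fully verified)
    rewritten_answer: Optional[str] = None         # filled in when the original falls short

    def needs_rewrite(self) -> bool:
        """Flag the answer for rewriting when its rating misses the (hypothetical) bar."""
        rating = self.factual_accuracy_rating
        return rating is not None and rating < REWRITE_THRESHOLD
```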
Delivering high-quality outcomes at every stage
High-quality data is non-negotiable for building responsible AI.
Rooted in years of prior experience and proven workflows, Sigma has designed a rigorous evaluation process that combines highly trained human annotators, refined guidelines, and a strong fact-checking framework. The result? High-quality, reliable outcomes delivered with speed and precision. If you’d like to learn more, talk to an expert at Sigma.