AI and machine learning algorithms are built into an increasing number of products — all of which need training data to learn to solve the problems they’re designed for. For many applications, that means collecting and processing user data.
User data is also an extremely valuable resource. First-party data collected while someone uses a product, such as behavioral data, chat content, or search queries, is unique, takes considerable time to collect, and is extremely useful for understanding customers. It can be a significant competitive advantage, especially when it’s used to train an AI, for example to interact better with users and continuously improve the product itself.
All of these were reasons for a major consumer hardware manufacturer to ask Sigma to design and implement secure facilities and procedures and provide annotations of their users’ data exclusively onsite.
Dedicated Annotators Provide Better Security
Sigma was already the client’s annotation provider of choice for this project because of their focus on quality, quick ramp-up times, and flexibility in handling shifting project requirements. Another aspect of Sigma’s approach was crucial for this particularly sensitive project: their policy of vetting, training and hiring annotators directly, and never crowdsourcing.
Crowdsourced annotation, apart from raising data quality concerns, is often done remotely, which greatly complicates data security. Even with the best cybersecurity measures in place, there is no way to control physical access to the data when teams are remote: crowdsourced annotators working from personal machines could, for example, easily screenshot data or accidentally show sensitive data to someone else in the room. Because Sigma’s annotators are experienced with sensitive data and can work exclusively from secure facilities, Sigma can implement physical security measures that give it tight control over access to client data.
Treating Client Data with Care and GDPR Compliance
The annotators would handle personal data that fell under GDPR restrictions, meaning data that’s unique to the individual user and could theoretically be used to identify that person and their intentions. As a 100% GDPR-compliant organization, Sigma understood what it took to ensure the highest level of data privacy throughout the entire data annotation process.
To maintain GDPR compliance, Sigma employs internal experts in data protection, GDPR, and cyber and physical security. The company also undergoes annual third-party reviews and audits, and obtains external expert advice from one of the top 5 global consultancy firms. All employees receive annual training on GDPR and other security protocols, so that they not only comply but also understand why measures are implemented and can promote their correct application within the organization. Beyond GDPR, Sigma is also ISO 27001 and SOC 2 Type II certified.
Designing Physical Security
In only 10 weeks, the Sigma team designed and rolled out a dedicated secure facility with secure procedures, from consulting the client, to design and planning, to operation. In just 6 more weeks, they had a team of over 400 annotators in all 8 languages — hired, trained and working from the new facility. The security protocols Sigma implemented exceeded the client’s original specifications, and included several additional measures that Sigma recommended to the client based on their experience with sensitive projects.
Only annotators and project managers working on the project were able to enter the building, and access to the facility was protected in multiple ways. Security staff at the building entrance, present 24/7, ensured that only authorized personnel could proceed to the secure area. Identity was confirmed with two-factor authentication that combined employee ID badges and biometrics. The secure area and office entrance were under constant video surveillance, with recordings saved locally for the maximum legal duration. The building was completely locked down outside of office hours.
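For readers who think in code, the entry rules described above can be summarized as a small decision function. This is only an illustrative sketch, not Sigma’s actual access-control system: the names `EntryAttempt` and `may_enter`, the roster of employee IDs, and the office hours are all hypothetical, since the case study does not specify them.

```python
from dataclasses import dataclass
from datetime import datetime, time

# Hypothetical office hours; the case study does not state the real schedule.
OFFICE_OPEN = time(8, 0)
OFFICE_CLOSE = time(18, 0)


@dataclass(frozen=True)
class EntryAttempt:
    """One attempt to enter the secure area (illustrative model only)."""
    employee_id: str
    badge_valid: bool        # factor 1: employee ID badge
    biometric_match: bool    # factor 2: biometric check
    timestamp: datetime


def may_enter(attempt: EntryAttempt, project_roster: set) -> bool:
    """Return True only if every rule described in the text is satisfied."""
    # Only people assigned to the project may enter at all.
    if attempt.employee_id not in project_roster:
        return False
    # The building is locked down outside office hours.
    if not (OFFICE_OPEN <= attempt.timestamp.time() < OFFICE_CLOSE):
        return False
    # Two-factor authentication: both the badge and the biometric must pass.
    return attempt.badge_valid and attempt.biometric_match
```

The point of the sketch is that the checks are combined with a logical AND: failing any single rule (not on the project, outside hours, or a missing factor) is enough to deny entry.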
Sigma designed additional physical security measures to prevent data from leaving the facility. Annotators could not bring any personal items or devices into the secure area, and were provided with lockers in front of the entrance. Security guards used metal detectors at the entrance to confirm that no objects had been forgotten. Similarly, security checked annotators on exit to confirm that nothing was removed from the secure area. All emergency exit doors were alarmed.
Security Mindset from Start to Finish
Respect for client confidentiality and a security mindset towards personal data extended into every aspect of the project’s implementation, from employee onboarding to data delivery. Sigma incorporates a number of agreements into its employee hiring procedures as standard, including a code of ethics, acceptable use policies and non-disclosure agreements. For this project, and all others that involve confidential or personal data, teams carried out all of the mandatory training courses within the secure zone. Annotators were tested and reviewed on their understanding of and adherence to the security protocols, and reminders were posted around the facility.
No user data, project data, guidelines, or even annotator communications were accessible outside of the facility. Cybersecurity protocols included restricting internet access and using proprietary chat tools that only functioned from the local machines in the facility. These measures were on top of periodic penetration tests and external security audits that Sigma undergoes as a matter of course.
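The case study does not describe how the chat tools were restricted to facility machines. One common way to achieve this kind of restriction is a network allowlist, where a service accepts connections only from addresses inside the facility’s own network. The sketch below assumes that approach; the subnet `10.20.0.0/16` and the function name `is_facility_client` are hypothetical and chosen purely for illustration.

```python
import ipaddress

# Hypothetical facility subnet; the real network layout is not given in the source.
FACILITY_NETWORKS = [ipaddress.ip_network("10.20.0.0/16")]


def is_facility_client(client_ip: str) -> bool:
    """Return True if the client address belongs to an allowlisted facility network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed addresses are rejected rather than trusted.
        return False
    return any(addr in network for network in FACILITY_NETWORKS)
```

A service using a check like this would refuse any request for which `is_facility_client` returns `False`, so a tool reachable only through it would not function from outside the facility network.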
After an initial pilot phase, the client doubled the volume of work and number of annotators — and continued to work with Sigma for multiple new secure annotation projects.