As a white male in America with no discernible regional accent, I can simply assume that modern consumer technologies (like virtual assistants and the camera on my phone) will work flawlessly right out of the box. I assume this because, well, they do. That’s because the nerds who design and program these devices look and sound like me, just a little whiter. People who don’t fit that profile do not enjoy that same privilege.
The chatbots and visual AIs of tomorrow will only exacerbate this bias unless steps are taken today to establish a benchmark standard of fair and equitable behavior for these systems. To address that issue, Meta AI researchers developed and released the Casual Conversations dataset, designed to “help researchers assess the accuracy of their computer vision and audio models across a diverse set of ages, genders, apparent skin tones, and ambient lighting conditions.” On Thursday, the company introduced Casual Conversations v2, which promises even more granular classification categories than its predecessor.
CC’s original dataset included 45,000 videos from more than 3,000 paid subjects, spanning age, gender, apparent skin tone, and lighting conditions. These videos are designed for use by other AI researchers, specifically those working with generative AI like ChatGPT or visual AI like that used in social media filters and facial recognition features, to help them ensure that their creations behave the same way whether the user looks like Anya Taylor-Joy or Lupita Nyong’o, whether they sound like Colin Firth or Colin Quinn.
Since Casual Conversations first debuted two years ago, Meta has worked “in consultation with internal experts in fields such as civil rights,” according to Thursday’s announcement, to expand and improve the dataset. Professor Pascale Fung, Director of the AI Research Center, as well as other researchers from the Hong Kong University of Science and Technology, participated in the literature review of government and industry data to establish the new annotation categories.
Version 2 now includes 11 categories (seven self-reported and four investigator-annotated) and 26,467 video monologues recorded by nearly 5,600 subjects in seven countries: Brazil, India, Indonesia, Mexico, Vietnam, the Philippines, and the US. Although there are fewer individual videos in the new dataset, they are far more heavily annotated. As Meta notes, the first iteration only had a handful of categories: “age, three gender subcategories (female, male, and other), apparent skin tone, and ambient lighting,” according to Thursday’s blog post.
“To increase non-discrimination, fairness, and safety in AI, it is important to have inclusive data and diversity within data categories so that researchers can better assess how well a specific model or AI-powered product works for different demographic groups,” Roy Austin, Meta’s vice president and deputy general counsel for civil rights, said in the statement. “This dataset has an important role to play in ensuring that the technology we build takes equity into account for everyone from the start.”
As with most of its public AI research to date, Meta is releasing Casual Conversations v2 as an open source dataset for anyone to use and extend, perhaps to include markers like “disability, accent, dialect, location and recording settings,” as the company hinted on Thursday.