Appen declined to give an attributable comment.

“If we suspect a user has violated the User Agreement, Toloka will perform an identity check and request a photo ID and a photo of the user holding the ID,” Geo Dzhikaev, head of Toloka operations, says.

Driven by a rush into AI, the global data labeling and collection industry is expected to grow to over $17.1 billion by 2030, according to Grand View Research, a market research and consulting company. Crowdsourcing platforms such as Toloka, Appen, Clickworker, Teemwork.AI, and OneForma connect millions of remote gig workers in the global south to tech companies located in Silicon Valley. Platforms post micro-tasks from their tech clients, which have included Amazon, Microsoft Azure, Salesforce, Google, Nvidia, Boeing, and Adobe. Many platforms also partner with Microsoft’s own data services platform, the Universal Human Relevance System (UHRS).

These workers are predominantly based in East Africa, Venezuela, Pakistan, India, and the Philippines—though there are even workers in refugee camps, who label, evaluate, and generate data. Workers are paid per task, with remuneration ranging from a cent to a few dollars—although the upper end is considered something of a rare gem, workers say. “The nature of the work often feels like digital servitude—but it’s a necessity for earning a livelihood,” says Hassan, who also now works for Clickworker and Appen.

Sometimes, workers are asked to upload audio, images, and videos, which contribute to the data sets used to train AI. Workers typically don’t know exactly how their submissions will be processed, but these can be pretty personal: On Clickworker’s worker jobs tab, one task states: “Show us you baby/child! Help to teach AI by taking 5 photos of your baby/child!” for €2 ($2.15). The next says: “Let your minor (aged 13-17) take part in an interesting selfie project!”

Some tasks involve content moderation—helping AI distinguish between innocent content and that which contains violence, hate speech, or adult imagery. Hassan shared screen recordings of tasks available the day he spoke with WIRED. One UHRS task asked him to identify “fuck,” “c**t,” “dick,” and “bitch” from a body of text. For Toloka, he was shown pages upon pages of partially naked bodies, including sexualized images, lingerie ads, an exposed sculpture, and even a nude body from a Renaissance-style painting. The task? Decipher the adult from the benign, to help the algorithm distinguish between salacious and permissible torsos.
