The Data Factories and Virtual Sweatshops that make the Internet run smoothly

The Rundown: 

We’re in this legal limbo where Big Tech companies can employ people for mere pennies to qualify datasets and improve their AI – effectively creating Data Factories and Virtual Sweatshops. This cheap labor helps the Internet and AI research function. But the millions of workers in this unregulated territory are being sacrificed for the greater whole.

If history class taught me anything about the Industrial Revolution, it was that the regulations on working conditions were non-existent. No age limit. No hour limit. No minimum wage. Factory jobs were not sweet, but if it was your only way to survive, then you accepted it.

Unfortunately, it appears that history is repeating itself.

Hundreds of thousands of people are finding themselves in this exact same position, except with a digital twist. The amount of legal ambiguity in the data industry is absurd. Thus, we have unregulated, virtual sweatshops of the Internet.

Data Factories

The biggest constraint on improving artificial intelligence is obtaining a steady stream of quality data that is useful to the given algorithm. In other words, artificial intelligence doesn’t know what it doesn’t know. And it’s up to humans to tell it what it needs to know.

For a task as seemingly simple to us humans as identifying road signs, an algorithm needs tens of millions of labeled images (covering the numerous visual scenarios) before it can successfully identify a sign. That’s a lot of annotated photographs, which is why there’s an entire economy around cleaning and preparing data for AI.

Mechanical Turk is one of the websites that offer people pennies to tag photos, complete surveys, and perform other HITs (Human Intelligence Tasks) that help AI systems learn. Organizations such as Twitter, LinkedIn, Dropbox, and DARPA post their tasks on Mechanical Turk at insanely cheap labor rates.

The Atlantic published an exposé on the lifestyles of Mechanical Turk workers, and it’s just horrendous. We’re talking about completing hundreds or thousands of these micro-tasks daily just to scrape together around $6.50 per hour. It really is the assembly line of the 21st century. And there’s no regulation:

The tasks that pay the best and take the least time get snapped up quickly by workers, so Erica must monitor the site closely, waiting to grab them. She doesn’t get paid for that time looking, or for the time she spends, say, getting a glass of water or going to the bathroom. Sometimes, she has to “return” tasks—which means sending them back to the requester, usually because the directions are unclear—after she’s already spent precious time on them.

Alana Semuels, The Atlantic

It’s really a bad situation for the workers in the developed world where the cost of living far exceeds what they can earn. This is why companies like Samasource outsource some of these data prep jobs to places like Kenya, where a few dollars a day is a somewhat respectable wage.

Another role that falls into this Data Factory category, and perhaps the worst part of the Internet’s underbelly, is content moderation.

The Internet Cleaners

The Cleaners is a PBS documentary about the thousands of Internet and social media content moderators who clean up the graphic junk that pollutes the Internet. Beheadings, sexual abuse, hate crimes, child exploitation, terror threats: these are horrible things that we never have to come across in our daily searches, thanks to the moderators who remove them.

As if the graphic nature of the job weren’t enough, they must review 25,000 pictures a day just to meet their requirements. If they’re working eight-hour days (which I highly doubt), that equates to reviewing an image or video every 1.15 seconds. For a twelve-hour workday, that’s an image every 1.73 seconds. They are literally tasked with moderating free speech in the free world in mere seconds. That’s a lot of responsibility.
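
For anyone who wants to check the math, here is a minimal sketch of the calculation. It assumes the 25,000-item daily quota cited above and an uninterrupted shift with no breaks; the function name and shift lengths are just illustrative.

```python
# Average time available per item under a fixed daily quota.
# Assumption: the quota of 25,000 items per day quoted in the article,
# spread evenly over a shift with no breaks.
DAILY_QUOTA = 25_000


def seconds_per_item(shift_hours: float, quota: int = DAILY_QUOTA) -> float:
    """Return the average number of seconds a moderator has for each item."""
    return shift_hours * 3600 / quota


for hours in (8, 12):
    print(f"{hours}-hour shift: {seconds_per_item(hours):.2f} seconds per item")
```

Any real shift includes breaks, so the time actually available per item would be even lower than what this prints.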

Not to mention the horrible repercussions. Imagine seeing violent crimes and disturbing images of children every day for six straight years. That’ll change you as a person. It’s truly a selfless job that benefits the whole while weakening the few.

Whether you’re tagging images, moderating content, or taking surveys, these are the Data Factories of the 21st century. They are unregulated, ridden with horrible working conditions, and necessary for the functioning of digital society.

Our dependence on data is not slowing down anytime soon. So far, only the most technologically savvy companies have adopted it in a big way, and already millions of people work in data handling. Clickworker, another one of these micro-tasking companies, claims 1.3 million clickworkers. That’s just one of many platforms providing these micro-tasks.

In rural towns where the economies are stricken, in overpopulated cities where the opportunities are scarce, people with few options to earn a living will continue filing into these Data Factories.

It’s hard not to be angry at this outlook. It’s hard not to lose sleep over this livelihood. That’s why it’s best to be thankful for their contribution to making your life better.