
We’re on the brink of bringing intelligent automation to nearly every facet of our lives. However, baked into many of the algorithms is the ugly racial bias that has plagued humanity forever. “If you don’t fix the bias, then you’re automating the bias,” said Alexandria Ocasio-Cortez, on the subject of an IBM facial recognition project that would let police search for suspects by skin color, among other biased AI projects.

I think it’s important that a budding politician is getting behind this narrative and I’m sure other politicians are going to follow suit. However, if there’s anything we’ve learned thus far, it’s that politics and technology exist in entirely different leagues.

Technology is like the first kid in class to hit puberty, while politics is the late bloomer. For instance, remember the time that the US Government thought Microsoft held a monopoly on the entire future of consumer technology? Today, you won’t even find Microsoft among the ranks of the Top 5 consumer technology brands.

It goes to show that we cannot (and should not) rely on the government to step in and regulate this issue. They operate on different planes of understanding. Fortunately, we’re in the very early stages of AI and still have time to mitigate the issue.

Realistically, we can count on a few hands the companies and services we use that employ AI which may be corrupted by racial bias – YouTube, Facebook, content platforms like Spotify and Netflix, along with a few rare use cases in healthcare and criminal justice. We are still in the early-adopter stage.

However, Kevin Kelly famously said, “The business plans of the next 10,000 startups are easy to forecast: take X and add AI.” We’re going to see a lot more AI companies spring up over the next ten years. This means that the sooner we can nip this problem in the bud, the less cleanup we’ll have to do down the road. I believe one of the many potential solutions comes in the form of standardizing some of the data sets that people train their AI on.

Diverse Data Brokering

As you know, AI is only as good as the data it learns from. And since many of the data sets out there today reflect long-standing human biases, we’re getting biased AI. There’s a huge market opportunity for a company to build diverse data sets for developers to train their algorithms on. This sort of data brokering has been done before, but not with the level of polish and respect we would like to see.
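To make the “biased data in, biased AI out” point concrete, here’s a minimal sketch of a dataset audit. The group names and counts are entirely hypothetical; the point is just that a skewed training set can be detected with a few lines of code before any model is trained on it.

```python
from collections import Counter

def representation_rates(labels):
    """Return each group's share of the dataset's samples."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}

# Hypothetical demographic labels attached to 1,000 training samples.
training_labels = ["group_a"] * 800 + ["group_b"] * 150 + ["group_c"] * 50

rates = representation_rates(training_labels)
# group_a makes up 80% of the data; a model trained on this set
# will be tuned far better for group_a than for the other groups.
print(rates)
```

An audit like this is cheap to run, which is part of the argument: the bias is often measurable long before it ships.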

For instance, USDate is a company that sells online dating profile data at massive scale to people who are looking to start their own online dating sites (or just want a ton of data). Factual and BDEX are two other public marketplaces to buy consumer data ranging from location data to purchasing history. It’s pretty shady: in some cases you can buy hundreds of thousands of data profiles for less than a hundred bucks, and you have no idea how much of the data is even real.

Unfortunately, this is the current landscape of public data brokering. It’s an unregulated, back-alley type business that is ripe for a change.

Proprietary data is going to be one of the most valuable assets in any company’s future – whether they are your local dentist’s office or a 10,000-member marketing firm. Therefore, there will need to be an influx of data brokering sites that are far more official and trusted, with much better data sets.

Mattermark was one of the companies headed in the right direction, until they were acquired by FullContact and their data was privatized. If you aren’t familiar, Mattermark provided in-depth data reports and trends on the private startup industry. It quickly became an enormously valuable resource and was even dubbed the “Bloomberg Terminal for Startups.”

What’s interesting about Mattermark is that they created data where data didn’t really exist. They had to dig around, piece together resources, and basically create their own data.

This is how I see the next generation of data brokering sites overcoming the racial bias in data sets. They’re going to need to dig around and create their own inclusive datasets to ensure that the next class of AI is built upon data that lacks bias.
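As a rough illustration of what “fixing” an unbalanced dataset can look like, here’s a sketch of naive oversampling: duplicating samples from underrepresented groups until every group matches the largest one. The data is hypothetical, and oversampling is only a crude stand-in for what the article actually argues for – going out and collecting genuinely representative data – but it shows the mechanics of rebalancing.

```python
import random
from collections import Counter, defaultdict

def oversample_to_balance(samples, seed=0):
    """Duplicate samples from smaller groups until all groups
    are the same size as the largest group."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for group, features in samples:
        by_group[group].append((group, features))
    target = max(len(items) for items in by_group.values())
    balanced = []
    for items in by_group.values():
        balanced.extend(items)
        # Randomly re-draw from the group's own samples to fill the gap.
        balanced.extend(rng.choices(items, k=target - len(items)))
    return balanced

# Hypothetical skewed dataset: 90 samples from one group, 10 from another.
samples = [("group_a", i) for i in range(90)] + [("group_b", i) for i in range(10)]
balanced = oversample_to_balance(samples)
counts = Counter(group for group, _ in balanced)
print(counts)  # both groups now contribute 90 samples each
```

Duplicated samples add no new information, which is why a broker that collects fresh, diverse data would beat this trick every time.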

It’s a large undertaking, but whoever gets it right will not only be very rich but also alleviate the concern that human racism is bleeding into AI.