What happens when the Internet Archive gets erased?

History is always written by the winners. It’s just the nature of the game. The conquerors speak highly of themselves while the conquered don’t get any press. The Internet, however, has made a case for the opposite – a history where all sides of the story are covered, first hand. Unfortunately, theory and practice don’t always match up.

Although the Internet has made it easier for everyone to share their story, it doesn’t mean that information can’t be erased.

The Internet Archive, a nonprofit that saves old copies of webpages and other digital information, in a blog post yesterday, explained that it received more than 550 takedown notices from the European Union in the past week “falsely identifying hundreds of URLs on archive.org as ‘terrorist propaganda’.”

The notices came from Europol’s European Union Internet Referral Unit (or EU IRU) and its French counterpart. They included URLs for major collection pages, each containing millions of items (e.g., “https://archive.org/details/texts” and “https://archive.org/details/television”) as well as links to scientific research and US government reports, including TV footage from CSPAN.

James Vincent, The Verge

The history books that cover the late 20th and early 21st century will be predicated largely on what happened on the Internet. How different memes traversed culture, influenced movements, and sparked massive change. How individuals, organizations, and companies used the Internet to gain power and exert their ideas.

This is why the Library of Congress added 12 years’ worth of public tweets to its archives, to document the rise and evolution of Twitter. More generally, the Internet Archive (Archive.org), which runs the Wayback Machine, is a non-profit institution devoted to preserving the history of the Internet.

My main concern is how stealthily these fragments of history and culture can disappear. When Hitler burned books and stole artwork to erase Jewish intellectualism, it was a spectacle. We were aware that it was happening. Today, this can happen without any of us truly knowing.

Outside of the non-profit archival institutions, most of the Internet’s data and history exist on the cloud servers of three companies: Google, Amazon, and Microsoft. If any one of these three companies goes “evil”, it could effectively wipe out roughly a third of the world’s digital media and information.

This is not meant to sound conspiratorial. It’s just meant to drive home the fact that we place immense trust in a very few companies not to exert information oppression.

Internet anthropologists and data historians are already going to have a difficult time piecing together the true nature of today’s wild, sporadic, and hard-to-follow events. They don’t need to be further challenged by deleted or tainted information.

On a more private note, we should all be concerned about our personal data. Not the personal data that companies use to advertise their products to you, but rather the family photos, media collections, and other precious digital items we entrust to the cloud. This is what Ryan had to say on the matter:

Overall, I don’t think this Archive.org news is the start of something cataclysmic. I don’t think that you’re going to lose your entire library of photos in the cloud anytime soon. However, we know the impact that information oppression can have at scale. We even have movies like Blade Runner 2049 and Fahrenheit 451 which show what happens when digital and physical media are wiped from existence. We can’t pretend to be ignorant of this possibility.