Computers come close to perfect data storage. But perhaps it would be better if they forgot things the way humans do?

Interview
How the Internet could forget

Humans forget, which is annoying but important for their mental health. The Internet is another story. So far.

Human memory is anything but perfect when compared to perfect data storage. Recalling memories is an error-prone process, and a person’s memories can even be misleading. However, forgetting is also a gift that is essential for the recovery of trauma patients. Computers come much closer to perfect data storage than humans do. Florian Farke tells us whether digital oblivion is meaningful or not. He is a PhD candidate in the Forschungskolleg Sec-Human, a doctoral program focusing on the security of humans in cyberspace from an inter- and transdisciplinary perspective.

Mr Farke, why is oblivion important in a digital context?
There is the often-cited case of a student teacher who posted a photo captioned “Drunken Pirate” that showed her wearing a pirate hat and drinking from a plastic cup. She was not allowed to take her final examination after university staff found the photo.

In our work, we focus on the positive sides of oblivion: it helps us to filter out obsolete or outdated information and prevents us from being overwhelmed by sensory impressions. Nonetheless, digital oblivion is criticized for fueling censorship and the rewriting of history.

How could the Internet forget in the age of social media, when everything is constantly shared and copied?
The Internet forgets. There is a loss of data because services are discontinued or servers are shut down – not everything that was once online is necessarily archived elsewhere and kept available. However, it is true that it is easy to copy and spread data on the Internet.

What does digital oblivion mean then?
There is no standard definition. Since the EU General Data Protection Regulation (GDPR), adopted in 2016, became applicable in May 2018, a comprehensive right to be forgotten has existed in the European Union. Any EU citizen can obtain the erasure of personal data concerning him or her from the controller, that is, the organization or company that holds the data. The controller is then obliged to erase the personal data unless the request is not legitimate or there are other reasons to keep the data, such as a mandatory retention period.

Florian Farke’s doctoral thesis deals with digital oblivion.

This right to be forgotten originates in the case of a Spanish citizen who sued Google to have search results removed that came up when he searched for his name. The search results included a news article about the forced sale of his property due to social security debts. At the time of the lawsuit, this event lay ten years in the past and was no longer relevant in his view. In 2014, the Court of Justice of the European Union agreed with him and obliged Google to give EU citizens the opportunity to request the removal of search results appearing under their name.

So forgetting on the Internet is regulated by the GDPR?
Only the EU and its member states are bound by the GDPR, but national borders have little meaning on the Internet. Establishing borders, as some politicians demand, would radically change the Internet as we know it today. I think this would amplify the effect of filter bubbles, i.e., isolation from inconvenient opinions, by adding national bubbles. The concept of borders contradicts the whole idea of the World Wide Web and the Internet.

What are the alternatives to the GDPR?
There are various approaches. For example, data that has not been accessed for a long time is probably irrelevant and may be forgotten. It is not even necessary to delete the data itself: you can encrypt the data before uploading it and publish the key. To forget the data, that is, to revoke access to it, it is then sufficient to erase the key.
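
The idea of forgetting by erasing a key can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the KeyServer class and the helper functions are hypothetical, and the one-time-pad XOR is only a toy stand-in for a real cipher. It does not describe any particular service.

```python
import secrets


class KeyServer:
    """Hypothetical public key store. Erasing a key 'forgets' the data."""

    def __init__(self):
        self._keys = {}

    def publish(self, key_id, key):
        self._keys[key_id] = key

    def fetch(self, key_id):
        # Returns None once the key has been erased.
        return self._keys.get(key_id)

    def erase(self, key_id):
        self._keys.pop(key_id, None)


def xor_bytes(data, key):
    # One-time pad: XOR each data byte with the matching key byte.
    return bytes(d ^ k for d, k in zip(data, key))


def encrypt_for_upload(plaintext, key_server):
    """Encrypt before upload and publish the key (toy cipher, assumption)."""
    key = secrets.token_bytes(len(plaintext))  # fresh random key per upload
    key_id = secrets.token_hex(8)
    key_server.publish(key_id, key)
    return key_id, xor_bytes(plaintext, key)


def read(key_id, ciphertext, key_server):
    """Decrypt a (possibly copied) ciphertext, or return None if forgotten."""
    key = key_server.fetch(key_id)
    if key is None:
        return None
    return xor_bytes(ciphertext, key)


if __name__ == "__main__":
    server = KeyServer()
    key_id, blob = encrypt_for_upload(b"holiday photo", server)
    copy_elsewhere = bytes(blob)  # ciphertext may be copied freely
    print(read(key_id, copy_elsewhere, server))
    server.erase(key_id)  # forgetting: only the key is erased
    print(read(key_id, copy_elsewhere, server))
```

In this sketch, copies of the ciphertext can spread without limit, yet all of them become unreadable once the single key is erased. The limit of the approach is equally visible: anyone who fetched the key or stored the decrypted data before the erasure keeps access.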

Are such approaches already used in practice?
As far as I know, no mechanisms for digital oblivion are in broad use on the Internet. However, there are implementations. There was, for example, a service to encrypt and upload images. Once the key was deleted, the images were gone as well. The implementation was extremely straightforward but not very user-friendly. Users had to install a browser add-on, which was required for both uploading and viewing the images. Furthermore, they needed to set an expiration date for each image. That is something you simply cannot know beforehand in every case.

Is it realistic that such approaches will be used throughout the Internet?
It is not entirely clear whether users want digital oblivion and how it could be implemented. There are some user studies, but they do not fully answer the whether and the how. On the other hand, we are currently experiencing a centralization of the Internet. There are a handful of large, heavily used services, such as the search engine Google or the online social networks of Facebook, to name two. They have direct access to the data and are thus able to delete it, which they usually do not want to do because it contradicts their business model.

I think that, in practice, the problem cannot be solved by technology alone; it only works if it is legally regulated. The General Data Protection Regulation is a first step in this direction, but the right to be forgotten is formulated very vaguely, and many service providers do not yet know how to implement it. The vagueness was a deliberate decision by the lawmakers to allow solutions to emerge from practice.

What does this mean in practice?
According to Google’s transparency report, about 760,000 requests to delist URLs from search results were filed between May 2014 and December 2018. The requests are reviewed and decided manually by Google staff. This review process is expensive and probably not affordable for smaller businesses or start-ups. One can argue that smaller businesses do not receive as many requests. But more and more data is collected by devices and services, especially in the context of the Internet of Things. Requests from customers or former customers to delete their personal data are a challenge that companies have to deal with. Of course, companies can simply delete data whenever someone asks them to do so. This might not be a big deal for Internet of Things devices, but in online social networks it quickly ends in censorship. Just think of a politician who tries to cover up compromising statements.

I am currently looking into ways to automate the process. Not entirely, of course; that is not possible yet, because legal decisions are too complex. However, if it were possible to classify the requests automatically, this could speed up the decision-making process.

It seems difficult to hit the sweet spot between allowing technology to forget and preventing censorship.
Indeed it is. In my doctoral thesis, I am interested in the user’s view. What do users actually want? Should the Internet forget? Have users already accepted that data is stored indefinitely? However, it is difficult to ask users without biasing them. Responses may differ greatly depending on the context. The Drunken Pirate photo and the politician’s statement, for example, are very different contexts. At the moment, I am looking for the best method to answer these questions.

Finally, what do you think: should technology be changed so that it is able to forget? Or do people have to adapt to the new technology?
The ability to forget fulfils an important purpose in many societies. I think we should adapt technology accordingly.