Recently, I received a notification in my TikTok app saying that my data request had been successful and that I could now download my personal file. I’m a regular user, and as a Data Scientist, I’m always eager to take any opportunity to better understand my own data footprint. Even knowing what I know about online data collection, and how powerful it can be in enabling better products, services and opportunities for users, I still tend to feel a little uncomfortable at the sheer volume of what’s returned.
I downloaded a zip file containing the archives of my personal information, both public and private. This, of course, included every single post and comment I’d ever created or engaged with, but also my entire private message history and a log of every time I’d used the app and how: where I was, which network I was on, and when.
Of course, this isn’t anything new. I work with these types of datasets regularly in my job. Many social media companies collect this data, but it still feels jarring as a user to see it so readily accessible in one place. It serves as a reminder of how extensive our digital footprints are, often without us realising. This is heightened in the case of TikTok, which has featured prominently in recent US political discussions about national security, and in press coverage of concerns around the app’s data protection policies (this WIRED article from July neatly summarises how the concerns played out).
Even if the content itself isn’t particularly sensitive, it’s often still surprising to see our activity, both public and private, laid out before us in broad daylight, and it makes the risk of this file getting into the wrong hands feel real. It highlights the importance of policy and regulation in holding these companies accountable. These are thoughts that we tend to push to the back of our minds, only for them to resurface when we face them head-on, or when it’s already too late. To me, this experience has raised the question of how these data requests could be used for education, helping people to understand the extent of their digital footprint and the amount of trust we place in apps to protect this data properly.
Of course, data retention is not the core issue here; in my job, more data often means a better service for everyone. The bigger concern is what comes after: which other service providers are collecting similar information, and how is it being used? And ultimately, how can we better protect it?
This is a mini-blog post that barely scratches the surface of these discussions, which I explore much more deeply in my day-to-day work. I am always happy to chat further about social media, data usage and ethics; please do contact me or ping me on one of my social links with your thoughts.