De-anonymization

De-anonymization is an important topic for anyone working with sensitive data, whether in the context of academic research, IT system design, or otherwise.

I remember a talk during a Massey Grand Rounds panel where a medical researcher explained how she could pick herself out from an ‘anonymous’ database of Ontarians, on the basis that her salary was public as an exact dollar figure, only people with her specific job had it, and she was the only woman in that position.

The more general idea is that by putting pieces together you may be able to identify somebody who someone else has made some effort to keep anonymous.
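To make that concrete, here is a minimal sketch in Python of how combining a few harmless-looking attributes can single out one record. Everything in it is an assumption for illustration: the records, the field names (`job`, `sex`, `salary`), and the helpers `matches` and `unique_combinations` are made up, and it is not a description of any real database or of the researcher's actual method.

```python
from collections import Counter

# Hypothetical "anonymous" records: no names, only attributes that each
# look harmless on their own (illustrative data, not from any real source).
records = [
    {"job": "nurse", "sex": "F", "salary": 82000},
    {"job": "nurse", "sex": "F", "salary": 82000},
    {"job": "nurse", "sex": "M", "salary": 82000},
    {"job": "research chair", "sex": "M", "salary": 187450},
    {"job": "research chair", "sex": "M", "salary": 187450},
    {"job": "research chair", "sex": "F", "salary": 187450},
]


def matches(rows, **known):
    """Return the rows consistent with every fact the adversary knows."""
    return [r for r in rows if all(r[k] == v for k, v in known.items())]


def unique_combinations(rows, fields):
    """Return the attribute combinations that occur in exactly one row."""
    counts = Counter(tuple(r[f] for f in fields) for r in rows)
    return [combo for combo, n in counts.items() if n == 1]


# One known fact narrows the pool; a second one isolates a single record.
by_salary = matches(records, salary=187450)
by_salary_and_sex = matches(records, salary=187450, sex="F")
print(len(by_salary), len(by_salary_and_sex))

# Combinations that identify exactly one person in this toy table.
print(unique_combinations(records, ("job", "sex")))
```

In this toy table the exact salary alone leaves three candidates, and adding one more fact leaves one. Any combination that `unique_combinations` returns is a re-identification risk, even though no single field is.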

It’s a challenge when doing academic research and writing on social movements, when some subjects choose to be anonymous in publications. That means not just withholding their name, but withholding any information that could be used to identify them. That gets hard when you think about adversaries who might have access to other information (in an extreme case, governments with access to masses of information) or even just ordinary people who can logically combine information from multiple sources. The date of an event described in an anonymous quote might allow someone to look up online where it happened. Another quote in which a third party’s actions are described could be used to determine that the de-anonymization target wasn’t that person. And so on and on, like the logic games on the LSAT or the intricacies of mole hunting.

Lee Ann Fujii wrote smart stuff about this, and about subject protection in research generally.

Author: Milan

In the spring of 2005, I graduated from the University of British Columbia with a degree in International Relations and a general focus in the area of environmental politics. Between 2005 and 2007 I completed an M.Phil in IR at Wadham College, Oxford. I worked for five years for the Canadian federal government, including completing the Accelerated Economist Training Program, and then completed a PhD in Political Science at the University of Toronto in 2023.

10 thoughts on “De-anonymization”

  1. LLMs can unmask pseudonymous users at scale with surprising accuracy

    https://arstechnica.com/security/2026/03/llms-can-unmask-pseudonymous-users-at-scale-with-surprising-accuracy/

    Burner accounts on social media sites can increasingly be analyzed to identify the pseudonymous users who post to them using AI in research that has far-reaching consequences for privacy on the Internet, researchers said.

    The finding, from a recently published research paper, is based on results of experiments correlating specific individuals with accounts or posts across more than one social media platform. The success rate was far greater than existing classical deanonymization work that relied on humans assembling structured data sets suitable for algorithmic matching or manual work by skilled investigators. Recall—that is, how many users were successfully deanonymized—was as high as 68 percent. Precision—meaning the rate of guesses that correctly identify the user—was up to 90 percent.
