De-anonymization

De-anonymization is an important topic for anyone working with sensitive data, whether in the context of academic research, IT system design, or otherwise.

I remember a talk during a Massey Grand Rounds panel where a medical researcher explained how she could pick herself out from an ‘anonymous’ database of Ontarians, on the basis that her salary was public as an exact dollar figure, only people with her specific job had it, and she was the only woman in that position.

The more general idea is that by putting pieces together you may be able to identify somebody who someone else has made some effort to keep anonymous.
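
To make the salary example concrete, here is a minimal Python sketch of the underlying check: count how many records share each combination of seemingly harmless attributes, and flag any combination that appears only once. The records, the field names, and the `unique_records` helper are all invented for illustration and do not come from any real database.

```python
from collections import Counter

# Entirely made-up records. Note that there are no names or ID numbers,
# which is what would usually lead someone to call the data "anonymous".
RECORDS = [
    {"job": "Nurse", "gender": "F", "salary": 92_000},
    {"job": "Nurse", "gender": "F", "salary": 92_000},
    {"job": "Chief of Surgery", "gender": "M", "salary": 410_500},
    {"job": "Chief of Surgery", "gender": "M", "salary": 410_500},
    {"job": "Chief of Surgery", "gender": "F", "salary": 410_500},
]


def unique_records(records, fields):
    """Return the records whose values for `fields` match no other record."""
    def key(record):
        return tuple(record[f] for f in fields)

    # Count how many records share each combination of field values.
    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] == 1]


if __name__ == "__main__":
    # Job alone singles out nobody, but job plus gender isolates one person.
    print(unique_records(RECORDS, ("job",)))
    print(unique_records(RECORDS, ("job", "gender")))
```

In this toy data no single field identifies anyone, but the combination of job and gender does. That is the same pattern as in the researcher's story: each piece is shared with others, and only the combination is unique.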

It’s a challenge when doing academic research and writing on social movements, when some subjects choose to be anonymous in publications. That means not just withholding their name, but withholding any information that could be used to identify them. That gets hard when you think about adversaries who might have access to other information (in an extreme case, governments with access to masses of information) or even just ordinary people who can logically combine information from multiple sources. The date of an event described in an anonymous quote might allow someone to look up online where it happened. Another quote in which a third party’s actions are described could be used to determine that the de-anonymization target wasn’t that person. And so on and on, like the logic games on the LSAT or the intricacies of mole hunting.

Lee Ann Fujii wrote smart stuff about this, and about subject protection in research generally.

Author: Milan

In the spring of 2005, I graduated from the University of British Columbia with a degree in International Relations and a general focus in the area of environmental politics. Between 2005 and 2007 I completed an M.Phil in IR at Wadham College, Oxford. I worked for five years for the Canadian federal government, including completing the Accelerated Economist Training Program, and then completed a PhD in Political Science at the University of Toronto in 2023.

10 thoughts on “De-anonymization”

. says:

2018-10-25 at 9:42 pm

Defcon 21 – De-Anonymizing Alt.Anonymous.Messages

—

How Tor Users Got Caught – Defcon 22
. says:

2019-01-09 at 10:46 pm

Sorry, your data can still be identified even if it’s anonymized

Urban planners and researchers at MIT found that it’s shockingly easy to “reidentify” the anonymous data that people generate all day, every day in cities.
. says:

2021-07-24 at 4:24 pm

Inside the Industry That Unmasks People at Scale

Unique IDs linked to phones are supposed to be anonymous. But there’s an entire industry that links them to real people and their address.

https://www.vice.com/en/article/epnmvz/industry-unmasks-at-scale-maid-to-pii
. says:

2021-10-16 at 4:39 pm

AI Fake-Face Generators Can Be Rewound To Reveal the Real Faces They Trained On

https://yro.slashdot.org/story/21/10/13/2116205/ai-fake-face-generators-can-be-rewound-to-reveal-the-real-faces-they-trained-on

—

This Person (Probably) Exists. Identity Membership Attacks Against GAN Generated Faces

https://arxiv.org/pdf/2107.06018.pdf
. says:

2021-10-19 at 8:51 pm

This appears to be Eric Trump’s incredibly depressing YouTube playlist.

https://slate.com/news-and-politics/2019/07/this-appears-to-be-eric-trumps-incredibly-depressing-youtube-playlist.html
. says:

2021-10-24 at 5:43 pm

They Stormed the Capitol. Their Apps Tracked Them

Times Opinion was able to identify individuals from a trove of leaked smartphone location data.

https://www.nytimes.com/2021/02/05/opinion/capitol-attack-cellphone-data.html
. says:

2022-02-16 at 2:02 pm

Never use pixelation to redact text

—

Never, Ever, Ever Use Pixelation for Redacting Text
. says:

2025-06-04 at 9:36 am

Meta and Yandex are de-anonymizing Android users’ web browsing identifiers

Abuse allows Meta and Yandex to attach persistent identifiers to detailed browsing histories.

https://arstechnica.com/security/2025/06/meta-and-yandex-are-de-anonymizing-android-users-web-browsing-identifiers/
. says:

2025-10-29 at 9:54 am

Republican plan would make deanonymization of census data trivial

“Differential privacy” algorithm prevents statistical data from being tied to individuals.

https://arstechnica.com/tech-policy/2025/10/republican-plan-would-make-deanonymization-of-census-data-trivial/
. says:

2026-03-03 at 10:30 am

LLMs can unmask pseudonymous users at scale with surprising accuracy

https://arstechnica.com/security/2026/03/llms-can-unmask-pseudonymous-users-at-scale-with-surprising-accuracy/

Burner accounts on social media sites can increasingly be analyzed to identify the pseudonymous users who post to them using AI in research that has far-reaching consequences for privacy on the Internet, researchers said.

The finding, from a recently published research paper, is based on results of experiments correlating specific individuals with accounts or posts across more than one social media platform. The success rate was far greater than existing classical deanonymization work that relied on humans assembling structured data sets suitable for algorithmic matching or manual work by skilled investigators. Recall—that is, how many users were successfully deanonymized—was as high as 68 percent. Precision—meaning the rate of guesses that correctly identify the user—was up to 90 percent.
