Facebook and data mining

I have written before about privacy and Facebook, expressing the view that people should treat whatever they put on Facebook in the same way as they treat something they put on a completely public website like this one. It may be wise to give people more granular control over who can see what, but it isn’t intelligent for a Facebook user to assume that the site’s privacy controls will always be adequate and that their information will stay safe.

In the wake of the latest Facebook privacy debacle, I have realized that there is an element to the situation that I hadn’t considered before. Especially now that Facebook is working to put everybody’s ‘Interests’ into a standardized format, there is a real difference between how information on Facebook can be used, compared to the wider web.

A person with some time and interest could scan through my blog, figure out roughly how old I am, learn what sort of books I read, discover my political views, and so on. It would be rather tricky to write an automated computer program that would achieve the same result. Blogs are non-standardized, and composed of human-generated text. By contrast, information on Facebook is increasingly organized in a manner that is easily machine-readable. If I want to reach 25-to-27-year-olds who enjoy reading Carl Sagan books and live in Ottawa, it is easy to do via the information on Facebook, but hard to do with information from the general web. That seems to constitute a different sort of privacy violation and/or data mining.
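To make the contrast concrete, here is a minimal sketch of how little work the structured case takes. Everything in it is hypothetical: the `Profile` record, its field names, and the sample people are invented for illustration and do not reflect Facebook's actual data model or any real API. The point is only that once age, city, and interests sit in standardized fields, the Carl Sagan query above reduces to a simple filter.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    # Hypothetical standardized fields; not Facebook's real schema.
    name: str
    age: int
    city: str
    interests: frozenset = frozenset()


def target_audience(profiles, min_age, max_age, city, interest):
    """Return the profiles whose structured fields match every criterion."""
    return [
        p for p in profiles
        if min_age <= p.age <= max_age
        and p.city == city
        and interest in p.interests
    ]


# Invented sample data, for illustration only.
profiles = [
    Profile("Alice", 26, "Ottawa", frozenset({"Carl Sagan", "Cycling"})),
    Profile("Bob", 40, "Ottawa", frozenset({"Carl Sagan"})),
    Profile("Carol", 25, "Toronto", frozenset({"Carl Sagan"})),
    Profile("Dave", 27, "Ottawa", frozenset({"Hockey"})),
]

matches = target_audience(profiles, 25, 27, "Ottawa", "Carl Sagan")
print([p.name for p in matches])
```

Doing the same thing against blogs would mean inferring age, location, and reading habits from free-form prose, which is a much harder and less reliable problem than checking three fields.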

In response, I have stripped my Facebook account of everything that might be of interest to advertisers, at least where it is easily machine-readable: hometown, current location, music and films appreciated, etc. A determined human user could still learn a lot about me from Facebook, for instance by looking at status updates and communication with others, but this will at least make it a bit trickier for machines.

Author: Milan

In the spring of 2005, I graduated from the University of British Columbia with a degree in International Relations and a general focus in the area of environmental politics. In the fall of 2005, I began reading for an M.Phil in IR at Wadham College, Oxford. Outside school, I am very interested in photography, writing, and the outdoors. I am writing this blog to keep in touch with friends and family around the world, provide a more personal view of graduate student life in Oxford, and pass on some lessons I've learned here.

12 thoughts on “Facebook and data mining”

  1. Timely, given that today is Quit Facebook Day, set up to protest the privacy settings debacles.


    Curious that you didn’t delete it completely. I suppose it is still a useful communication tool, but I wonder what their policy is on monitoring even that (wall posts, private messages, etc.). I know Google scans Gmail content to target ads but doesn’t sell that info to third parties.

  2. I won’t delete it because:

    a) It is a good way to share photos
    b) It helps me keep up with distant friends who I would not otherwise speak with.

  3. I am not on FB myself but the wife is. Ease of use is a big plus (my 85-year-old great aunt uses it), but for photos I hate the concept of tagging.

    Obviously I can control photos I take and post, but I cannot control photos that others happen to take of me, post, and tag me in (with my wife’s account; I don’t look like a Stephanie!). Granted, the people who have done so have at least asked permission, but this won’t always be the case, especially when someone else can tag you on your behalf.

    Tagging aside, as you are a user of photo.net and Picasa, how do you find FB albums compare to those services? I wonder how they compare to Flickr, which I have some experience with.

  4. You’re right about how easy it is to pull data from fields. I take a small pleasure in the FB ads that don’t know what to pitch at me, so they throw it all: seniors’ cruises, discount Christian novels, men’s muscle gain products and disability benefits (hey! I’m not online that much.).

    I never added personal data to FB, address or phone. It never had my gender or birthyear. It links to my blogs which are public. It wouldn’t have my relationship status if it weren’t for hubby standing over my shoulder insisting I add it as I insisted this is not a dating site.

    My pov is that once anything is out from between your ears or off your fingers, you can’t control the information, online or off, and can presume it is possible even if not probable that it might be misused.

  5. “Of course, companies have long mined their data to improve sales and productivity. But broadening data mining to include analysis of social networks makes new things possible. Modelling social relationships is akin to creating an “index of power”, says Stephen Borgatti, a network-analysis expert at the University of Kentucky in Lexington. In some companies, e-mails are analysed automatically to help bosses manage their workers. Employees who are often asked for advice may be good candidates for promotion, for example.

    Ellen Joyner of SAS, an analytics firm based in Cary, North Carolina, notes that more and more financial firms are using the software to uncover fraud. The latest version of SAS’s software identifies risky borrowers by examining their social networks and Internal Revenue Service records, she says. For example, an applicant may be a bad risk, or even a fraudster, if he plans to launch a type of business which has no links to his social network, education, previous business dealings or travel history, which can be pieced together with credit-card records. Ms Joyner says the software can also determine if an applicant has associated with known criminals—perhaps his fiancée has shared an address with a parolee. Some insurers reduce premiums for banks that protect themselves with such software.”

  6. “Every few weeks, it seems, Facebook is caught again violating users’ privacy. A code error here, rogue business partners there. The truth, as InfoWorld’s Bill Snyder explains, is that Facebook will keep on violating your privacy, no matter what its policies say, what promises it makes, or how shocked it claims to be at the latest incident. The reason is simple: Selling personal information on its users is how it makes money, and Facebook is above all a business.”

  7. Data Mining: How Companies Now Know Everything About You
    By Joel Stein Thursday, Mar. 10, 2011

    Three hours after I gave my name and e-mail address to Michael Fertik, the CEO of Reputation.com, he called me back and read my Social Security number to me. “We had it a couple of hours ago,” he said. “I was just too busy to call.”

    In the past few months, I have been told many more-interesting facts about myself than my Social Security number. I’ve gathered a bit of the vast amount of data that’s being collected both online and off by companies in stealth — taken from the websites I look at, the stuff I buy, my Facebook photos, my warranty cards, my customer-reward cards, the songs I listen to online, surveys I was guilted into filling out and magazines I subscribe to.

    Google’s Ads Preferences believes I’m a guy interested in politics, Asian food, perfume, celebrity gossip, animated movies and crime but who doesn’t care about “books & literature” or “people & society.” (So not true.) Yahoo! has me down as a 36-to-45-year-old male who uses a Mac computer and likes hockey, rap, rock, parenting, recipes, clothes and beauty products; it also thinks I live in New York, even though I moved to Los Angeles more than six years ago. Alliance Data, an enormous data-marketing firm in Texas, knows that I’m a 39-year-old college-educated Jewish male who takes in at least $125,000 a year, makes most of his purchases online and spends an average of only $25 per item. Specifically, it knows that on Jan. 24, 2004, I spent $46 on “low-ticket gifts and merchandise” and that on Oct. 10, 2010, I spent $180 on intimate apparel. It knows about more than 100 purchases in between. Alliance also knows I owe $854,000 on a house built in 1939 that — get this — it thinks has stucco walls. They’re mostly wood siding with a little stucco on the bottom! Idiots.

  8. Even if private citizens do not make much use of face recognition to search their archives, it seems a fair bet that governments will—perhaps only in special circumstances, perhaps not. In America, warrants to seize user data from Facebook often also request any stored photos in which the suspect has been tagged by friends (though the firm does not always comply). Warrants as broad as some of those from which the National Security Agency and others have benefited in the past could allow access to all stored photos taken in a particular place and time.
