How a Computer Program Can Learn All About You From Just Your Facebook Likes
Your publicly available “likes” can tell others a lot you wouldn’t expect—including your political views, sexual orientation and religion
Odds are, when you decided to “like” a TV show, band, local business or product’s Facebook page, you didn’t imagine that that click would have much consequence. It might show your friends a bit about your interests, and occasionally cause status updates from the page to show up in your news feed.
“Likes,” however, are publicly available for anyone to see on Facebook, even people you haven’t approved as friends. And for a new study published today in the Proceedings of the National Academy of Sciences, a group of researchers created a computer program that can take a user’s “likes” and accurately infer a tremendous range of information about him or her—including age, ethnicity, IQ, political leanings, level of drug use and even sexual orientation.
For the study, the research group (a partnership between the Psychometrics Lab at the University of Cambridge and Microsoft Research Cambridge) analyzed the data of 58,000 American Facebook users who had chosen to supply their profiles and “likes” for analysis through myPersonality, a third-party Facebook app. The researchers fed these “likes” into an algorithm built specifically for this project, then compared the model’s predictions on a range of characteristics to what the users’ submitted profiles said about them.
For each pair of categories examined (say, Caucasian or African-American, or Democrat or Republican), the researchers picked a pair of users, one belonging to each category, and the algorithm had to blindly pick which user fit which category based solely on their “likes.” It wasn’t perfect at inferring any of the categories, but it was uncannily accurate at predicting many, including some characteristics you probably wouldn’t assume can be guessed from your “likes.”
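To make the pairwise test concrete, here is a minimal Python sketch of how such an accuracy figure could be computed. It is an illustration only, not the researchers' code: the function name `pairwise_accuracy` and the scores are made up, and it assumes the model assigns each user a single number saying how strongly their “likes” point toward the first category.

```python
from itertools import product


def pairwise_accuracy(scores_first, scores_second):
    """Share of (first, second) user pairs that the scores rank correctly.

    scores_first:  model scores for users who truly belong to the first category
    scores_second: model scores for users who truly belong to the second category
    A pair counts as correct when the first-category user has the higher score.
    """
    if not scores_first or not scores_second:
        raise ValueError("need at least one user in each category")
    correct = 0.0
    for first, second in product(scores_first, scores_second):
        if first > second:
            correct += 1.0
        elif first == second:
            correct += 0.5  # a tie is treated as a coin flip
    return correct / (len(scores_first) * len(scores_second))


# Hypothetical scores, invented for this example (not data from the study).
democrat_scores = [0.9, 0.8, 0.4]
republican_scores = [0.7, 0.3, 0.2]

accuracy = pairwise_accuracy(democrat_scores, republican_scores)
print(accuracy)
```

A result of 1 would mean every pair was sorted correctly, while 0.5 is what random guessing would produce.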
It correctly distinguished, for example, Caucasian users from African-American users 95 percent of the time, Democrats from Republicans 88 percent of the time and Christians from Muslims 82 percent of the time. A breakdown of its accuracy in predicting many of the considered traits (a value of 1 would signify that the model is 100 percent accurate) is below.
For most of the users, this level of accuracy didn’t depend upon any obvious “likes” that one might link to the trait considered. For instance, less than 5 percent of the users identified as gay had “liked” gay marriage, or other related pages.
The algorithm, instead, aggregated tons of seemingly unrelated “likes” to group users into classes that shared predictable similarities. By comparing “likes” to the results of a personality test (also part of the myPersonality app), the researchers found that users who “liked” “Thunderstorms,” “The Colbert Report,” “Science” or “Curly Fries” were all slightly more likely to have a high IQ than those who didn’t. Similarly, male users who “liked” “Mac Cosmetics” or “Wicked The Musical” were slightly more likely to be gay, whereas those who “liked” “Wu-Tang Clan” or “Shaq” were slightly less likely.
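One simple way to picture how many weak hints add up is to give each page a small weight and sum the weights of everything a user has “liked.” The Python sketch below does that and squashes the total into a probability. It is a stand-in chosen for illustration, not the method the study describes: the page names, the weights and the function `trait_probability` are all hypothetical.

```python
import math


def trait_probability(user_likes, like_weights, bias=0.0):
    """Combine many small per-page weights into a single probability.

    Pages with no entry in like_weights contribute nothing to the total.
    """
    score = bias + sum(like_weights.get(page, 0.0) for page in user_likes)
    return 1.0 / (1.0 + math.exp(-score))  # logistic function, maps to 0..1


# Invented weights: positive values nudge the guess toward the trait,
# negative values nudge it away. None of these numbers come from the study.
like_weights = {"page_a": 0.3, "page_b": 0.2, "page_c": -0.4, "page_d": 0.25}

few_likes = trait_probability({"page_a"}, like_weights)
many_likes = trait_probability({"page_a", "page_b", "page_d"}, like_weights)
print(few_likes, many_likes)
```

No single page settles the question here. Each one shifts the estimate slightly, and only the combination moves it far from 50-50.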
Analyzing all of a user’s “likes” enabled the algorithm to create an overall portrait of them, but its accuracy was heavily influenced by the number of “likes” for each user. For those at the low end, with 1 to 10 “likes,” the predictions were no better than chance, but for those with 150 to 300 “likes,” its guesses about the users’ traits were markedly more accurate.
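A toy calculation shows why more “likes” should help. Suppose, purely as an assumption for this sketch, that each “like” is an independent hint that points the right way 55 percent of the time, and that the guess is a simple majority vote over all of a user's hints. The Python below computes how often that vote is right. It is not the study's model and will not reproduce its numbers; the function `majority_accuracy` and the 55 percent figure are hypothetical.

```python
from math import comb


def majority_accuracy(n_likes, p_single):
    """Chance that a majority vote over n_likes independent hints is correct.

    p_single is the assumed chance that any one hint points the right way.
    An evenly split vote is resolved by a coin flip.
    """
    total = 0.0
    for k in range(n_likes + 1):
        # Binomial probability that exactly k of the hints are correct.
        prob_k = comb(n_likes, k) * p_single**k * (1 - p_single) ** (n_likes - k)
        if 2 * k > n_likes:
            total += prob_k
        elif 2 * k == n_likes:
            total += 0.5 * prob_k
    return total


for n in (1, 10, 150, 300):
    print(n, round(majority_accuracy(n, 0.55), 3))
```

Under these assumptions, a handful of hints leaves the vote close to a coin flip, while a few hundred push it much closer to certainty, which is the same direction as the pattern the researchers reported.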
The researchers primarily conducted the study to show just how much our publicly available information can tell about us. You might not publicly post your sexual orientation, political views or whether you use drugs, but this sort of program can analyze your “likes” and make pretty accurate guesses regardless.
Although the users had submitted their “likes” and profiles for analysis via a third-party app, Facebook’s default privacy settings mean that your “likes” are public to anyone. Already, Facebook’s own algorithms use these likes to dictate what stories end up in users’ news feeds, and advertisers can access them to determine which are the most effective ads to show you as you browse.