Everybody Lies: Except in a Google Search

Don’t bother answering questions from the next pollster who calls to do a survey. You’re probably going to lie to him. Because “everybody lies.” And there’s no point in taking a survey if you’re going to lie. Besides, Google’s already got you on the truth meter.

That’s one of the main discussion points in the new book, “Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are.”

The blurb on the book says, “By the end of an average day in the early 21st century, human beings searching the Internet will amass 8 trillion gigabytes (GB) of data.” Every day, 8 trillion GB. What does that even amount to? Who knows, but it’s a lot. The average computer has about 4 GB of memory. A flash memory card in a camera may store 16 GB. We’re talking 8 trillion GB – daily.

So what are people searching? Pretty much everything, according to “Everybody Lies” author Seth Stephens-Davidowitz. And the data these searches reveal can be one useful tool for putting the human psyche under the microscope.

“People are honest on Google. They tell Google what they might not tell to anybody else. They’ll confess things to Google that they wouldn’t tell friends, family members, surveyors, or even themselves,” Stephens-Davidowitz said Tuesday in remarks about his book.

Take, for instance, some of the common confessional-style searches that Google gets: “I hate my boss,” “I’m happy,” “I’m sad,” or even “I’m drunk.”

Some of the searches can become rather morose and depressing. For instance, after the San Bernardino attack in 2015, in which 14 people were killed and another 22 seriously injured, top Google searches that soon followed included “Muslim terrorists” and “kill Muslims.” Stephens-Davidowitz says a search query certainly lacks the context needed to know what people were trying to express, but it also provides guidance.

Here’s one way the data were used. Shortly after the attack, President Obama delivered a speech to try and calm fears about Muslims in America. But his grandiose sermonizing about opening America’s hearts backfired. Even during the speech, people got angrier. But at one point, Obama said that we have to remember that Muslim-Americans are our friends and neighbors, they are sports heroes, and members of the military who are willing to die to defend this country.

Immediately, while the speech was still being given, Google searches for “Muslim athletes” spiked. The increase was so notable that when Obama gave a speech a couple weeks later on the same topic, he skipped the lecturing and focused on the contributions of Muslim-Americans.

Stephens-Davidowitz argues that while Obama’s sermon didn’t tell anybody anything that they didn’t know, the line about sports heroes provoked curiosity, provided potentially new information, and redirected attention. This may not indicate that there’s a science to calming fears after a terror attack, but it does show the power of the data to change how people act and react.

Stephens-Davidowitz says part of the reason search data are more useful than old-fashioned survey questions is that people tend to lie in surveys to make themselves look good. It’s called social desirability bias. It happened during the elections of 2008.

During that time, most Americans surveyed said Obama’s being black didn’t matter. Yet during the election, there was a spike in racist term searches. And graphing that data revealed that racist term searches were geographically divided between East and West. While correlation is not causation, where the racist term searches spiked, Obama lost about 4 percentage points of the vote over the previous Democratic candidate (John Kerry) in Democratic strongholds. He also generated a 1-2 percentage point increase in the number of African-Americans who voted.

Map of Google searches of racist content

The book, “Everybody Lies,” isn’t entirely about politics. It talks about a variety of topics like the stock market, crime, sports, and of course, sex, a hugely commercial enterprise on the Internet. In one example about the truth of big data, Stephens-Davidowitz notes that American women said in recent polling that they had sex (hetero and homosexual sex) once a week and used condoms about 20 percent of the time. Extrapolating the numbers, that would mean about 1.6 billion condoms were used that year. But asking men the same question (about hetero and homosexual sex) resulted in just 1.1 billion condoms allegedly used that year.

So who’s telling the truth, men or women? Neither. According to sales reports, just 600 million condoms were sold during the year in question.
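To put the size of that gap in perspective, here is a minimal sketch that uses only the three figures quoted above. The variable and function names are illustrative, and the comparison treats condoms sold as a rough stand-in for condoms used, which is an assumption rather than something the book's numbers establish.

```python
# Figures as quoted in the article (condoms per year).
implied_by_women = 1.6e9  # extrapolated from women's survey answers
implied_by_men = 1.1e9    # extrapolated from men's survey answers
actually_sold = 600e6     # from sales reports


def overstatement(implied: float, sold: float) -> float:
    """Return how many times larger the survey-implied figure is than sales."""
    return implied / sold


women_ratio = overstatement(implied_by_women, actually_sold)
men_ratio = overstatement(implied_by_men, actually_sold)

print(f"Women's answers imply {women_ratio:.1f}x actual sales")
print(f"Men's answers imply {men_ratio:.1f}x actual sales")
```

By this rough measure, both groups' answers imply far more condom use than sales could support, with the women's figure more than double what was sold.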

Stephens-Davidowitz conjectures that people have more incentive to tell the truth to Google in a search than to a pollster conducting a survey, because they need information. For instance, an increase in the search volume for voting places in an area in the weeks leading up to an election is more likely to reveal whether turnout is going to be high in that location than a pollster’s finding that 80 percent of the people say they will vote.

But is Internet search a digital truth serum? Is it the best way to get real answers? Yes and no.

It depends on how available other high-quality data are. For instance, Google Flu Trends, which attempted to determine how sick the population was during flu season based on searches about symptoms, was not as accurate as flu modeling currently used by government agencies like the Centers for Disease Control and Prevention.

Furthermore, what people search doesn’t explain why people search. Likewise, Google doesn’t identify who’s searching so we don’t know if the search is a representative sample of the population. There’s no way of knowing what an absolute level of response would generate. For that, we need lots of different types of data.

But Internet searches may be useful in measuring the human psyche more so than in predicting futures. Big data can be helpful in looking at information that does not require very precise numbers. Predicting an election within 5 percentage points isn’t helpful. But it probably is not a big deal to be off by 10 percent when counting the number of condoms used in a year.

As for topics like child abuse, Stephens-Davidowitz says that he’s not actually sure how to use the data to help governments and protective agencies develop programs to identify and address abuse, but that it’s certainly information that would be helpful in filling a gap in reporting. And as any pollster worth his salt will tell you, being able to ask the right question is one vital way of getting to an accurate answer.

Watch the remarks by Stephens-Davidowitz.

Fake News May Distract, But It Doesn’t Rig Elections

“Fake news is not a technical glitch.” This sentence is the headline of a recent article about the hysteria that has enveloped the nation over the “unexpected” presidential outcome. It also is a simple explanation that clears up much of the confusion being disseminated since the Nov. 8 vote.

Ironically, there has been a lot of misinformation about what “fake news” is. Is it false stories made up out of whole cloth? Yes. Is it misreporting about events that have happened? No, but that’s become a much-discussed point about the journalism profession since the issue arose. Is it media opinion? No.

Blaming members of the media for expressing their opinion rather than just stating the facts of a news story has been a complaint for decades, if not centuries. Not reporting all the facts is poor journalism, but a lie of omission is not the issue at hand.

Fake news is “creative writing,” to be kind. It’s the act of crafting imaginary facts about people that their opponents would be willing to believe are true. It’s pernicious, but it isn’t merely bad journalism. It is not based in fact at all.

Yet people are willing to believe what they are told is news because Americans trust the format. “Crankish conspiratorial thinking has been a theme in America for a long time,” notes professional software engineer and blogger Ariel Rabkin.

But there has been an outcry at the platforms that have unwittingly served as dispensers of fake news. The messenger has been condemned as much as the fake news itself.

Blaming the messenger — the online platforms where this fake news appears — is not the answer, however. Getting angry at Google or Facebook for “throwing” the election by permitting fake news on their sites is a pretty big waste of breath.

Consider the complaints. Facebook repeatedly tweaks its algorithm to impact how news trends, for which it recently faced a fair backlash, but that does not equate to Facebook making up false stories that show up on the site. And it would hurt Facebook’s business model to try to decide what’s real and what’s not.

As Rabkin explains:

Facebook didn’t invent rumor-mongering. It doubtless has made the problem more visible, since what used to be merely asserted drunkenly in saloons or spoken on talk radio is now in publicly visible text online. But visibility is not the same as impact and we should not assume without evidence that technology has made false rumors more dangerous to society. (The election of Donald Trump is not evidence that falsehood has any new potency. Partisans have been repeating lies about their opposition since the birth of democracy.) …

Google and Facebook have a deep ethos of neutrality, and to the extent that they are credible, it is precisely because they do not make blatant editorial decisions that embed their preconceptions and beliefs about which sources to trust. If Google or Facebook were to anoint some limited set of news sources as “authoritative” and some others as “fake,” they would immediately be faced with quite an ugly controversy about who is who, and this is controversy they avoid for both business and philosophical reasons.

Getting to the top of Facebook or Google search returns is a contest, and contestants know how to play the game.

This is the era of digital marketing, where getting seen is as important as what is said. Many players are vying for the top spot, and are willing to pay for it. An entire industry has made its fortune teaching other businesses how to rank up the Google pages. They game and test and look at data to learn how to outbid their competition to get to that spot.

This is how these platforms make their money, and they aren’t going to jeopardize the funding stream. So while Facebook and Google may constantly be rewriting and reframing their algorithms to try to second-guess what people are looking for to be able to deliver that to them, there are many, many guardians at the gate willing to point out what these platforms are doing wrong.

To wit: Being the editors of quality news is not the job description for Facebook and Google engineers.

If users are seeking carefully curated news, The New York Times and The Wall Street Journal are both available online, and there is no particular reason why Google ought to compete directly against them.

Americans do want reliable information on which to form opinions, and it’s in their best interest to have all the competing arguments coming at them, good and bad. This involves becoming educated, not just by what’s on the screen, but by what is in books and by what occurs in real-life experiences and involves real-life witnesses.

Anybody can put anything on the Internet, for better or worse. It’s our responsibility as members of society to be able to develop and express well-considered, well-formed, and well-sourced positions.

And for all its faults, America was not “hacked” into electing Donald Trump. Some Americans may have believed fake news and used it to form their opinions, but that is not what “hacking” is. No evidence points to machines having been tampered with, despite Trump’s pre-victory claims that it could happen. The Wisconsin and Pennsylvania recounts requested by Green Party candidate Jill Stein only reinforce the validity of the vote.

So let’s be vigilant thinkers and put a little effort into determining the quality of information on which we form our opinions. We’ve no one to blame but ourselves if we fault the machines for doing a poor job of thinking for us.

Read Rabkin’s entire article on TechPolicyDaily.com.