Careful with that data, Eugene

by Mathew on February 16, 2008

Update:

Heather Hopkins has an update to her post in which she goes into a bit more detail about where the data for the chart comes from. And just to be clear, I wasn’t suggesting that the data is wrong — just that it’s probably not wise to be drawing sweeping conclusions from it.

Original post:

There’s a Hitwise traffic report up that looks at (or purports to look at) the demographic differences between those who spend most of their time at Yahoo vs. those who spend most of their time at Google — a report that led Duncan Riley at TechCrunch to boldly proclaim that “poor people use Yahoo, and those better off use Google.” While I am not a sociologist or market researcher, I have to say that multiple warning bells went off when I read this. I think there are a whole series of leaps of logic going on, most of which are based on not a lot of evidence.

Like most such research, the chart included with the Hitwise report looks great: colourful blobs represent different demographic groups with catchy names like “Blue-collar Backbone” and “Small-town Contentment,” which are clustered at various points on a graph with Yahoo on one axis and Google on the other. “Struggling Societies” is at one extreme and “Affluent Suburbia” is at the other. So that’s it, right? End of story.

[Image: Hitwise chart of demographic groups, hitwise1.jpg]

Except that there are some pretty large holes in all of this — or at least information that we don’t have. According to the Hitwise report, the demographic groups with the cute names are based on “offline data” from Experian, which appears to be a credit reporting agency with a marketing consulting arm (Note: Experian apparently acquired Hitwise last year). Where does that information come from and how reliable is it? That’s one question. Then there are the blobs: their size apparently refers to how many in that group have spent $500 or more online.

But what does that tell us? It tells us whether people who spend money online (or at least people who say they spend money online) are more likely to use Google or Yahoo, not whether those people are rich or poor. Maybe it’s just geeks vs. non-geeks, as more than one commenter on TechCrunch pointed out. In any case, we have one questionable set of data about spending habits balanced on top of another set of questionable data about demographic groups. Not much to draw conclusions from, I would say. Sociologist danah boyd has some thoughts that tie into her research as well.

(*The title is based on a Pink Floyd song, for no other reason than it popped into my head. No, I don’t know why).

  • Duncan Riley was the reason I stopped reading TechCrunch sometime late last summer, a blog I used to follow religiously.
    Some time ago I created my own TC-Feed that delivers only the posts from Arrington, sorta oldschool-TC.

    Kinda off-topic, sorry. :)

    You're right btw.
  • Thanks for the tip, Marcel :-)
  • yeah, you can use yahoo pipes for that stuff. but feedrinse is easier and faster to set up for just filtering single feeds.
    Especially helpful with all the successful blogs turning into small team-driven online publications with more daily posts than actually necessary. (mashable back then was the first blog where I asked myself whether they've forgotten that they also act as a filter etc)
  • Is that a different Experian than the credit agency?
  • Same company -- it has a consumer-demographic analytics arm as well as
    the credit arm.
  • Perfect post title Mathew IMO and I felt much the same as you did when I first read the post except my reaction was more of WTFATTA - or do they even know.
  • A reasonable analogy might be the people who shop on the home shopping channels. Sure they often spend a lot of money but that rarely has anything to do with their income level.
  • I agree, Daniel. That's a good analogy.
  • Experian is the US's largest credit database firm. Anyone in the US (Canada?) who's ever applied for a credit card, home loan, college loans, medical loans, apartment rental application, is in their system. Your profile includes everything you've put on your application, all sorts of information pulled at random and provided by your banks, credit unions, credit card companies, and retailers. Employers also use Experian and others like them as part of background checks.

    Their demographics unit has humongous amounts of data from which to extract trends.

    So the real question is why, among all the conclusions about behavior they might have drawn, did they choose this one? Why not MSN vs. Yahoo, or any of thousands of other distinctions? Do they have a specific agenda? Is Yahoo! or Google a customer?
  • I have no doubt that they have lots of purchasing data and demographic
    profiles from their credit arm -- but how do they know whether people
    spent more than 500 dollars online or whether they use Google or
    Yahoo? Do they track people's behaviour online and watch every site
    they spend money on? I doubt it. In fact, I'm pretty sure that would
    be illegal. As far as I can tell, they do surveys. And surveys are
    notoriously unreliable. They often tend to show whatever you want them
    to show.
  • I'd argue Equifax is larger than Experian... here's what is scary... back in my old corporate days, I built a customer/consumer database with about 9-10 million of our customers. Experian came in and said they could match up name/email to their database and give us 30-40 more points of data on each record in the db, from credit info to salary, etc. And we think Beacon is bad :)
  • nice one Mathew. You'll enjoy this. Found via Federman

    http://medicine.plosjournals.org/perlserv/?requ...
  • Thanks, Leigh. That certainly tends to support my own hypothesis --
    which is, of course, based on virtually no facts and mostly a lot of
    conjecture and anecdotal evidence. :-)
  • safex982
    Well, i like this post.