Careful with that data, Eugene


Heather Hopkins has an update to her post in which she goes into a bit more detail about where the data for the chart comes from. And just to be clear, I wasn’t suggesting that the data is wrong — just that it’s probably not wise to be drawing sweeping conclusions from it.

Original post:

There’s a Hitwise traffic report up that looks at (or purports to look at) the demographic differences between those who spend most of their time at Yahoo vs. those who spend most of their time at Google — a report that led Duncan Riley at TechCrunch to boldly proclaim that “poor people use Yahoo, and those better off use Google.” While I am not a sociologist or market researcher, I have to say that multiple warning bells went off when I read this. I think there are a whole series of leaps of logic going on, most of which are based on not a lot of evidence.

Like most such research, the chart included with the Hitwise report looks great — colourful blobs represent different demographic groups with catchy names like “Blue-collar Backbone” and “Small-town Contentment,” which are clustered at various points on a graph with Yahoo on one axis and Google on the other. “Struggling Societies” is at one extreme and “Affluent Suburbia” is at the other. So that’s it, right? End of story.


Except that there are some pretty large holes in all of this — or at least information that we don’t have. According to the Hitwise report, the demographic groups with the cute names are based on “offline data” from Experian, which appears to be a credit reporting agency with a marketing consulting arm (Note: Experian apparently acquired Hitwise last year). Where does that information come from and how reliable is it? That’s one question. Then there are the blobs: their size apparently refers to how many in that group have spent $500 or more online.

But what does that tell us? It tells us whether people who spend money online (or at least people who say they spend money online) are more likely to use Google or Yahoo, not whether those people are rich or poor. Maybe it’s just geeks vs. non-geeks, as more than one commenter on TechCrunch pointed out. In any case, we have one questionable set of data about spending habits balanced on top of another set of questionable data about demographic groups. Not much to draw conclusions from, I would say. Sociologist danah boyd has some thoughts that tie into her research as well.

(*The title is based on a Pink Floyd song, for no other reason than it popped into my head. No, I don’t know why).

Is the Web bubble back? Ask Hitwise

From the London Telegraph comes a rumour that Hitwise — one of the half a dozen web-traffic measurement companies whose stats show up in press releases, and are used as fuel for takeover rumours — is itself the subject of takeover talks, with the price tag reportedly an eye-popping 180 million pounds or about $350-million (U.S.). Joe Duck says this sounds about right if Hitwise charges its 1,200 or so clients an average of $2,500 a month for access to its data.

I’m not sure where Joe gets those numbers from, but let’s assume he’s right. That works out to annual revenue of about $36-million, which makes the rumoured takeover price between 9 and 10 times revenue. Joe says that’s “not outrageous” for an established and growing Internet company, which leads me to believe one thing — no, not that Joe is on crack, but that he has a very high threshold for outrage.
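For anyone who wants to check the math, here is a quick sketch of the arithmetic in Python. The client count and monthly fee are Joe's estimates rather than confirmed figures, the price is the rumoured one in U.S. dollars, and the variable names are mine.

```python
# Back-of-the-envelope check of the rumoured Hitwise revenue multiple.
# Client count and monthly fee are Joe Duck's estimates, not confirmed figures.
clients = 1_200
monthly_fee_usd = 2_500
rumoured_price_usd = 350_000_000  # roughly 180 million pounds, per the rumour

# Twelve months of fees from every client.
annual_revenue_usd = clients * monthly_fee_usd * 12

# How many years of revenue the rumoured price represents.
revenue_multiple = rumoured_price_usd / annual_revenue_usd

print(f"Estimated annual revenue: ${annual_revenue_usd:,}")
print(f"Price-to-revenue multiple: {revenue_multiple:.1f}x")
```

If either of Joe's inputs is off, the multiple moves with it, which is part of why I wouldn't lean too hard on the exact figure.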


I think between 9 and 10 times revenue is bubble-type math. And yes, I know that Google sells for 15 times revenue; in fact, that actually helps my case. Obviously, traffic measurement is a hot area right now, primarily because advertisers are desperate to find a way of deciding where to put their money, and websites are desperate to find a way of proving they are the right place to put it.

Using page views as a metric, as Steve Rubel notes, is broken. But then, the different standards used by Hitwise and comScore and Nielsen and Alexa aren’t much better. As Matt Marshall pointed out, website measurement as a whole is a train wreck. Alexa only measures users who install a browser plugin and is biased towards the U.S.; comScore uses a piece of software that has been accused of being spyware; Nielsen phones people and asks them what they do; and Hitwise uses ISP log files.

What you typically wind up with is half a dozen measurements that all say something different — in some cases, one firm will show a website falling in popularity or flat, while another shows its traffic zooming. Is Hitwise any better than its competitors? Who knows. But any way you slice it, 9 or 10 times revenue is a boatload of cash.

It’s a Web-traffic-counting traffic jam

Matt Marshall over at SiliconBeat makes a point that is definitely worth making — and one that apparently has to be made over and over again before people get it — which is that Web analytics is (to put it mildly) an inexact science. In fact, looking at the Web-traffic numbers reported by Hitwise, Alexa, Nielsen and Comscore makes the weather-forecasting business look precise and infallible. This is an issue that has come up in the past with MySpace and its growth (as I discussed here) and has now come up again with respect to

The dancing around in Marshall Kirkpatrick’s recent post at TechCrunch is almost comical, although to be fair at least Marshall is trying to get the story straight. He notes that Mike Arrington wrote about the service a while back and was critical because its traffic was stagnating, but then had a chat with creator Josh Schachter and some Yahoo folks (I’m sure no bright lights or sleep deprivation were involved — Yahoo is much more subtle) and now TechCrunch is convinced by a Hitwise report that traffic has doubled.

Stagnating, doubling — tomato, tomahto, right? To his credit, Marshall goes out of his way to note that while Hitwise is a “respected” traffic analysis firm, numbers are all over the map — and he links to the other Marshall’s critique of the field. The simple fact is that Hitwise, Comscore, Nielsen and Alexa all use different methodologies (a good description here) and as a result they are not just talking about apples and oranges, they are talking about apples and oranges and plums and peaches.

When you’re trying to make apple sauce, that’s kind of a problem — and unfortunately it means that websites can use whatever data they want to tell whatever story they want, and various blogs and media will lap it up.

MySpace might be bigger — or not

Like my old-media colleague Mark Evans, I’m skeptical of the somewhat boosterish (to put it mildly) headlines about the growth of MySpace, and how it is now supposedly a larger Internet property than Yahoo, according to figures from Internet traffic-tracking firm Hitwise. And yet, the numbers from Comscore/Media Metrix and Nielsen/NetRatings don’t show anything like that — at least not when it comes to unique visitors.

According to a statement from Hitwise about methodology, the company uses a “network-centric” measuring process, in which traffic data is collected directly from ISPs using the company’s proprietary software. Other tracking services such as Comscore and Nielsen measure traffic based on software that users install (Alexa uses this method as well), phone surveys and/or through software trackers installed at websites directly. Naturally, Hitwise says its way is better.

The Hitwise release about MySpace (which was just for the first two weeks of July) didn’t give specific numbers for the social-networking service. Instead, it said that MySpace’s “market share of visits” was higher than Yahoo’s at 4.46 per cent. It’s not clear what that phrase refers to, but it appears to be a lot closer to raw hits than it is to unique visitors. Part of the problem seems to be that Hitwise only tracked Yahoo’s email domain, and left out its search and portal properties. According to Yahoo, it had 129 million unique visitors in June (for Yahoo’s search, email and web properties), and MySpace had 52 million.

I hate to rehash something that I thought we had all hoisted aboard during the first Web bubble, but raw traffic is a crappy measure of anything (and MySpace has been criticized for having a design that boosts page-hit counts). That’s why unique visitors and other metrics get used more often. But that doesn’t stop newspapers — and blogs, unfortunately — from trumpeting the “XYZ Corp. is the biggest!” headlines whenever there’s a slow news day.
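To make the hits-versus-people distinction concrete, here is a toy sketch in Python. The visit log, visitor names and site names are entirely made up, and this is not how Hitwise or anyone else actually computes their numbers; it only shows how a site can lead on share of visits while trailing on unique visitors.

```python
from collections import Counter

# Hypothetical visit log: one (visitor, site) pair per visit.
# All names and counts are invented purely for illustration.
visits = [
    ("ann", "site_a"), ("ann", "site_a"), ("ann", "site_a"), ("ann", "site_a"),
    ("bob", "site_a"),
    ("ann", "site_b"), ("bob", "site_b"), ("cat", "site_b"), ("dan", "site_b"),
]

# Metric 1: raw visits per site, expressed as a share of all visits.
visit_counts = Counter(site for _, site in visits)
visit_share = {site: count / len(visits) for site, count in visit_counts.items()}

# Metric 2: unique visitors per site (each person counted once).
unique_visitors = {
    site: len({visitor for visitor, s in visits if s == site})
    for site in visit_counts
}

for site in sorted(visit_counts):
    print(f"{site}: {visit_share[site]:.1%} of visits, "
          f"{unique_visitors[site]} unique visitors")
```

In this made-up log, site_a "wins" on share of visits because one person keeps coming back, while site_b reaches twice as many actual people.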

For an interesting look at the differences between Hitwise numbers and those from Comscore/Media Metrix and Nielsen/NetRatings, check out this comment from Flickr founder Stewart Butterfield on a recent post at Paul Kedrosky’s blog.

Is Photobucket Web 2.0?

I’ve been meaning to blog about something for a few days now, but various events in my personal life (including a move to a new house and a sick family member) have kept me from doing so. The something I wanted to blog about was a post by LeeAnn Prescott of the Web-tracking firm Hitwise, which looked at the traffic stats for various photo sites, including Flickr and Shutterfly (which is controlled by former Netscape CEO Jim Clark and has filed to go public).

One of the interesting things about the numbers LeeAnn provided, which drew a lot of commentary, was that Flickr — despite being by far the most widely talked about photo site, at least from a Web 2.0 perspective — came in fairly far down on the list of top 10 photo sites. Number one by a landslide was a site hardly anyone talks about: Photobucket, which (unless I’m mistaken) gets the vast majority of its traffic from MySpace and other social networking sites, by providing an easy photo hosting service for blogs.

LeeAnn’s Hitwise item sparked a fairly extensive response from Flickr co-founder Stewart Butterfield, who tried to post a comment on TechCrunch but apparently had difficulty getting it past the spam filter. I wound up seeing his comment a day or two later on Paul Kedrosky’s blog. Paul liked Stewart’s comment so much that he later elevated it to post status.

Stewart’s comment/post is worth reading, if only to see the (in some cases) large discrepancies between Hitwise traffic numbers and those from Comscore Media Metrix and Nielsen/NetRatings. But it also brings up the issue of whether Photobucket and Flickr really compete or not. One is a community — Web 2.0 if you will — and one is just a hosting service, which is more Web 1.0. And yet Photobucket is the plumbing behind a very Web 2.0 service such as MySpace, and it has 48 per cent market share and is still growing.