The New York Times was the first major newspaper to take its cue from Google and open up its data via an API (which stands for application programming interface). In a nutshell, this allows developers to write programs that can automatically access the New York Times database, within certain limits, and use that data in mashups, etc. Now the Guardian newspaper in Britain has upped the ante: not only has it opened its data up via an API, but it has also done two things that the NYT has not — namely, it provides the full text of its articles to users of the API (while the Times restricts developers to an excerpt only) and it also allows the data to be used in for-profit ventures, while the Times restricts its data to non-profit purposes.
As Shafqat at NewsCred notes on his blog, these two differences are pretty important, and I would argue that the Guardian has really put its money where its mouth is in terms of turning its paper into a platform (to use the title of a blog post I wrote when the NYT came out with its open API). Not to denigrate what the Times has done at all, mind you — an API of any kind is a huge leap, and one that many newspapers likely wouldn’t have the guts to take, limits or no limits. But to provide full-text access to all Guardian news articles going back to 1999, and to allow all of this data and more to be used in profit-making ventures as well, takes the whole effort to another level entirely.
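To make the idea concrete, here is a minimal sketch of what a request for full article text might look like against a content-search endpoint of this kind. The endpoint, the parameter names, and the "show-fields" field below are assumptions for illustration, not lifted from the Guardian's actual documentation:

```python
from urllib.parse import urlencode

# Hypothetical sketch of a Content API search request. The endpoint and
# parameter names below are assumptions for illustration, not taken from
# the Guardian's actual documentation.
BASE_URL = "http://content.guardianapis.com/search"

def build_search_url(query, api_key, show_fields="body"):
    """Build a search URL that asks for full article text ("body"),
    not just an excerpt."""
    params = {
        "q": query,
        "show-fields": show_fields,
        "api-key": api_key,
    }
    return BASE_URL + "?" + urlencode(params)

url = build_search_url("climate change", "YOUR_KEY")
```

The point isn't the three lines of plumbing; it's that full text plus commercial terms means a developer can build an actual product on top of a call like this, not just a demo.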
(read the rest of this post at the Nieman Journalism Lab blog)
There’s been a lot of chatter about the newspaper industry in recent weeks — about whether newspaper companies should find something like an iTunes for news, or use micropayments as a way to charge people for the news, or sue Google, or all of the above — and how journalism is at risk because newspapers are dying. But there’s been very little discussion of something that has the potential to fundamentally change the way newspapers function (or at least one newspaper in particular): the release of the New York Times’ open API for news stories. The Times has been talking about this project since sometime last year, and it has finally happened; as developer Derek Gottfrid describes on the paper’s Open blog, programmers and developers can now easily access 2.8 million news articles going back to 1981 (although they are only free back to 1987) and sort them based on 28 different tags, keywords and fields.
It’s possible that this kind of thing escapes the notice of traditional journalists because it involves programming, and terms like API (which stands for “application programming interface”), and therefore seems like something that isn’t really journalism-related or even media-related, and can be understood only by nerds and geeks. But if there’s one thing that people like Adrian Holovaty (lead developer of Django and founder of EveryBlock) have shown us, it is that broadly speaking, content — including the news — is just data, and if it is properly parsed and indexed it can become something quite incredible: a kind of proto-journalism that can be formed and shaped in dozens or even hundreds of different ways.
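Holovaty's point that news is just data is easy to demonstrate: once articles arrive as structured records rather than printed pages, a few lines of code can re-index them however you like. This toy sketch works on a made-up response (the field names here are invented, not the Times' actual schema) and builds a tag-to-headline index:

```python
import json
from collections import defaultdict

# A made-up miniature API response, shaped loosely like what an
# article-search API might return; the field names are invented
# for illustration, not the Times' actual schema.
sample = json.loads("""
{"results": [
  {"title": "City budget passes", "date": "2009-01-12",
   "tags": ["politics", "budget"]},
  {"title": "Transit fares to rise", "date": "2009-01-14",
   "tags": ["transit", "budget"]}
]}
""")

# Once the news arrives as structured records, re-indexing it is
# trivial: here, a simple inverted index from tag to headlines.
index = defaultdict(list)
for article in sample["results"]:
    for tag in article["tags"]:
        index[tag].append(article["title"])
```

Multiply that by 2.8 million articles and 28 fields, and you get the "proto-journalism" idea: raw material that anyone can shape into a timeline, a map, or a beat page.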
(read the rest of this post at GigaOm)
One of the interesting things to me about Mike Arrington’s interview with Twitter founder Evan Williams isn’t so much the discussion of business models (although that’s obviously something the company will have to deal with eventually) as the debate that seems to be going on inside the company about how it handles API access to Twitter’s data. As Mike notes, the service gives only four outside companies full access to the entire Twitter data feed, and one of those is Summize — which of course is now part of Twitter. The others are FriendFeed, Twittervision (which overlays Twitter posts on a map) and Zappos, an online shoe retailer that makes extensive use of Twitter.
Access to that “firehose” of XMPP data is what allowed Summize to surface @ replies from Twitter when users couldn’t get at those replies through Twitter itself — one of the features that I think drove interest in Summize over the past few months, although it would arguably still have been a useful service even without Twitter’s repeated downtime issues. But while Summize used the firehose to build something that made sense as an adjunct to Twitter, a service like FriendFeed is using it to build something that is closer to being a very real competitor. Like some other people, when Twitter was having its issues, I effectively duplicated the most important part of my Twitter follow list by adding the same people as friends on FriendFeed.
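Summize's real pipeline isn't public, but the kind of thing the firehose enables is easy to sketch: once you can see every status update as it happens, picking out the @ replies aimed at a particular user is a few lines of filtering. A toy version:

```python
# Toy sketch: given a raw stream of status updates, pull out the
# @ replies aimed at one user. (Summize's real pipeline is not
# public; this just illustrates why full-feed access matters.)
def replies_to(user, statuses):
    """Yield status updates that are @ replies to the given user."""
    needle = "@" + user.lower()
    for status in statuses:
        if status.lower().startswith(needle):
            yield status

feed = [
    "@bob great post!",
    "heading to lunch",
    "@alice did you see this?",
]
replies = list(replies_to("alice", feed))  # ["@alice did you see this?"]
```

The filter is trivial; the scarce resource is the feed itself, which is exactly why who gets firehose access is worth an internal debate at Twitter.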
One reason why people pay so much attention to what Google does is that it can change the landscape with a single move. Take the whole RSS feed-syndication thing, which — despite the relative popularity of Bloglines and NewsGator and their ilk — is still in its infancy as far as the bulk of Web users are concerned. That’s why things like Yahoo adding RSS support to its email app (much as I dislike having feeds in my mail) make a difference.
Now, the ever-diligent Niall Kennedy has managed to reverse-engineer the API (application programming interface) that Google uses in its Reader application, which sparked the interest of a couple of Google staffers — who said the company is close to releasing its API for public use. (Paul Kedrosky says the API announcement is also a way for Google to deke around criticism of its reader).
I’m not a programmer, but I think this could change things dramatically. For one thing, it could make it even easier for a few smart people to come up with easy-to-use feed readers — apps that are light-years ahead of Google’s own reader, which I happen to think is lame. As Niall has pointed out, Google has already made it relatively easy to come up with an Atom feed for your blog of choice, since Google’s app takes whatever feed it is given and converts it to Atom.
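Based on Niall Kennedy's write-up as I understand it, the reverse-engineered trick is mostly URL construction: prefix a feed's address with Reader's atom path and Google hands back that feed normalized to Atom. A sketch, treating the exact path as an assumption rather than a documented API:

```python
from urllib.parse import quote

# The reverse-engineered endpoint as I understand Niall Kennedy's
# write-up: prefix a feed's URL with Reader's atom path and Google
# returns that feed converted to Atom. Treat the exact path as an
# assumption, not a documented API.
READER_ATOM = "http://www.google.com/reader/atom/feed/"

def reader_atom_url(feed_url):
    """Return the Reader URL that serves the given feed as Atom."""
    return READER_ATOM + quote(feed_url, safe="")
```

That normalization is the useful part for would-be reader builders: whatever flavor of RSS or Atom a blog publishes, your app only ever has to parse one format.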
As more than one person has pointed out, RSS (or Atom) is plumbing — hopefully Google’s move will make it easier for people to just use the facilities instead of worrying about what format the equipment is based on. Phil Wainewright says he expects that RSS readers as we know them will eventually disappear (or be absorbed). So maybe Scoble is too late in his attempt to get Microsoft to buy NewsGator.
I don’t want to add to the “echo chamber” that some have complained about in tech-blogging circles — which is a real risk given the number of blogs tech.memeorandum.com has commenting on the news — but I think it’s interesting that Amazon seems to have decided to open up its Alexa API for no apparent reason.
In other words, there doesn’t seem to have been any pressure to do so, nor is Amazon.com in financial trouble or under severe competitive threat — although it’s true that the company is no longer growing as quickly as it used to. That means it has decided that “opening the kimono,” as Fred Wilson likes to call it, is worth doing for some other reason (Fred calls Alexa “Amazon’s hidden jewel”).
In all likelihood, it’s because Amazon has seen the spread of Google’s search, not to mention Google Maps, and Google Earth, and Flickr and so on, and realized that an open API likely creates more value — in the longer term — than a closed one. Let’s hope so. Because if there is one lesson that companies can learn from “Web 2.0,” it is that. Paul Kedrosky wonders why the Amazon announcement is news, and maybe it isn’t really. But it is still important.
Richard MacManus at Read/Write Web has more. And Cynthia Brumfield of IPDemocracy makes an important point (which others have made as well), which is that it isn’t just the open API, but the quality of the index that counts. And Danny Sullivan of SearchEngineWatch is underwhelmed by the news.