A chat with the Father of the Web

Sir Tim Berners-Lee doesn’t sound like a legend on the phone. He sounds like a friendly, slightly absent-minded scientist — which isn’t surprising, since that’s pretty much what he was before he invented the World Wide Web in 1990 while working at a particle-physics research lab in Switzerland, and along the way became a legend. He says he has a rule that anyone who calls him Sir (he received a knighthood from the Queen in 2004) “has to buy a round of drinks.”

In a phone call from Banff, where he was taking part in the 16th annual International World Wide Web Conference, Sir Tim talked about what he sees as the future of the “semantic Web” — in which not just websites will be connected together, but all kinds of data everywhere will be interconnected — and also some of the things he thinks could put the future of the Web at risk, such as the potential for large telecom companies to try and control the flow of data.

Globe: What sorts of things are you talking about at the conference?

TBL: “There are a number of trends happening on the Web. For example, there are pragmatic trends, such as the fact that we’re starting to see people using all kinds of small, portable devices to access the Web, as well as now huge screens. So the question is how do you make a web site that takes advantage of the big screen, when you’re planning a trip or whatever, but still works on a small screen when you’re checking your flights. The Web is also getting more into developing countries, so the number of people and the number of cultures on the Web is exploding… so that’s exciting. And then there’s the work we’re doing on the idea of the semantic Web.”

Globe: What do you mean when you use the term “semantic Web?”

TBL: “It’s a way of taking the data that is in lots and lots of different systems and connecting it together (for example, in a company or a database), and not just connecting it together, but realizing that it’s part of a community, that there are partners and suppliers and customers who all want to see and use this data in different ways. There’s a lot of excitement in the life sciences about doing this, where there are scientists looking for drugs and so on, and they have huge amounts of different sources of data. They’re looking for creative solutions to medical problems, but not everyone who is working on the problem has access to the right data.”

Globe: How can the semantic Web help with that kind of problem?

TBL: “Think about the files on your desktop, on your PC or your laptop: some of the files are like letters you have written, and you can put them on the Web quite easily. But other files, like your address book, if you put it on the Web it would just be a big list of people, and the same thing with your calendar, it would just be dates and times. But it’s not just the names or dates that are important, it’s the relationships between the data, so you can find all the people who are available at a certain time and book a meeting, and so on.”

Globe: So the semantic Web would allow people to create data “mashups” more easily?

TBL: “Absolutely. The problem with mashups is that they are usually a website where someone has taken all the coffee houses in Calgary and put them on a map. And that’s all very well, but if you want to look at all the music stores, or find where your friends are, you have to use a different website. So what we have with the semantic Web is a common data format, RDF, where instead of having to write a mashup for each of those things, you just pull the data in when you want it. Suppose someone has information about bus stops on the map, and you want to click and go to the bus timetable, but you don’t want just the timetable, you want the bus that you have to take, and then you want to put it in your calendar, and then tell your friend where to meet you and so on. Whenever you’re planning what to do for the afternoon, what you’re doing is a data integration exercise.”
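As a rough illustration of the "common data format" idea, the sketch below stores facts from two independent sources as (subject, predicate, object) triples, the basic shape of an RDF statement, and then queries the merged set. The `ex:` identifiers, the place names, and the helper functions `match` and `things_on_street` are all hypothetical, invented here for illustration; this is not a real RDF library or vocabulary.

```python
# Hypothetical data from two independent publishers, written as
# (subject, predicate, object) triples. All identifiers are made up.
COFFEE_HOUSES = {
    ("ex:cafe/1", "ex:type", "ex:CoffeeHouse"),
    ("ex:cafe/1", "ex:name", "Bean There"),
    ("ex:cafe/1", "ex:onStreet", "17 Ave SW"),
}
BUS_STOPS = {
    ("ex:stop/42", "ex:type", "ex:BusStop"),
    ("ex:stop/42", "ex:onStreet", "17 Ave SW"),
    ("ex:stop/42", "ex:servedBy", "Route 7"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def things_on_street(triples, street):
    """List every subject, of any type, located on the given street."""
    return [s for (s, _, _) in match(triples, p="ex:onStreet", o=street)]


# Because both sources share one shape, combining them is a set union.
merged = COFFEE_HOUSES | BUS_STOPS

print(things_on_street(merged, "17 Ave SW"))
print(match(merged, s="ex:stop/42", p="ex:servedBy"))
```

The point of the sketch is that no source-specific mashup code is needed: a new data set (music stores, friends' locations) would be merged with the same union and answered by the same query.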

Globe: Would enabling the semantic Web take a lot of work?

TBL: “Not a lot of work, no, but a little work. What I tell people is they can keep their database running the way it is, but then they can plug onto it a little thing that provides access to the semantic Web… it’s just a thin layer of software that provides a semantic Web connection, like an adapter. Let’s say you have a company and you have a product catalog, with all kinds of nuts and bolts and so on, all collected in a database. You could just write some software to give each one of your nuts and bolts a URI (uniform resource identifier, or Web address) and then when people search they will get back data about the product and then they can compare them and do all sorts of things with that data.”
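A minimal sketch of the "thin adapter" described above might look like the following, assuming the existing catalog can be read as a list of rows. The base address `https://example.com/catalog/`, the sample products, and the function names are assumptions made for illustration; the interview does not describe any particular implementation.

```python
from urllib.parse import quote

# Hypothetical base address under which every product gets its own URI.
BASE_URI = "https://example.com/catalog/"

# Stand-in for rows read from the company's existing database.
CATALOG = [
    {"sku": "BOLT-M8-40", "name": "M8 x 40 hex bolt", "price_cents": 35},
    {"sku": "NUT-M8", "name": "M8 hex nut", "price_cents": 12},
]


def product_uri(sku):
    """Build a stable URI for a product from its SKU."""
    return BASE_URI + quote(sku, safe="")


def describe(row):
    """Turn one database row into (subject, predicate, object) statements."""
    subject = product_uri(row["sku"])
    return [
        (subject, BASE_URI + "vocab#" + key, value)
        for key, value in row.items()
        if key != "sku"
    ]


def lookup(uri, rows):
    """Return the statements about the product a URI names, if any."""
    for row in rows:
        if product_uri(row["sku"]) == uri:
            return describe(row)
    return []


for statement in lookup(product_uri("NUT-M8"), CATALOG):
    print(statement)
```

The database itself is untouched here; the adapter only reads rows and publishes them under URIs, which is what would let outside tools fetch and compare products.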

Globe: Is the idea of a semantic Web gaining traction?

TBL: “It is, but I realized recently that selling the whole world on it at once isn’t the way to go. The idea is just to get it to a certain point where about 10 per cent of a creative community is interested in it, and then let it go from there. That’s what happened with the Web originally — I was lucky enough to be working in the high-energy physics field, and so I went to all the high-energy physics conferences and kept talking about it, and eventually about 10 per cent of them got interested and pretty soon most had a website or a browser and so on. And now it’s the life sciences that are really getting excited about the semantic Web, and there are some interesting parallels: in 1990 high-energy physics was where it was at when it came to cool science, and now in 2007 the life sciences are the cool science. And they have huge amounts of data, and some of the same frustrations I felt that led to me developing the Web.”

Globe: What do you think of the interest in what some are calling “Web 2.0?”

TBL: “I think Web 2.0 has the goals of helping people work together, helping people be creative, and those are very much the original goals of the Web. As I see it, the term refers to a particular type of site in which users put data into the system, whether photos or tunes or whatever, and then the system repurposes that data… and as a result the data actually becomes more valuable. The frustration for some people is when you make a choice of a particular website and then you go to a different website and it’s a completely separate world, and you have to start all over again… What we’re just starting to see is that geeks are doing the aggregation of all this data themselves, pulling it all down and then using tools to provide different slices of it in different ways, and others are signing up to websites that do that for you, so you sign in with your Flickr user name and then all your photos are available to you somewhere else.”

Globe: And would the semantic Web make that kind of thing easier?

TBL: “Using semantic Web protocols would absolutely make that a lot easier. Then life with data will become a web, a cross-linked web of information, just as we expect it to be.”

Globe: Is part of the problem that some companies want to create their own “walled gardens” of data?

TBL: “Yes. And there has always been a tension between people who want to build walled gardens and those who want to be part of the world wide web. Even in the early days of the Internet there were things like Prodigy and so on, where they tried to keep users there. But no one walled garden can be as strong and powerful and exciting as everyone else together can be. There has always been a business case for trying to keep people separate, but the mass of humanity is always more exciting.”

Globe: Is there a risk that ISPs will try to control the flow of data on the Web?

TBL: “It is a serious risk, yes. But I am optimistic — I think that the public will guard very jealously their right to connect to whoever they want. And I think the Web is making things better for many businesses. I think it will, for example, dramatically refresh the movie industry, with the “long tail” and so on, where all the little movies that no one ever sees or are in an obscure language or about obscure subjects can be seen by anyone. The industry is becoming much more exciting, and in a way it’s becoming a more natural market, where the market is basically flatter — where a smaller person has the ability to put up their independent short film and have it reviewed by people. And that’s good. But if ISPs are controlled or have a business relationship such that they won’t allow people to see anything but their movies, then that is not good.”

Globe: Does that apply to using “bandwidth shaping” to give priority to certain data?

TBL: “I think when you’re degrading peer-to-peer file-sharing packets in favour of voice, that’s very reasonable, because peer-to-peer doesn’t need instant delivery and voice does. The problem is when the cable company says I’m sorry, we’re not going to allow you to connect to somewhere and watch movies from a competitor because we sell our own movies through the cable and you have to watch those… ISPs shouldn’t select which movies you can watch or which website you go to, or which religious or political sites you can go to.”

