April 17, 2009

The Unreasonable Effectiveness of Google

Via the Official Google Research Blog at the University of Google, Alon Halevy, Peter Norvig and Fernando Pereira have published an interesting expert opinion piece in the March/April 2009 edition of IEEE Intelligent Systems: computer.org/intelligent. The paper talks about embracing complexity and making use of “the unreasonable effectiveness of data” [1], drawing analogies with the “unreasonable effectiveness of mathematics” [2]. There is plenty to agree and disagree with in this provocative article, which makes it an entertaining read. So what can we learn from those expert Googlers in the Googleplex?

November 30, 2007

Burn semantic Web, Burn!

Taking down A.I. town?

The Semantic Web is (quote) “a new form of Web content that is meaningful to computers”. It will “unleash a revolution of new possibilities” using a magical “new” artificially intelligent technology called ontology. So says a much-cited article in Scientific American published back in May 2001. Most people who have read this article fall into two camps: “believers” and “non-believers”. Let me tell you a short story about a religious war between these two groups…

An Old War Story: Chapter 1

This is a work of fiction, though as they say in Hollywood it is “based on a true story”. The characters’ names are real.

A crusade of semantic web believers is started by three people called Jim Hendler, Ora Lassila and Tim Berners-Lee. At the heart of their faith is a holy scripture and a suite of sacred technology called the semantic web stack. If people use this technology, the crusaders believe, the Web will be a better place. Search engines like Google, for example, would be even smarter than they already are, because they would intelligently “know what you mean” when you type your keywords. All this new magic comes from using good old-fashioned logic, metadata and reasoning. Better Search Engines is one of the mantras of the semantic web troops as they pour onto the battlefield towards the promised land. Viva la Webolution! Charge!

A counter-attack is launched by the non-believers of this vision of the future. They rally behind a man called Clay Shirky who roars “the semantic web is doomed” at the top of his voice. Many others echo Shirky’s sentiment, including Peter Norvig, Rob McCool, Cory Doctorow and Tim O’Reilly. General Shirky makes powerful allies in battle, and he has a two-pronged attack. “Ontology is over-rated” he jeers. Led by Shirky, the non-believers capture the sacred technology, add their own firewood and put the torch to it in a very public place. The flames leap into the sky, visible for miles around.

“Burn semantic web, burn!” the non-believers cry as they gleefully dance around the fire.

The battle rages; the believers will not take this heresy lying down. They regroup and surge forward again. Death to the blasphemers! With the help of some biologists, they seek revenge using the Gene Ontology as deadly ammunition. The non-believers are confused by this tactic: they don’t know what genes are, and neither do the biologists. Unfortunately, the biologists unwittingly find themselves in the middle of an epic battle they didn’t start. There are ugly skirmishes involving logic and graph theory. Dormant and hideous A.I. monsters are resurrected from their caves, where they spent the A.I. winter. These gruesome monsters make the Balrog beast from Lord of the Rings look like a children’s cuddly toy.

From the relative safety of their command centres, the leaders orchestrating the war look on. Many foot soldiers and PhD students have been slain on the field of battle, tragic young victims of the holy war. Understandably the crusaders are unhappy. Jim Hendler isn’t pleased as he surveys the carnage and devastation. Ora Lassila is also disappointed.

“We never said that, you completely misunderstood. You are all burning the wrong thing, using fuel we never gave you. You lied, you cheated, you faked, you changed the stakes!”

There is a lull in battle. But confusion reigns, especially among the innocent civilians and bewildered biologists.

(End of chapter 1)


As of the winter of 2007, the semantic web fire is still burning. While I warm myself next to it, using all the juicy metadata as material for my PhD, it is still too early to predict just how useful the technology is going to be. It doesn’t really matter if you’re a “believer”, a “non-believer” or completely agnostic about the semantic web. The religious war between the two sides tells you more about human behaviour than it does about the utility of the technology. Optimists profit from making bold claims to get noticed on the battlefield. Critics are more cynical, furthering their own careers by countering the optimists’ claims. Other people interpret the interpretations of the cynics second-hand. Thanks to cumulative error, or the Chinese whispers effect, everyone gets really upset. The original optimists’ vision has been changed in ways they didn’t expect.

It’s a very natural and human story amidst all the “artificial” machine intelligence.

Ora, Jim and Tim have done quite well out of the fighting. Google Scholar reckons their original article has been cited nearly 5000 times. That is a lot of attention; in scientific circles, it is a veritable blockbuster hit. At the time of writing, not even Albert Einstein can match that, and his ideas are much more important than the semantic web probably ever will be. Many good scientists with important ideas can only dream of publishing a paper that is as heavily cited as that infamous Scientific American article. So which do you think most scientists would prefer:

  • Being internationally known and talked about, but misunderstood by large groups of people?
  • Being relatively unknown, ignored but well understood by a small and obscure group of people?

Neither is ideal but I think in most cases, there is only one thing in the world worse than being talked about, and that is not being talked about.

We have reached the end of chapter 1 of this little story. Wouldn’t it be nice if Chapter 2 was less bloody? Perhaps the two sides could focus more on facts and evidence, rather than the beliefs, opinions, marketing, hype and “visions” that have dominated the battle so far. As the winter solstice approaches and the new year beckons, can we give peace, diplomacy and above all SCIENCE a chance?

The Moral of the Story (so far)

The moral of this old war story is simple. Religions of various kinds have been known to make people commit horrendous and completely unreasonable war crimes. Nobody is innocent. So if you don’t like a fight, steer well clear of religious wars.


  1. The “burn” idea comes from Leftfield with John Lydon (1995) Open Up: “Burn Hollywood, Burn! Taking down Tinseltown”
  2. Thanks to Carole for the idea of using fiction to illustrate science, see Carole Goble and Chris Wroe (2005) The Montagues and the Capulets: In fair Genomics, where we lay our scene… Comparative and Functional Genomics 5(8):623-632 DOI:10.1002/cfg.442 (see also Shakespearean Genomics: a plague on both your houses)
  3. This post was originally published on nodalpoint

July 21, 2006

AAAI: Dude, Where’s My Service?

As the number of bioinformatics services on the web increases, finding a tool or database that performs the task you require can be problematic. At the AAAI poster session on Wednesday, I presented our paper describing a novel solution to this problem. It uses a reasoner to “intelligently” search for web services, by semantically matching service requests with advertisements and has some advantages over comparable solutions…

I won’t go into all the gory details here but our technique extends and complements current approaches for matchmaking services. One of the key features described in the paper is that it allows you to describe the relationship(s) between the input and output of a service. E.g. What is the relationship between the input and output protein sequence of InterProScan? This relationship can help match requests for services with their adverts with higher precision and recall. I don’t mind admitting it’s been hard work getting this research published because a large part of the AI community shamelessly use toy and fictitious scenarios to motivate their work. Then they build incredibly complicated software stacks that are only understood by the small clique of people that designed them. When you show some of these people real-world bioinformatics services, they don’t seem to care too much, preferring to bury their heads in the sand of make-believe. There, that’s got it off my chest!
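For the curious, here is a rough flavour of what matching a request against adverts means. This is only a toy sketch in Python, not the description logic reasoning used in the paper: the concept hierarchy, the service names and the relation labels are all made up for illustration. A request matches an advert if the advert accepts the input on offer, returns something at least as specific as what was asked for, and declares the same input/output relationship.

```python
from dataclasses import dataclass

# Hypothetical toy hierarchy: each concept maps to its parent (None = root).
PARENT = {
    "Thing": None,
    "Sequence": "Thing",
    "ProteinSequence": "Sequence",
    "DNASequence": "Sequence",
    "Annotation": "Thing",
    "DomainAnnotation": "Annotation",
}


def is_a(concept, ancestor):
    """Return True if `concept` equals `ancestor` or lies below it in PARENT."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT[concept]
    return False


@dataclass(frozen=True)
class Service:
    name: str
    input: str
    output: str
    relation: str  # hypothetical label: how the output relates to the input


def matches(request, advert):
    """Check a request against one advert (all three conditions must hold)."""
    return (
        is_a(request.input, advert.input)          # advert accepts our input
        and is_a(advert.output, request.output)    # advert returns what we want
        and request.relation == advert.relation    # same input/output relationship
    )


# Made-up adverts, standing in for entries in a service registry.
ADVERTS = [
    Service("toy_annotator", "ProteinSequence", "DomainAnnotation", "annotates"),
    Service("toy_translator", "DNASequence", "ProteinSequence", "encodes"),
]


def find_services(request):
    """Return the names of all adverts that match the request."""
    return [advert.name for advert in ADVERTS if matches(request, advert)]
```

Without the relation check, a request for “something that takes a sequence and gives back a sequence” would match far too many adverts, which is the precision problem the relationship is meant to help with.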

So it was reassuring when people came by the poster, listened to my spiel and asked lots of questions. Ora Lassila from Nokia (one of the people responsible for hyping the whole idea up in the first place) dropped by to have a look. He was interested in adapting the technique for locating services in a registry used by mobile devices. (I wonder if anyone out there needs BLAST on their mobile phone?!) It was good to meet Ora, and talk about semantics.

There is nothing quite like standing in front of a poster for three hours and tirelessly explaining it to complete strangers who work in disparate fields. It certainly helps to get your ideas straight. Where would we be without conferences?


  1. Danny Leiner (2000) Dude, Where’s My Car?
  2. Massimo Paolucci, Takahiro Kawamura, Terry Payne and Katia Sycara (2002) Semantic Matching of Web Service Capabilities
  3. Duncan Hull, Evgeny Zolin, Andrey Bovykin, Ian Horrocks, Ulrike Sattler and Robert Stevens (2006) Deciding Semantic Matching of Stateless Services in the Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06)
