May 13, 2009

XML Summer School, Oxford

After a brief absence, it is good to see the XML Summer School is back again this September (20th-25th) at St. Edmund Hall, Oxford. This is “a unique event for everyone using, designing or implementing solutions using XML and related technologies.” I’ve been both a delegate and a speaker here over the years; back in 2005, Nick Drummond and I presented the Protégé and OWL tutorial, which was good fun. So here is what, I.M.H.O., makes the XML Summer School worth a look: (more…)

May 6, 2009

Michel Dumontier on Representing Biochemistry

Michel Dumontier is visiting Manchester this week and will be giving a seminar on Monday 11th of May. Here are some details for anyone who is interested in attending:

Title: Increasingly Accurate Representation of Biochemistry

Speaker: Michel Dumontier, dumontierlab.com

Time: 14.00, Monday 11th May 2009
Venue: Atlas 1, Kilburn Building, University of Manchester, number 39 on the Google Campus Map

Abstract: Biochemical ontologies aim to capture and represent biochemical entities and the relations that exist between them in an accurate manner. A fundamental starting point is biochemical identity, but our current approach for generating identifiers is haphazard and consequently integrating data is error-prone. I will discuss plausible structure-based strategies for biochemical identity whether it be at molecular level or some part thereof (e.g. residues, collection of residues, atoms, collection of atoms, functional groups) such that identifiers may be generated in an automatic and curator/database independent manner. With structure-based identifiers in hand, we will be in a position to more accurately capture context-specific biochemical knowledge, such as how a set of residues in a binding site are involved in a chemical reaction including the fact that a key nitrogen atom must first be de-protonated. Thus, our current representation of biochemical knowledge may improve such that manual and automatic methods of biocuration are substantially more accurate.

Update: Slides are now available via SlideShare.

[Creative Commons licensed picture of Michel in action at ISWC 2008 from Tom Heath]


  1. Michel Dumontier and Natalia Villanueva-Rosales (2009) Towards pharmacogenomics knowledge discovery with the semantic web Briefings in Bioinformatics DOI:10.1093/bib/bbn056
  2. Doug Howe et al (2008) Big data: The future of biocuration Nature 455, 47-50 doi:10.1038/455047a

October 27, 2008

OWL Experiences and Directions (OWLED) 2008

The Web Ontology Language (OWL) is a language for creating ontologies on the Web. It does exactly what it says on the tin. But what is an ontology? One way to think of it is as a better way of storing data and knowledge. Instead of just capturing and describing data in a database, ontology languages like OWL provide ways to capture and describe knowledge in a knowledge base. Ontologies can allow more intelligent querying, integration and understanding of data than is possible using a plain old relational database.
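To make the "more intelligent querying" point a little more concrete, here is a minimal sketch in Python. It is not OWL and not a real reasoner: the class names, the individuals and the helper functions are all invented for illustration. It only shows the one idea that a knowledge base can answer a query using class relationships that were never stored row by row, whereas an exact-match lookup cannot.

```python
# Toy class hierarchy (hypothetical data): each class maps to its direct superclass.
SUBCLASS_OF = {
    "GreatGreyOwl": "Owl",
    "Owl": "Bird",
    "Bird": "Animal",
}

# "Database rows" (hypothetical data): each individual is stored with
# only its most specific asserted type.
INDIVIDUALS = {
    "olga": "GreatGreyOwl",
    "tweety": "Bird",
    "rex": "Animal",
}


def superclasses(cls):
    """Return cls plus every class reachable by following SUBCLASS_OF upwards."""
    found = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.append(cls)
    return found


def exact_lookup(cls):
    """Plain lookup: only individuals whose stored type is exactly cls."""
    return sorted(name for name, t in INDIVIDUALS.items() if t == cls)


def instances_of(cls):
    """Hierarchy-aware lookup: individuals whose type is cls or any subclass of it."""
    return sorted(name for name, t in INDIVIDUALS.items() if cls in superclasses(t))


if __name__ == "__main__":
    print("Exact match for Bird:    ", exact_lookup("Bird"))
    print("With subclass inference: ", instances_of("Bird"))
```

Asking for every `Bird` with the plain lookup misses the owl, because nobody stored the fact that a Great Grey Owl is a bird. The hierarchy-aware lookup finds it by walking up the subclass links. Real OWL reasoners do far more than this (property restrictions, consistency checking and so on), so treat the sketch as an analogy only.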

Since 2003, developers and users of the Web Ontology Language, abbreviated to OWL (not WOL), have been gathering at a two-day workshop called OWLED (OWL Experiences and Directions). This year the workshop is in Karlsruhe, Germany. The full list of accepted papers is available. As in previous years, this year’s workshop has a distinctly biological flavour to the proceedings: (more…)

July 6, 2008

You Know OBO? Let’s GO!

According to their website, “The Open Biomedical Ontologies (OBO) Foundry is a collaborative experiment involving developers of science-based ontologies who are establishing a set of principles for ontology development with the goal of creating a suite of orthogonal interoperable reference ontologies in the biomedical domain”. This week they are having a workshop in Cambridge. To bring myself up to speed, here is a quick name check of some of the people involved.

May 21, 2008

First ChEBI workshop, Day one

Great Chesterford
Some notes from day one of the first ChEBI workshop, 19th May 2008. There were four talks, from Colin Batchelor (Royal Society of Chemistry), Ulrike Witting (EML Research GmbH, Heidelberg), Giles Weaver (Unilever) and Paula de Matos (EBI). Christoph Steinbeck has already written some ChEBI notes; these just add a little more detail. (more…)

May 15, 2008

BBC: Building a Better ChEBI

Chemical Entities of Biological Interest, ChEBI, is a freely available dictionary [1] of molecular entities, especially small chemical compounds. Like all big dictionaries and ontologies, it has its own unique challenges. Fortunately, those nice people at the EBI are holding a workshop to discuss future developments in ChEBI. In preparation for the workshop, here are some brief notes on how ChEBI could be made better. [Disclaimer: I’m fairly new to ChEBI and “thinking out loud” here; add comments below if I’ve said anything stupid or wrong.]


April 10, 2008

Would you like to share my toothbrush?

Michael Ashburner at the University of Cambridge once famously quipped that “Biologists would rather share their toothbrush than share a gene name” [1]. And so we have many different colourful and imaginative names for genes. The same mis-naming rule applies to the reactants and products (input and output) of metabolism. Here are some example names. Would you like to share my chemical name? There are so many different names to choose from…


November 30, 2007

Burn semantic Web, Burn!

Taking down A.I. town?

The Semantic Web is (quote) “a new form of Web content that is meaningful to computers”. It will “unleash a revolution of new possibilities” using a magical “new” artificially intelligent technology called ontology. So says a much-cited article in Scientific American published back in May 2001. Most people who have read this article fall into two camps: “believers” and “non-believers”. Let me tell you a short story about a religious war between these two groups…

An Old War Story: Chapter 1

This is a work of fiction, though as they say in Hollywood, it is “based on a true story”. The characters’ names are real.

A crusade of semantic web believers is started by three people called Jim Hendler, Ora Lassila and Tim Berners-Lee. At the heart of their faith is a holy scripture and a suite of sacred technology called the semantic web stack. If people use this technology, the crusaders believe, the Web will be a better place. Search engines like Google, for example, would be even smarter than they already are, because they would intelligently “know what you mean” when you type your keywords. All this new magic comes from using good old-fashioned logic, metadata and reasoning. Better Search Engines is one of the mantras of the semantic web troops as they pour onto the battlefield towards the promised land. Viva la Webolution! Charge!

A counter-attack is launched by the non-believers of this vision of the future. They rally behind a man called Clay Shirky who roars “the semantic web is doomed” at the top of his voice. Many others echo Shirky’s sentiment, including Peter Norvig, Rob McCool, Cory Doctorow and Tim O’Reilly. General Shirky makes powerful allies in battle, and he has a two-pronged attack. “Ontology is over-rated” he jeers. Led by Shirky, the non-believers capture the sacred technology, add their own firewood and put the torch to it in a very public place. The flames leap into the sky, visible for miles around.

“Burn semantic web, burn!” the non-believers cry as they gleefully dance around the fire.

The battle rages; the believers will not take this heresy lying down. They regroup and surge forward again. Death to the blasphemers! With the help of some biologists, they seek revenge using the Gene Ontology as deadly ammunition. The non-believers are confused by this tactic: they don’t know what genes are, and neither do the biologists. Unfortunately, the biologists unwittingly find themselves in the middle of an epic battle they didn’t start. There are ugly skirmishes involving logic and graph theory. Dormant and hideous A.I. monsters are resurrected from their caves, where they spent the A.I. winter. These gruesome monsters make the Balrog beast from Lord of the Rings look like a children’s cuddly toy.

From the relative safety of their command centres, the leaders orchestrating the war look on. Many foot soldiers and PhD students have been slain on the field of battle, tragic young victims of the holy war. Understandably, the crusaders are unhappy. Jim Hendler isn’t pleased as he surveys the carnage and devastation. Ora Lassila is also disappointed.

“We never said that, you completely misunderstood. You are all burning the wrong thing, using fuel we never gave you. You lied, you cheated, you faked, you changed the stakes!”

There is a lull in battle. But confusion reigns, especially among the innocent civilians and bewildered biologists.

(End of chapter 1)


As of the winter of 2007, the semantic web fire is still burning. While I warm myself next to it, using all the juicy metadata as material for my PhD, it is still too early to predict just how useful the technology is going to be. It doesn’t really matter if you’re a “believer”, a “non-believer” or completely agnostic about the semantic web. The religious war between the two sides tells you more about human behaviour than it does about the utility of the technology. Optimists profit from making bold claims to get noticed on the battlefield. Critics are more cynical, furthering their own careers by countering the optimists’ claims. Other people interpret the interpretations of the cynics second-hand. Thanks to cumulative error, or the Chinese whispers effect, everyone gets really upset. The original optimists’ vision has been changed in ways they didn’t expect.

It’s a very natural and human story amidst all the “artificial” machine intelligence.

Ora, Jim and Tim have done quite well out of the fighting. Google Scholar reckons their original article has been cited nearly 5000 times. That is a lot of attention: in scientific circles, a veritable blockbuster hit. At the time of writing, not even Albert Einstein can match that, and his ideas are much more important than the semantic web probably ever will be. Many good scientists with important ideas can only dream of publishing a paper that is as heavily cited as that infamous Scientific American article. So which do you think most scientists would prefer:

  • Being internationally known and talked about, but misunderstood by large groups of people?
  • Being relatively unknown, ignored but well understood by a small and obscure group of people?

Neither is ideal, but I think in most cases there is only one thing in the world worse than being talked about, and that is not being talked about.

We have reached the end of chapter 1 of this little story. Wouldn’t it be nice if Chapter 2 was less bloody? Perhaps the two sides could focus more on facts and evidence, rather than the beliefs, opinions, marketing, hype and “visions” that have dominated the battle so far. As the winter solstice approaches and the new year beckons, can we give peace, diplomacy and above all SCIENCE a chance?

The Moral of the Story (so far)

The moral of this old war story is simple. Religions of various kinds have been known to make people commit horrendous and completely unreasonable war crimes. Nobody is innocent. So if you don’t like a fight, steer well clear of religious wars.


  1. The “burn” idea comes from Leftfield with John Lydon (1995) Open Up: “Burn Hollywood, Burn! Taking down Tinseltown”
  2. Thanks to Carole for the idea of using fiction to illustrate science; see Carole Goble and Chris Wroe (2005) The Montagues and the Capulets: In fair Genomics, where we lay our scene… Comparative and Functional Genomics 5(8):623-632 DOI:10.1002/cfg.442 (see also Shakespearean Genomics: a plague on both your houses)
  3. This post was originally published on nodalpoint
