June 12, 2008
The drugs don’t work, they just make you worse
What exactly is a drug? A project I’m currently working on requires a good solid definition, at the very least comprehensible to humans, and preferably understandable by more intelligent, semantically aware computers too. I would like to be able to take some scientific model and ask questions like, “show (or hide) all the drugs in this model”. Trouble is, the word “drug” is such a heavily overloaded term, with so many alternative meanings, that it is practically meaningless. Just when you think you have a definition, you find a case that breaks it. I’m not just being an anally-retentive pedant, well, no more than usual anyway. It turns out to be much harder than you might think to define what a drug is. The term depends on all kinds of contextual information: dosage, species, legality, intent, social conventions and so on. Here are some broken definitions, warts and all. As you’ll see, the various definitions of drugs don’t work, they just make you worse. (more…)
May 22, 2008
First ChEBI workshop, Day Two
Some rough and ready notes from day two of the first ChEBI workshop, 20th May 2008. There were two talks, one from Kirill Degtyarenko (European Patent Office) and the other from Janna Hastings (EBI), followed by a discussion.
Kirill Degtyarenko: Good annotation practice for chemical data, ChEBI experience
Kirill’s talk described how to choose the most appropriate names, especially since “biologists don’t name things properly, if at all” (!). Systematic (IUPAC) names are usually better than common names, except for “the unpronounceables”. For example, an antibiotic called (E)-roxithromycin (ChEBI:48935) has the IUPAC name:
(3R,4S,5S,6R,7R,9R,10E,11S,12R,13S,14R)-4-(2,6-dideoxy-3-C-methyl-3-O-methyl-α-L-ribo-hexopyranosyloxy)-14-ethyl-7,12,13-trihydroxy-10-{[(2-methoxyethoxy)methoxy]imino}-6-[3,4,6-trideoxy-3-(dimethylamino)-β-D-xylo-hexopyranosyloxy]-3,5,7,9,11,13-hexamethyloxacyclotetradecan-2-one
…which just trips off the tongue (and fits beautifully, without line breaks, onto regular computer screens). Fortunately, the curator can draw the chemical (note the wavy bond, indicating unknown stereochemistry) using the curator tools, and the InChI and SMILES strings are then generated from the drawing. Currently they use something called ACD/Name, which can generate PubChem links automatically. As of May 2008, 14,000 ChEBI IDs translate to around 11,000 CIDs in PubChem, which is structures only.
(more…)
May 21, 2008
First ChEBI workshop, Day one

Some notes from day one of the first ChEBI workshop, 19th May 2008. There were four talks, from Colin Batchelor (Royal Society of Chemistry), Ulrike Witting (EML Research GmbH, Heidelberg), Giles Weaver (Unilever) and Paula de Matos (EBI). Christoph Steinbeck has already written some ChEBI notes; these just add a little more detail. (more…)
May 15, 2008
BBC: Building a Better ChEBI
Chemical Entities of Biological Interest, ChEBI, is a freely available dictionary [1] of molecular entities, especially small chemical compounds. Like all big dictionaries and ontologies, it has its own unique challenges. Fortunately, those nice people at the EBI are holding a workshop to discuss future developments in ChEBI. In preparation for the workshop, here are some brief notes on how ChEBI could be made better. [Disclaimer: I’m fairly new to ChEBI and “thinking out loud” here, so add comments below if I’ve said anything stupid or wrong.]
May 9, 2008
I Still Haven’t Found What I’m Googling For
Twenty-one years ago this month, in May 1987, Irish rockers U2 released their classic Joshua Tree single, I Still Haven’t Found What I’m Looking For. Those twenty-one years have seen incredible technological change: the adoption of desktop computers, mobile phones, the birth of the Web and the widespread use of search engines like Google. So, with sincere apologies to Bono, The Edge, Adam and Larry, it’s time we updated the lyrics for the 21st century. I give you “I Still Haven’t Found What I’m Googling For” (21st anniversary, 2008 webby edition)… (more…)
May 1, 2008
April 25, 2008
WWW2008: The Great Firewall of China
The seventeenth international World Wide Web conference (WWW2008.org) is currently finishing in Beijing, China. There are some interesting papers this year. Thankfully, the Great Firewall of China doesn’t prevent these papers reaching the rest of the world. It’s One World, One Web (allegedly). Here are some brief highlights from the conference. (more…)
April 14, 2008
Ensemblog: The Ensembl Weblog
The Ensembl Weblog provides news, views and announcements about the Ensembl Genome Browser. The blog has been going for a few years now, but I have only just become aware of it thanks to a recent Ensembl Genome Browser Tutorial by Bert Overduin. Catching up on posts from Ensemblians this year, Ewan Birney wrote a piece about The Gene Love-in last week and Paul Flicek briefly described the 1000 Genomes project back in January. The Ensembl Weblog is fairly low traffic, so if you don’t already read it, it’s worth considering subscribing to the feed.
And it’s good to see more scientists using blogs to communicate. Long may this trend continue!


