February 5, 2010

Classic paper: Montagues and Capulets in Science

In preparation for a joint seminar I’ll be doing with Midori Harris here at the EBI, here’s a classic paper [1,2] on the social problems of building biomedical ontologies. This paper is worth reading (or re-reading) because it makes lots of relevant points about the use and abuse of research and how people misunderstand each other [3]. It’s funny and available Open Access too. Besides, how many papers do you read with an abstract written in the style of Big Bard Bill Shakespeare?

ABSTRACT:

Two households, both alike in dignity,
In fair Genomics, where we lay our scene,
(One, comforted by its logic’s rigour,
Claims ontology for the realm of pure,
The other, with blessed scientist’s vigour,
Acts hastily on models that endure),
From ancient grudge break to new mutiny,
When ‘being’ drives a fly-man to blaspheme.
From forth the fatal loins of these two foes,
Researchers to unlock the book of life;
Whole misadventured piteous overthrows,
Can with their work bury their clans’ strife.
The fruitful passage of their GO-mark’d love,
And the continuance of their studies sage,
Which, united, yield ontologies undreamed-of,
Is now the hour’s traffic of our stage;
The which if you with patient ears attend,
What here shall miss, our toil shall strive to mend.

So if you read the paper, you have to ask yourself, are you a Montague or a Capulet?


  1. Carole Goble and Chris Wroe (2004). The Montagues and the Capulets. Comparative and Functional Genomics, 5 (8), 623-632 DOI: 10.1002/cfg.442
  2. Carole Goble (2004). The Capulets and Montagues: A plague on both your houses? SOFG: Standards and Ontologies for Functional Genomics
  3. William Shakespeare (1596) Romeo and Juliet

[Romeo and Juliet picture via Happy Hippo Snacks]

January 11, 2010

Abscisic Acid: Entity of the Month

Happy New Year from the ChEBI team. Release 64 is now available, containing 534,142 total entities, of which 19,645 are annotated entities and 693 were submitted via the ChEBI submission tool. This month’s entity of the month is abscisic acid.

(+)-Abscisic acid (CHEBI:2365), known commonly just as abscisic acid or ABA, is a ubiquitous isoprenoid plant hormone which is synthesized in the methylerythritol phosphate (MEP) pathway (also known as the non-mevalonate pathway) by cleavage of C40 carotenoids.

First identified and characterised in 1963 by Frederick Addicott and his associates at the University of California, Davis [1], ABA was originally believed to play a major role in abscission of fruits (hence its early name of ‘abscisin II’). This is now known to be true for only a small number of plants, a wider role being to act as a regulator of plant responses to a variety of environmental stresses such as drought, extremes of temperature, and high salinity. Such responses include stimulating the closure of stomata, inhibiting shoot growth while not affecting root growth, and inducing seeds to synthesise storage proteins.

Because of its essential function in plant physiology, targeting the ABA signalling pathway holds considerable promise for future applications in agriculture. Now, in a recent issue of Nature, Ning Zheng and his co-worker Laura Sheard from the University of Washington summarise recent converging studies which reveal the details of how ABA transmits its message [2]. In particular, an article by an international team led by Eric Xu of the Van Andel Research Institute describes how their crystallographic work on unbound ABA and ABA bound to some of its receptors, together with extensive biochemical studies from elsewhere, identify a conserved gate–latch–lock mechanism underlying ABA signalling [3].


  1. Ohkuma, K., Lyon, J., Addicott, F., & Smith, O. (1963). Abscisin II, an Abscission-Accelerating Substance from Young Cotton Fruit. Science, 142 (3599), 1592-1593 DOI: 10.1126/science.142.3599.1592
  2. Sheard, L., & Zheng, N. (2009). Plant biology: Signal advance for abscisic acid. Nature, 462 (7273), 575-576 DOI: 10.1038/462575a
  3. Melcher, K., Ng, L., Zhou, X., Soon, F., Xu, Y., Suino-Powell, K., Park, S., Weiner, J., Fujii, H., Chinnusamy, V., Kovach, A., Li, J., Wang, Y., Li, J., Peterson, F., Jensen, D., Yong, E., Volkman, B., Cutler, S., Zhu, J., & Xu, H. (2009). A gate–latch–lock mechanism for hormone signalling by abscisic acid receptors. Nature, 462 (7273), 602-608 DOI: 10.1038/nature08613

[CC-licensed picture of sweetgum bud by Martin LaBar]

December 5, 2009

Adrenaline: Entity of the Month

December’s entity of the month at ChEBI is Adrenaline, for all the adrenaline junkies out there. This accompanies ChEBI release 63, containing 536,978 total entities, of which 19,501 are annotated entities and 678 were submitted via the ChEBI submission tool. Text reproduced below from the ChEBI website:

Adrenaline (CHEBI:33568), also known as epinephrine, is a catecholamine that acts as a hormone and neurotransmitter.

It was first isolated from an extract of the suprarenal (adrenal) gland as its mono-benzoyl derivative by the American biochemist and pharmacologist John Jacob Abel in 1899 [1], who later also crystallised it as a hydrate. The pure compound was produced in 1901 by the Japanese industrial chemist Jokichi Takamine [2] and patented as ‘Adrenalin’. Two chemists, Stolz and Dakin, independently reported the synthesis of the compound in 1904 [3,4].

Adrenaline is a potent ‘fight-or-flight’ hormone, which is produced in stress situations. When produced in the body, it leads to an increase in heart rate, vasodilation and the supply of both glucose and oxygen to the muscles and the brain, thus preparing the body for rapid action if needed. The increase in glucose supply is achieved through the binding of adrenaline to β-adrenergic receptors in the liver. This triggers the adenylate cyclase pathway, which, in turn, leads to increased glycogenolysis activity. On the other hand, adrenaline suppresses both digestive processes and immune responses. As such, it can be used in the treatment of anaphylactic shock [5] as well as for the treatment of cardiac arrest and cardiac dysrhythmias [6].

The biosynthesis of adrenaline is regulated by the central nervous system. It is ultimately derived from L-tyrosine, which is converted into L-dihydroxyphenylalanine (L-DOPA) by the action of tyrosine 3-monooxygenase (EC …). Adrenaline is then produced through the successive conversion of L-DOPA into dopamine, dopamine into noradrenaline, and noradrenaline into adrenaline itself.


  1. Abel, J.J. (1899) Ueber den blutdruckerregenden Bestandtheil der Nebenniere, das Epinephrin. Z. Physiol. Chem. 18, 318–324.
  2. Takamine, J., (1902) The isolation of the active principle of the suprarenal gland. J. Physiol. 27 (Suppl), xxix–xxx.
  3. Stolz, F. (1904) Ueber Adrenalin und Alkylaminoacetobrenzkatechin. Ber. Dtsch. Chem. Ges. 37, 4149–4154.
  4. Dakin, H.D. (1905) The synthesis of a substance allied to noradrenaline. Proc. Roy. Soc. Lon. Ser. B 76, 491–497.
  5. Anchor, J. (2004). Appropriate use of epinephrine in anaphylaxis. The American Journal of Emergency Medicine, 22 (6), 488-490 DOI: 10.1016/j.ajem.2004.07.016
  6. Rainer TH, & Robertson CE (1996). Adrenaline, cardiac arrest, and evidence based medicine. Journal of accident & emergency medicine, 13 (4), 234-7 PMID: 8832338

[CC licensed picture of dan wakeham pipe by jeffcapeshop]

November 5, 2009

Artemether: Entity of the Month

November’s entity of the month at ChEBI is the antimalarial drug Artemether. This accompanies release 62 of ChEBI, which is not just another incremental release but an increase of more than twentyfold in the number of entities in ChEBI, thanks to merging of data between an updated ChEBI [1] and ChEMBL [2]. As of release 62, ChEBI has over 455,000 total entities, compared to just under 19,000 in the previous version (release 61); see ChEBI news for details. The text below on Artemether is reproduced from the ChEBI website, where content is available under a Creative Commons license:

Artemether (CHEBI:195280) is a lipid-soluble antimalarial for the treatment of multi-drug resistant strains of Plasmodium falciparum malaria. First prepared in 1979 [3], it is a methyl ether of the naturally occurring sesquiterpene lactone (+)-artemisinin, which is isolated from the leaves of Artemisia annua L. (sweet wormwood), the traditional Chinese medicinal herb known as Qinghao. However, because of artemether’s extremely rapid mode of action (it has an elimination half-life of only 2 hours, being metabolized to dihydroartemisinin which then undergoes rapid clearance), it is used in combination with other, longer-acting, drugs. One such combination, licensed in April of this year by the WHO, is Coartem in which the artemether is mixed with lumefantrine – a racemic mixture of a synthetic fluorene derivative known formerly as benflumetol – which has a much longer and pharmacologically complementary terminal half-life of 3–6 days, allowing the two drugs to act synergistically against Plasmodium.

The molecule of artemether is interesting because of its extreme rigidity, with very few rotatable bonds. Unlike quinine-class antimalarial drugs, it has no nitrogen atom in its skeleton. However, an important chemical feature (and unique in drugs) is the presence of an O–O endoperoxide bridge which is essential for its antimalarial activity, as it is this bridge which is split in an interaction with heme, blocking the conversion into hemozoin and thus releasing into the parasite heme and a host of free radicals which attack the cell membrane.

Artemether is fully Rule-of-Five compliant and has recently also been under investigation as a possible candidate for cancer treatment [4,5].
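For readers unfamiliar with the term, Lipinski's Rule of Five flags compounds likely to have poor oral absorption if they exceed any of four thresholds: molecular weight over 500, calculated logP over 5, more than 5 hydrogen-bond donors, or more than 10 hydrogen-bond acceptors. A minimal sketch of such a check follows. The function name `rule_of_five_violations` is made up for illustration, and the property values are assumptions rather than figures from the ChEBI entry: the molecular weight and the donor/acceptor counts are derived from artemether's formula C16H26O5 (no O-H or N-H groups, five oxygen atoms), while the logP is a placeholder and not a measured value.

```python
def rule_of_five_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Return the list of Lipinski criteria that a compound violates."""
    violations = []
    if mol_weight > 500:
        violations.append("molecular weight > 500")
    if log_p > 5:
        violations.append("logP > 5")
    if h_donors > 5:
        violations.append("more than 5 hydrogen-bond donors")
    if h_acceptors > 10:
        violations.append("more than 10 hydrogen-bond acceptors")
    return violations


# Assumed property values for artemether (C16H26O5), for illustration only.
artemether = {
    "mol_weight": 298.4,  # computed from the molecular formula
    "log_p": 3.5,         # placeholder assumption, not a measured value
    "h_donors": 0,        # no O-H or N-H groups
    "h_acceptors": 5,     # five oxygen atoms
}

violations = rule_of_five_violations(**artemether)
print("fully compliant" if not violations else violations)
```

An empty list corresponds to "fully compliant" in the sense used above; many practitioners tolerate a single violation, which is why the function reports the violations instead of returning a plain yes or no.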



  1. de Matos, P., Alcantara, R., Dekker, A., Ennis, M., Hastings, J., Haug, K., Spiteri, I., Turner, S., & Steinbeck, C. (2009). Chemical Entities of Biological Interest: an update. Nucleic Acids Research DOI: 10.1093/nar/gkp886
  2. Warr, W. (2009). ChEMBL. An interview with John Overington, team leader, chemogenomics at the European Bioinformatics Institute Outstation of the European Molecular Biology Laboratory (EMBL-EBI). Journal of Computer-Aided Molecular Design, 23 (4), 195-198 DOI: 10.1007/s10822-009-9260-9
  3. Li, Y. et al. (1979) K’o Hsueh T’ung Pao, 24, 667 [Chem. Abstr., 91, 211376u].
  4. Singh, N., & Panwar, V. (2006). Case Report of a Pituitary Macroadenoma Treated With Artemether. Integrative Cancer Therapies, 5 (4), 391-394 DOI: 10.1177/1534735406295311
  5. Wu, Z., Gao, C., Wu, Y., Zhu, Q., Yan Chen, Xin Liu, & Chuen Liu (2009). Inhibitive Effect of Artemether on Tumor Growth and Angiogenesis in the Rat C6 Orthotopic Brain Gliomas Model. Integrative Cancer Therapies, 8 (1), 88-92 DOI: 10.1177/1534735408330714

June 4, 2009

Improving the OBO Foundry Principles

The Open Biomedical Ontologies (OBO) are a set of reference ontologies for describing all kinds of biomedical data; see [1-5] for examples. Every year, users and developers of these ontologies gather from around the globe for a workshop at the EBI near Cambridge, UK. Following on from the first workshop last year, the second OBO workshop is fast approaching.

In preparation, I’ve been revisiting the OBO Foundry documentation, part of which establishes a set of principles for ontology development. I’m wondering how they could be improved because these principles are fundamental to the whole effort. We’ve been using one of the OBO ontologies, Chemical Entities of Biological Interest (ChEBI), in the REFINE project to mine data from the PubMed database. OBO ontologies like ChEBI and the Gene Ontology are crucial to making sense of the massive data sets which are now common in biology and medicine, so this is stuff that matters.

The OBO Foundry Principles, a sort of Ten Commandments of Ontology (or Obology if you prefer), currently look something like this (copied directly from obofoundry.org/crit.shtml):

  1. The ontology must be open and available to be used by all without any constraint other than (a) its origin must be acknowledged and (b) it is not to be altered and subsequently redistributed under the original name or with the same identifiers. The OBO ontologies are for sharing and are resources for the entire community. For this reason, they must be available to all without any constraint or license on their use or redistribution. However, it is proper that their original source is always credited and that after any external alterations, they must never be redistributed under the same name or with the same identifiers.
  2. The ontology is in, or can be expressed in, a common shared syntax. This may be either the OBO syntax, extensions of this syntax, or OWL. The reason for this is that the same tools can then be usefully applied. This facilitates shared software implementations. This criterion is not met in all of the ontologies currently listed, but we are working with the ontology developers to have them available in a common OBO syntax.
  3. The ontology possesses a unique identifier space within the OBO Foundry. The source of a term (i.e. class) from any ontology can be immediately identified by the prefix of the identifier of each term. It is, therefore, important that this prefix be unique.
  4. The ontology provider has procedures for identifying distinct successive versions.
  5. The ontology has a clearly specified and clearly delineated content. The ontology must be orthogonal to other ontologies already lodged within OBO. The major reason for this principle is to allow two different ontologies, for example anatomy and process, to be combined through additional relationships. These relationships could then be used to constrain when terms could be jointly applied to describe complementary (but distinguishable) perspectives on the same biological or medical entity. As a corollary to this, we would strive for community acceptance of a single ontology for one domain, rather than encouraging rivalry between ontologies.
  6. The ontologies include textual definitions for all terms. Many biological and medical terms may be ambiguous, so terms should be defined so that their precise meaning within the context of a particular ontology is clear to a human reader.
  7. The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.
  8. The ontology is well documented.
  9. The ontology has a plurality of independent users.
  10. The ontology will be developed collaboratively with other OBO Foundry members.
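Principle 3 in the list above lends itself to a small illustration. The sketch below is a minimal example and not part of any OBO Foundry tooling: it assumes identifiers take the `PREFIX:local_id` form seen in the ChEBI identifiers quoted on this blog (such as CHEBI:2365), and both the function names and the tiny `KNOWN_PREFIXES` table are hypothetical, made up for illustration.

```python
# Hypothetical lookup table for illustration only; a real one would come
# from the OBO Foundry's own list of registered ontologies.
KNOWN_PREFIXES = {
    "CHEBI": "Chemical Entities of Biological Interest",
    "GO": "Gene Ontology",
}


def split_obo_id(identifier: str) -> tuple[str, str]:
    """Split an identifier of the assumed form PREFIX:local_id."""
    prefix, sep, local_id = identifier.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a PREFIX:local_id identifier: {identifier!r}")
    return prefix, local_id


def source_ontology(identifier: str) -> str:
    """Name the ontology a term comes from, judged by its prefix alone."""
    prefix, _ = split_obo_id(identifier)
    return KNOWN_PREFIXES.get(prefix, "unknown ontology")


if __name__ == "__main__":
    for term in ["CHEBI:2365", "CHEBI:33568", "CHEBI:2679"]:
        print(term, "comes from", source_ontology(term))
```

The lookup only works if no two ontologies share a prefix, which is exactly why the principle insists that the prefix be unique.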

I’ve been asking all my frolleagues what they think of these principles and have got some lively responses, including some here from Allyson Lister, Mélanie Courtot, Michel Dumontier and Frank Gibson. So what do you think? How could these guidelines be improved? Do you have any specific (and preferably constructive) criticisms of these ambitious (and worthy) goals? Be bold, be brave and be polite. If you have anything controversial or “off the record”, you can email it to me… I’m all ears.

CC-licensed picture above of the Old Smithy (pub) by Loop Oh. Inspired by Michael Ashburner’s standing OBO joke (Ontolojoke), which goes something like this: because Barry Smith is one of the leaders of OBO, should the project be called the OBO Smithy or the OBO Foundry? 🙂


  1. Noy, N., Shah, N., Whetzel, P., Dai, B., Dorf, M., Griffith, N., Jonquet, C., Rubin, D., Storey, M., Chute, C., & Musen, M. (2009). BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research DOI: 10.1093/nar/gkp440
  2. Côté, R., Jones, P., Apweiler, R., & Hermjakob, H. (2006). The Ontology Lookup Service, a lightweight cross-platform tool for controlled vocabulary queries. BMC Bioinformatics, 7 (1) DOI: 10.1186/1471-2105-7-97
  3. Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg, L., Eilbeck, K., Ireland, A., Mungall, C., Leontis, N., Rocca-Serra, P., Ruttenberg, A., Sansone, S., Scheuermann, R., Shah, N., Whetzel, P., & Lewis, S. (2007). The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nature Biotechnology, 25 (11), 1251-1255 DOI: 10.1038/nbt1346
  4. Smith, B., Ceusters, W., Klagges, B., Köhler, J., Kumar, A., Lomax, J., Mungall, C., Neuhaus, F., Rector, A., & Rosse, C. (2005). Relations in biomedical ontologies. Genome Biology, 6 (5) DOI: 10.1186/gb-2005-6-5-r46
  5. Bada, M., & Hunter, L. (2008). Identification of OBO nonalignments and its implications for OBO enrichment. Bioinformatics, 24 (12), 1448-1455 DOI: 10.1093/bioinformatics/btn194

May 6, 2009

Michel Dumontier on Representing Biochemistry

Michel Dumontier is visiting Manchester this week and will be giving a seminar on Monday 11th May. Here are some details for anyone who is interested in attending:

Title: Increasingly Accurate Representation of Biochemistry

Speaker: Michel Dumontier, dumontierlab.com

Time: 14.00, Monday 11th May 2009
Venue: Atlas 1, Kilburn Building, University of Manchester, number 39 on the Google Campus Map

Abstract: Biochemical ontologies aim to capture and represent biochemical entities and the relations that exist between them in an accurate manner. A fundamental starting point is biochemical identity, but our current approach for generating identifiers is haphazard and consequently integrating data is error-prone. I will discuss plausible structure-based strategies for biochemical identity whether it be at molecular level or some part thereof (e.g. residues, collection of residues, atoms, collection of atoms, functional groups) such that identifiers may be generated in an automatic and curator/database independent manner. With structure-based identifiers in hand, we will be in a position to more accurately capture context-specific biochemical knowledge, such as how a set of residues in a binding site are involved in a chemical reaction including the fact that a key nitrogen atom must first be de-protonated. Thus, our current representation of biochemical knowledge may improve such that manual and automatic methods of biocuration are substantially more accurate.

Update: Slides are now available via SlideShare.

[Creative Commons licensed picture of Michel in action at ISWC 2008 from Tom Heath]



April 17, 2009

The Unreasonable Effectiveness of Google

Via the Official Google Research Blog at the University of Google, Alon Halevy, Peter Norvig and Fernando Pereira have published an interesting expert opinion piece in the March/April 2009 edition of IEEE Intelligent Systems: computer.org/intelligent. The paper talks about embracing complexity and making use of “the unreasonable effectiveness of data” [1], drawing analogies with the “unreasonable effectiveness of mathematics” [2]. There is plenty to agree and disagree with in this provocative article, which makes it an entertaining read. So what can we learn from those expert Googlers in the Googleplex?

March 16, 2009

October 27, 2008

OWL Experiences and Directions (OWLED) 2008

The Web Ontology Language (OWL) is a language for creating ontologies on the Web. It does exactly what it says on the tin. But what is an ontology? One way to think of it is as a better way of storing data and knowledge. Instead of just capturing and describing data in a database, ontology languages like OWL provide ways to capture and describe knowledge in a knowledge base. Ontologies can allow more intelligent querying, integration and understanding of data than is possible using a plain old relational database.

Since 2003, developers and users of the Web Ontology Language, abbreviated to OWL (not WOL), have been gathering at a two-day workshop called OWLED (OWL Experiences and Directions). This year the workshop is in Karlsruhe in Germany. The full list of accepted papers is available. As in previous years, this year’s workshop has a distinctly biological flavour to the proceedings.

July 15, 2008

ChEBI, Oh ChEBI, Oh Baby!

Filed under: informatics — Duncan Hull @ 2:30 pm

With sincere apologies to Jamaican reggae singer-songwriter Eric Donaldson, “ChEBI, Oh ChEBI, Oh Baby, don’t you know I’m in need of thee”?

Chemical Entities of Biological Interest (ChEBI) is a dictionary, controlled vocabulary, database and ontology of small (low molecular-weight) chemical entities that are considered to be biologically interesting, such as amphetamine (CHEBI:2679). After a couple of recent meetings, ChEBI is going through some serious revision to make it more efficient to maintain and use. Here are some brief notes on the changes, mostly for my own benefit and to collate links, but perhaps others are interested too.

Janna Hastings has created a wiki for the New ChEBI Ontology project which includes links to the ChEBI ontology remediation notes and meeting notes from 9th July 2008.
