June 3, 2010

The smell of baking and toasting bread: Entity of the Month

Filed under: ChEBI — Duncan Hull @ 7:47 am

Release 69 of Chemical Entities of Biological Interest (ChEBI) is now available, with 584,456 total entities, of which 21,369 are fully annotated to three-star level. This month’s Entity of the Month is the smell of bread (baked and toasted), or more precisely 6-acetyl-2,3,4,5-tetrahydropyridine. The text below is reproduced from the ChEBI website, where data is available under a Creative Commons license.

Chemistry, like most other fields of human endeavour, has a tremendous capacity for both good and evil. However, arguably one of the best and most delightful reactions in chemistry is the Maillard reaction.

It occurs when amino acids are heated together with sugar and is therefore a prominent reaction when baking bread or brewing beer: many of the reaction products provide the characteristic flavours of these foods, which we all enjoy so much. While the chemical structures and identities of most of the products of this form of “non-enzymatic browning” are only poorly characterised or unknown, our Entity of the Month 6-acetyl-2,3,4,5-tetrahydropyridine (CHEBI:59533) is an exception. It is a well-known aroma compound, which is responsible for the flavour of white bread, popcorn and tortillas and has an extremely low odour threshold, between 0.02 and 0.06 ng l⁻¹ [1]. It exists in a tautomeric equilibrium with 6-acetyl-1,2,3,4-tetrahydropyridine, the two forms usually occurring in foods in a 1:2 ratio.
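To get a feel for how low that odour threshold is, here is a rough back-of-the-envelope conversion from ng per litre to molar concentration and molecules per litre. It is only a sketch: the molar mass of about 125.17 g/mol is my own figure, worked out from the formula C7H11NO rather than taken from the ChEBI text, and the function names are made up for illustration.

```python
# Rough unit conversion for the quoted odour threshold (0.02 to 0.06 ng per litre).
# ASSUMPTION: molar mass of 6-acetyl-2,3,4,5-tetrahydropyridine taken as
# ~125.17 g/mol (formula C7H11NO); this value is not stated in the post.

AVOGADRO = 6.02214076e23          # molecules per mole
MOLAR_MASS_G_PER_MOL = 125.17     # assumed, see note above


def ng_per_litre_to_mol_per_litre(ng_per_l, molar_mass=MOLAR_MASS_G_PER_MOL):
    """Convert a mass concentration in ng/l to a molar concentration in mol/l."""
    grams_per_litre = ng_per_l * 1e-9
    return grams_per_litre / molar_mass


def molecules_per_litre(ng_per_l, molar_mass=MOLAR_MASS_G_PER_MOL):
    """Number of molecules in one litre at the given mass concentration."""
    return ng_per_litre_to_mol_per_litre(ng_per_l, molar_mass) * AVOGADRO


for threshold in (0.02, 0.06):
    molar = ng_per_litre_to_mol_per_litre(threshold)
    count = molecules_per_litre(threshold)
    print(f"{threshold} ng/l is about {molar:.2e} mol/l, or {count:.2e} molecules per litre")
```

Under these assumptions the threshold sits in the sub-picomolar range, which is why such small amounts of the compound are enough to notice.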

The compound can be synthesized in a simple three-step procedure. In a first step, BOC-protected 2-piperidone is treated with 1-ethoxy-1-lithioethene in a bid to build up the acetyl side-chain. This results in ring opening and the formation of a linear ketone which, after treatment with toluene-p-sulfonic acid, reforms the heterocycle in the form of an ene-carbamate. Treatment of the latter with potassium hydroxide yields the final product [1].

The Maillard reaction is named after Louis Camille Maillard, a precocious French physiologist, who first described it in the 1910s. Maillard is also known for his contributions towards the diagnosis of kidney disorders.

The image at top right, taken from the Wellcome Trust Image Collection, shows freshly toasted bread. The brown colour indicates that the Maillard reaction, a form of non-enzymatic browning, has taken place.


  1. Harrison, T., & Dake, G. (2005). An Expeditious, High-Yielding Construction of the Food Aroma Compounds 6-Acetyl-1,2,3,4-tetrahydropyridine and 2-Acetyl-1-pyrroline. The Journal of Organic Chemistry, 70 (26), 10872-10874. DOI: 10.1021/jo051940a

May 27, 2010

The 2nd ChEBI workshop: Call for Participation

The second Chemical Entities of Biological Interest (ChEBI) workshop will be held at the European Bioinformatics Institute (EBI) in Hinxton, Cambridgeshire, UK on the 23rd and 24th June 2010. The full provisional schedule (including registration page) for this workshop is now available. Speakers at the workshop include:

There will also be several discussion sessions on the future evolution of the ChEBI project. Training will be provided including using ChEBI for research purposes and submitting your chemicals to ChEBI for annotation. We (the ChEBI team) hope to welcome you to Hinxton in June.


  1. Image of The NanoPutians taken from: Chanteau, S., & Tour, J. (2003). Synthesis of Anthropomorphic Molecules: The NanoPutians. The Journal of Organic Chemistry, 68 (23), 8750-8766. DOI: 10.1021/jo0349227

May 6, 2010

Mephedrone: Entity of the Month

Release 68 of Chemical Entities of Biological Interest (ChEBI) is now available, with 549,319 total entities, of which 21,075 are fully annotated. This month’s Entity of the Month is mephedrone, a substance which has been in the news headlines lately and, as Wikipedia points out, is “not to be confused with Methedrine, Methedrone, Methadone, or Methylone”. Don’t you just love chemical names?! Text below reproduced from the ChEBI website:

“Mephedrone (CHEBI:59331) is a synthetic central nervous system stimulant and entactogen drug chemically related to cathinone, the psychoactive alkaloid present in the khat plant (Catha edulis, family Celastraceae).

It can be synthesised from 4-methylpropiophenone by an initial bromination at the β-carbon followed by replacement of the bromine by a methylamino group derived from methylamine hydrochloride. Although it was probably not available until 2007, by 2009 mephedrone had become the fourth most popular street drug in the UK, behind cannabis, cocaine and ecstasy. Little is currently known regarding its pharmacology or toxicology, although one recent report suggests the likelihood that it stimulates the release of, and then inhibits the reuptake of, monoamine neurotransmitters [1].

Although already listed as a prohibited substance in many countries, in others it has varying degrees of legality (notably the USA where it is currently unscheduled under the Controlled Substances Act). In the UK, a decision by the Home Secretary to classify mephedrone as illegal caused the resignation of two members of the Advisory Council on the Misuse of Drugs (ACMD) which has led in turn to a general questioning of UK drugs policy. Mephedrone finally became classified as a Class B drug in the UK on April 16, 2010 – prior to this time it was often sold openly under the guise of a ‘plant food’ (although having no known use as such).”


  1. Winstock, A., Marsden, J., & Mitcheson, L. (2010). What should be done about mephedrone? BMJ, 340:c1605 DOI: 10.1136/bmj.c1605

[Creative Commons licensed picture ‘Overdosed’ by Elad Rahmin on flickr]

April 8, 2010

8-OHdG: Entity of the Month

Chemical Entities of Biological Interest (ChEBI) release 67 is now available, containing 548,850 total entities, of which 20,565 are annotated entities and 720 were submitted via the ChEBI submission tool. New in this release, the ChEBI ontology is now available in the Web Ontology Language (OWL), which is part of an ongoing research project to automate the classification of small molecules in ChEBI. If you’re using this data, we’d like to hear from you! This month’s Entity of the Month is 8-OHdG. Text below reproduced from the ChEBI website:

8-Hydroxy-2′-deoxyguanosine (8-OHdG, CHEBI:40304) is an important molecule in oxidative stress, used as a biomarker of many processes involving reactive oxygen species. Also known as 8-oxo-dG (an abbreviation derived from its tautomeric name 8-oxo-7,8-dihydro-2′-deoxyguanosine) and as HMDB03333 in the Human Metabolome Database [1], it has been used especially as a sensitive marker of the DNA damage caused by hydroxyl radical attack at C-8 of guanine. This damage, if left unrepaired, has been proposed to contribute to mutagenicity and cancer promotion [2]. The use of 8-OHdG as a biomarker for DNA damage extends over a wide range of scenarios [3,4,5,6], because it is one of the major products of DNA oxidation.

More recent work by Junko Fujihara and his colleagues at Shimane University in Japan has demonstrated how 8-OHdG can be used as a possible marker for arsenic poisoning, which has since antiquity been a frequent method of dispatch in homicide and suicide cases [7]. Fujihara’s study, however, focuses principally on the use of arsenic in medicine, and specifically on demonstrating a relationship between concentrations of 8-OHdG and various arsenic compounds in the urine of a patient with acute promyelocytic leukaemia being treated with arsenic trioxide. Their conclusion that 8-OHdG in urine can be used therapeutically as a key biomarker for arsenic compounds may also find application in the diagnosis of arsenic poisoning arising from the consumption of seafood such as fish, shrimp, oysters and seaweeds, organisms known to contain appreciable amounts of arsenic compounds.

[Picture of Alex Bateman’s DNA origami in action from The Wellcome Trust Sanger Institute.]


  1. Wishart, D., Knox, C., Guo, A., Eisner, R., Young, N., Gautam, B., Hau, D., Psychogios, N., Dong, E., Bouatra, S., Mandal, R., Sinelnikov, I., Xia, J., Jia, L., Cruz, J., Lim, E., Sobsey, C., Shrivastava, S., Huang, P., Liu, P., Fang, L., Peng, J., Fradette, R., Cheng, D., Tzur, D., Clements, M., Lewis, A., De Souza, A., Zuniga, A., Dawe, M., Xiong, Y., Clive, D., Greiner, R., Nazyrova, A., Shaykhutdinov, R., Li, L., Vogel, H., & Forsythe, I. (2009). HMDB: a knowledgebase for the human metabolome. Nucleic Acids Research, 37 (Database). DOI: 10.1093/nar/gkn810
  2. Kuchino, Y., Mori, F., Kasai, H., Inoue, H., Iwai, S., Miura, K., Ohtsuka, E., & Nishimura, S. (1987). Misreading of DNA templates containing 8-hydroxydeoxyguanosine at the modified base and at adjacent residues. Nature, 327 (6117), 77-79. DOI: 10.1038/327077a0
  3. Wu LL, Chiou CC, Chang PY, & Wu JT (2004). Urinary 8-OHdG: a marker of oxidative stress to DNA and a risk factor for cancer, atherosclerosis and diabetics. Clinica chimica acta; international journal of clinical chemistry, 339 (1-2), 1-9 PMID: 14687888
  4. Schriner, S. (2005). Extension of Murine Life Span by Overexpression of Catalase Targeted to Mitochondria. Science, 308 (5730), 1909-1911. DOI: 10.1126/science.1106653
  5. Sumida S, Doi T, Sakurai M, Yoshioka Y, & Okamura K (1997). Effect of a single bout of exercise and beta-carotene supplementation on the urinary excretion of 8-hydroxy-deoxyguanosine in humans. Free radical research, 27 (6), 607-18 PMID: 9455696
  6. Tarng DC, Huang TP, Wei YH, Liu TY, Chen HW, Wen Chen T, & Yang WC (2000). 8-hydroxy-2′-deoxyguanosine of leukocyte DNA as a marker of oxidative stress in chronic hemodialysis patients. American journal of kidney diseases : the official journal of the National Kidney Foundation, 36 (5), 934-44 PMID: 11054349
  7. Fujihara, J., Agusa, T., Tanaka, J., Fujii, Y., Moritani, T., Hasegawa, M., Iwata, H., Tanabe, S., & Takeshita, H. (2009). 8-Hydroxy-2′-deoxyguanosine (8-OHdG) as a possible marker of arsenic poisoning: a clinical case study on the relationship between concentrations of 8-OHdG and each arsenic compound in urine of an acute promyelocytic leukemia patient being treated with a Forensic Toxicology, 27 (1), 41-44 DOI: 10.1007/s11419-008-0062-x

March 4, 2010

Sildenafil citrate: Entity of the Month

Release 66 of Chemical Entities of Biological Interest (ChEBI) is now available, containing 534,521 total entities, of which 20,151 are annotated entities and 698 were submitted via the ChEBI submission tool. This month’s Entity of the Month is Viagra, also known as sildenafil citrate. (Text below reproduced from the ChEBI website.)

Few chemical compounds are better known to the general public than sildenafil citrate (CHEBI:58987), traded under the name of “Viagra”. The compound was first synthesised by chemists working at Pfizer, with a view to using it for the treatment of hypertension and angina pectoris. Although it was found to be ineffective against angina in clinical trials, it was observed to induce penile erections and was therefore marketed by Pfizer as a drug for the treatment of erectile dysfunction.

A number of synthetic routes for the preparation of the parent sildenafil have been reported [1]. A common industrial synthetic route is through reaction of 4-amino-1-methyl-3-N-propylpyrazole-5-carboxamide and 2-ethoxy-5-(4-methylpiperazin-1-yl)sulfonylbenzoic acid followed by subsequent cyclisation to sildenafil through heating under acidic conditions.

Sildenafil has been shown to be an inhibitor of cyclic guanosine monophosphate specific phosphodiesterase type 5, an enzyme which is responsible for the degradation of 3′,5′-cyclic GMP (cyclic guanosine monophosphate, cGMP) in the corpus cavernosum. This leads to the presence of increased levels of cGMP, which, in turn causes vasodilation of the helicine arteries and thus increased blood flow into the spongy tissue of the penis [2].

Apart from the treatment of sexual dysfunction, sildenafil is also used in the treatment of pulmonary arterial hypertension and works again through relaxation of the arterial wall, which leads to a decrease in arterial resistance [3]. Furthermore – and arguably most interestingly – sildenafil has been found to decrease the time necessary for the re-entrainment of circadian rhythms after phase advances in the light–dark cycle (such as occur on transmeridian eastbound flights) in members of the Cricetidae family [4]*. The discovery was rewarded with the award of an Ig Nobel Prize in Aviation in 2007.

* Or, as Wikipedia puts it: “Viagra aids jet lag recovery in hamsters”. That’s an interesting side effect.


  1. Dunn, P. (2005). Synthesis of Commercial Phosphodiesterase(V) Inhibitors. Organic Process Research & Development, 9 (1), 88-97. DOI: 10.1021/op040019c
  2. Webb DJ, Freestone S, Allen MJ, & Muirhead GJ (1999). Sildenafil citrate and blood-pressure-lowering drugs: results of drug interaction studies with an organic nitrate and a calcium antagonist. The American journal of cardiology, 83 (5A) PMID: 10078539
  3. Richalet, J. (2004). Sildenafil Inhibits Altitude-induced Hypoxemia and Pulmonary Hypertension. American Journal of Respiratory and Critical Care Medicine, 171 (3), 275-281. DOI: 10.1164/rccm.200406-804OC
  4. Agostino, P., Plano, S., & Golombek, D. (2007). Sildenafil accelerates reentrainment of circadian rhythms after advancing light schedules. Proceedings of the National Academy of Sciences, 104 (23), 9834-9839. DOI: 10.1073/pnas.0703388104

[Creative Commons licensed picture of 30 St Mary Axe or the Gherkin – London by Patrick Mayon, see comments on this post at friendfeed]

February 25, 2010

Apache Maven: A Misbehavin’ Build Tool?

Filed under: ChEBI,programming,technology — Duncan Hull @ 11:00 am

One of the many tools we use in our team to manage the development of the ChEBI software is an automated build tool called Apache Maven. Opinions are often divided on whether Maven is a good or a bad thing. Most of them are very subjective, argumentative and often very long. See why does Maven have such a bad reputation? and 25 things* I hate about Maven for examples.

All this is fairly predictable, and I could add a few tales of Maven woe to the pile myself. But wondering if Maven is any good reminded me of something Bjarne Stroustrup [1,2,3] (one of the people behind the C++ programming language) once said in an article on the problem with programming:

“There are just two kinds of [programming] languages: the ones everybody complains about and the ones nobody uses.”

Actually, when you think about it, this applies to build systems too: there are two kinds. It also applies to just about any technology you care to name. You can crudely classify them all into two categories:

  1. Those technologies everybody complains about…
  2. … and the rest, that nobody uses.

So is Maven any good? Worth using? Worth the pain? It depends on who you ask. What we can say for sure is that, like many technologies, everybody complains about it.


  1. Bjarne Stroustrup (2010). Viewpoint: What should we teach new software developers? Why? Communications of the ACM, 53 (1). DOI: 10.1145/1629175.1629192
  2. Bjarne Stroustrup (2007). Evolving a language in and for the real world: C++ 1991-2006. Proceedings of the third ACM SIGPLAN conference on History of programming languages. DOI: 10.1145/1238844.1238848
  3. Bjarne Stroustrup (1993). A history of C++: 1979–1991. The second ACM SIGPLAN conference on History of programming languages. DOI: 10.1145/154766.155375

* Only 25? That seems like quite a short list to me.

[CC-licensed Chocolate Tools image by JanneM, some commentary on this post over at friendfeed.]

February 12, 2010

The 3rd OBO Foundry Workshop 2010, Cambridge, UK

The Open Biomedical Ontologies (OBO) [1] are a set of reference ontologies for describing all kinds of biomedical data, shared in a centralised repository called The OBO Foundry. Every year, users and developers of these ontologies gather from around the globe for a workshop at the EBI near Cambridge, UK. Following the first workshop two years ago and the second workshop last year, it’s already time for the third workshop on February 15th-16th. All the details and agenda are here if you’re interested. This workshop is possible thanks to sponsorship from the BBSRC funds for Workshop on Data Standards and the EU ELIXIR ‘Data Integration & Interoperability’ Package 7.

[Update: outcomes from the workshop are available here, along with a summary of discussion from Monday and a summary of discussion from Tuesday.]


  1. Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg, L., Eilbeck, K., Ireland, A., Mungall, C., Leontis, N., Rocca-Serra, P., Ruttenberg, A., Sansone, S., Scheuermann, R., Shah, N., Whetzel, P., & Lewis, S. (2007). The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nature Biotechnology, 25 (11), 1251-1255. DOI: 10.1038/nbt1346

[Ultrawide panoramic picture of the Wellcome Trust Genome Campus by Tim Nugent, as featured on the cover of the EMBL-EBI Annual Scientific Report 2009. Making those pictures looks like a lot of fun.]

January 21, 2010

Blogging a Book about Bio-Ontologies

If you wanted to write a guide to Biomedical and Biological Ontologies [1], especially the what, why, when, how, where and who, there are at least three choices for publishing your work:

  1. Journal publishing in your favourite scientific journal.
  2. Book publishing with your favourite academic or technical publisher.
  3. Self publishing on a web blog with your favourite blogging software.

Each of these has its own unique problems:

  • The trouble with journals is that they typically don’t publish “how to” guides, although you might be able to publish some kind of review.
  • The trouble with books, and academic books in particular, is that people (and machines) often don’t read them. Also, academic books can be prohibitively expensive to buy and this can make the data inside them less visible and accessible to the widest audience. Unfortunately all that lovely knowledge gets locked up behind publishers’ paywalls. To add insult to injury, most academic books take a very long time to publish, often several years. By the time of printing, the content of many academic books is often very dated.
  • The trouble with blogs is that they aren’t peer-reviewed in the traditional way and they tend to be written by a single person from a not very neutral point of view. Or as Dave once put it, “vanity publishing for arrogant people with an inflated ego”. Ouch.

So the people behind the Ontogenesis network (Robert Stevens and Phillip Lord, with funding from the EPSRC grant ref: EP/E021352/1) had an idea. Why not blog a book about ontology? As a publishing experiment, it might just work by combining the merits of books and blogs in order to overcome their shortcomings. This will involve getting a small group of about twenty people (mostly bio-ontologists) together, and writing about what an ontology is, why you would want a biomedical ontology, how to build one and so on. We will be doing some of the peer-review online too.

As part of this ongoing experiment, we are posting all this information on a blog at http://ontogenesis.knowledgeblog.org. If you’d like to follow along, subscribe to the feed and read the manifesto.


  1. Yu, A. (2006). Methods in biomedical ontology. Journal of Biomedical Informatics, 39 (3), 252-266. DOI: 10.1016/j.jbi.2005.11.006

[Ultrawide panoramic picture of Waterloo station by Tim Nugent]

January 15, 2010

Bio2RDF: Large Scale, Distributed Biological Knowledge Discovery

Filed under: ChEBI — Duncan Hull @ 2:11 pm

Michel Dumontier was visiting the EBI this week. Here are the details of his seminar, Bio2RDF and Beyond! Large Scale, Distributed Biological Knowledge Discovery (slides embedded below), for anyone interested who missed it:

Abstract: The Bio2RDF.org [1] project aims to transform silos of bioinformatics data into a distributed platform for biological knowledge discovery. Initial work focused on building a public database of open-linked data with web-resolvable identifiers that provides information about named entities. This involved a syntactic normalization to convert open data represented in a variety of formats (flatfile, tab, xml, web services) to RDF-based linked data with normalized names (HTTP URIs) and basic typing from source databases. Bio2RDF entities also make reference to other open linked data networks (e.g. dbPedia) thus facilitating traversal across information spaces. However, a significant problem arises when attempting to undertake more sophisticated knowledge discovery approaches such as question answering or symbolic data mining. This is because knowledge is represented in a fundamentally different manner, requiring one to know the underlying data model and reconcile the artefactual differences when they arise. In this talk, we describe our data integration strategy that makes use of both syntactic and semantic normalization to consistently marshal knowledge to a common data model while leveraging explicit logic-based mappings with community ontologies to further enhance the biological knowledgescope. Coupled with the web-service based Semantic Automated Discovery and Integration (SADI) framework, Bio2RDF is well placed to serve up biological data for prediction and analysis.
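To make the "syntactic normalization" step a bit more concrete, here is a minimal sketch of what converting a tab-delimited record into RDF triples with normalized HTTP URIs might look like. This is not Bio2RDF's actual code: the `http://bio2rdf.org/namespace:identifier` URI pattern, the `_vocabulary` type URIs, the column names and all function names are assumptions made purely for illustration.

```python
import csv
import io

# ASSUMPTION: a Bio2RDF-style URI pattern of the form <base><namespace>:<id>.
# The base, the "_vocabulary" suffix and the column names are illustrative only.
BASE = "http://bio2rdf.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def normalize_uri(namespace, identifier):
    """Build one canonical HTTP URI from a source namespace and a local ID."""
    return f"{BASE}{namespace.strip().lower()}:{identifier.strip()}"


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(tsv_text, namespace, type_name):
    """Turn tab-delimited rows with 'id' and 'name' columns into N-Triples lines."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    type_uri = f"{BASE}{namespace.strip().lower()}_vocabulary:{type_name}"
    triples = []
    for row in reader:
        subject = normalize_uri(namespace, row["id"])
        # Basic typing taken from the source database
        triples.append(f"<{subject}> <{RDF_TYPE}> <{type_uri}> .")
        # Human-readable name as a plain literal
        triples.append(f'<{subject}> <{RDFS_LABEL}> "{escape_literal(row["name"])}" .')
    return triples


# Hypothetical flat-file input, reusing two ChEBI entries mentioned on this blog
SAMPLE = "id\tname\n59533\t6-acetyl-2,3,4,5-tetrahydropyridine\n2365\t(+)-abscisic acid\n"

ntriples = rows_to_ntriples(SAMPLE, "CHEBI", "Compound")
for line in ntriples:
    print(line)
```

The point of the sketch is only that every record, whatever its source format, ends up keyed by the same kind of resolvable name. The harder semantic normalization described in the abstract (mapping to a common data model and community ontologies) is not attempted here.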

Some quick notes: Bio2RDF is currently indexing around 5 billion triples, and is built with the open source Virtuoso database. There are some scalability issues in making the system cope with up to a total of 15+ billion triples currently required. There is nothing in Bio2RDF yet that deals with the redundancy problem, e.g. “buggotea” and its friends.


  1. Belleau, F., Nolin, M., Tourigny, N., Rigault, P., & Morissette, J. (2008). Bio2RDF: Towards a mashup to build bioinformatics knowledge systems. Journal of Biomedical Informatics, 41 (5), 706-716. DOI: 10.1016/j.jbi.2008.03.004

January 11, 2010

Abscisic Acid: Entity of the Month

Happy New Year from the ChEBI team! Release 64 is now available, containing 534,142 total entities, of which 19,645 are annotated entities and 693 were submitted via the ChEBI submission tool. This month’s Entity of the Month is abscisic acid.

(+)-Abscisic acid (CHEBI:2365), known commonly just as abscisic acid or ABA, is a ubiquitous isoprenoid plant hormone which is synthesized in the methylerythritol phosphate (MEP) pathway (also known as the non-mevalonate pathway) by cleavage of C40 carotenoids.

First identified and characterised in 1963 by Fredrick Addicott and his associates at the University of California, Davis [1], ABA was originally believed to play a major role in abscission of fruits (hence its early name of ‘abscisin II’). This is now known to be true for only a small number of plants, a wider role being to act as a regulator of plant responses to a variety of environmental stresses such as drought, extremes of temperatures, and high salinity. Such responses include stimulating the closure of stomata, inhibiting shoot growth while not affecting root growth, and inducing seeds to synthesise storage proteins.

Because of its essential function in plant physiology, targeting the ABA signalling pathway holds considerable promise for future applications in agriculture. Now, in a recent issue of Nature, Ning Zheng and his co-worker Laura Sheard from the University of Washington summarise recent converging studies which reveal the details of how ABA transmits its message [2]. In particular, an article by an international team led by Eric Xu of the Van Andel Research Institute describes how their crystallographic work on unbound ABA and ABA bound to some of its receptors, together with extensive biochemical studies from elsewhere, identify a conserved gate–latch–lock mechanism underlying ABA signalling [3].


  1. Ohkuma, K., Lyon, J., Addicott, F., & Smith, O. (1963). Abscisin II, an Abscission-Accelerating Substance from Young Cotton Fruit. Science, 142 (3599), 1592-1593. DOI: 10.1126/science.142.3599.1592
  2. Sheard, L., & Zheng, N. (2009). Plant biology: Signal advance for abscisic acid. Nature, 462 (7273), 575-576. DOI: 10.1038/462575a
  3. Melcher, K., Ng, L., Zhou, X., Soon, F., Xu, Y., Suino-Powell, K., Park, S., Weiner, J., Fujii, H., Chinnusamy, V., Kovach, A., Li, J., Wang, Y., Li, J., Peterson, F., Jensen, D., Yong, E., Volkman, B., Cutler, S., Zhu, J., & Xu, H. (2009). A gate–latch–lock mechanism for hormone signalling by abscisic acid receptors. Nature, 462 (7273), 602-608. DOI: 10.1038/nature08613

[CC-licensed picture of sweetgum bud by Martin Labar]
