The 2nd ChEBI workshop: Call for Participation

The second Chemical Entities of Biological Interest (ChEBI) workshop will be held at the European Bioinformatics Institute (EBI) in Hinxton, Cambridgeshire, UK on the 23rd and 24th June 2010. The full provisional schedule (including registration page) for this workshop is now available. Speakers at the workshop include:

There will also be several discussion sessions on the future evolution of the ChEBI project. Training will be provided, including how to use ChEBI for research purposes and how to submit your chemicals to ChEBI for annotation. We (the ChEBI team) hope to welcome you to Hinxton in June.


DNA, Diversity and You at Cambridge Science Festival

As part of Cambridge Science Festival last weekend, I joined a group of about 40 volunteers from the Sanger and EBI at an event called “DNA, diversity and you”. This was a series of education and outreach events designed to explore how differences in your genetic code make you different from other individuals, and what makes humans different from other living things, with a bit of computational biology thrown in for good measure. Here are some notes on a selection of the activities, in case you ever find yourself trying to explain biology, computer science or bioinformatics to anyone aged 4-18 and beyond. These resources are all tried, tested and fun to work with, for students and teachers alike:

  1. DNA origami: create your own origami DNA molecule, a hands-on way of learning about the double-helix structure of DNA.
  2. DNA sequence bracelets (see picture right). Thread coloured beads according to sequence sections from a range of organisms including trout, chimpanzee, butterfly, a flesh-eating microbe and rotting corpse flower.
  3. Yummy gummy DNA (under 5s): build your own DNA helix out of sweets and cocktail sticks. Then scoff it all afterwards.
  4. What’s my name in DNA? Find out what your name is in DNA, and what the corresponding (hypothetical) protein is, using software from deCODE.
  5. Function Finders: translate DNA into a sequence of amino acids using wooden translator blocks, then find out which organism the amino acid sequence is from.
  6. Genome sizes (with seatbelts): rank organisms (inc. human, zebrafish, mosquito, sugar cane and yeast) by genome size and find out if they are in the right order. Results are often not what you would expect.
  7. Play your genes right. A card-based guessing game which compares the number of genes in the human genome with the number of genes from a range of different organisms including the flu virus, E. coli bacteria, armadillo, rice plant and others.
  8. Genome Jigsaws: illustrates the process of finishing supposedly “finished” genomes, by putting together a square sequence jigsaw following base pairing rules to end up with a complete finished square.
  9. DNA Time Team examines aspects of ancestry and evolution. The activity encourages people to work out the sequence of a common ancestor by filling in the gaps on a simple evolutionary tree.
  10. Spot the difference with proteins. Compares Heat Shock Protein (HSP) in humans and other organisms to illustrate how different regions of the protein vary between organisms and how this affects function.
  11. Ready, steady sort: a sorting network that demonstrates one technique that computers use to sort through large amounts of information like sequence data. This comes straight from Computer Science Unplugged by Tim Bell, Mike Fellows and Ian Witten. This activity can be done either as a smaller board game, or as a larger floor game. Either way, it’s a lot of fun, especially if you time people for an added competitive element (see video below).
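
For anyone who wants to try the sorting network from activity 11 on a computer rather than on the floor, here is a minimal Python sketch. It assumes a standard four-input network of five comparators, chosen for brevity, which is not necessarily the layout used at the festival. The names `COMPARATORS` and `sorting_network` are made up for illustration.

```python
from itertools import permutations

# Hypothetical four-input network: each pair (i, j) is one comparator
# that puts the smaller value on wire i and the larger value on wire j.
COMPARATORS = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]


def sorting_network(values, comparators=COMPARATORS):
    """Run the values through a fixed sequence of compare-and-swap steps."""
    wires = list(values)
    for i, j in comparators:
        if wires[i] > wires[j]:
            wires[i], wires[j] = wires[j], wires[i]
    return wires


# Check that the network sorts every possible ordering of four items.
assert all(sorting_network(p) == sorted(p) for p in permutations("ACGT"))

print(sorting_network(["T", "G", "A", "C"]))
```

The first two comparators, (0, 1) and (2, 3), touch different wires, so two pairs of people (or two processors) can make those comparisons at the same time. That is the parallel computing idea the activity is designed to get across.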

There were a whole bunch of new activities at the festival this year; maybe these will appear on the Your Genome website in the future. Anyway, it was great fun to get involved. There is nothing quite like the challenge of explaining parallel computing to young kids, teenagers and their parents, which is actually much easier than you’d think if you’ve got access to great teaching materials.

Thanks to Francesca Gale and Louisa Wright for all the hard work that went into organising this fun and successful event.

OBO Foundry workshop outcomes 2009

Well, I was going to blog about last week’s Open Biomedical Ontologies workshop, but Susanna-Assunta Sansone at the EBI has already done it via some very detailed minutes. See her notes for the:

  1. Overview
  2. Outcomes from day one
  3. Outcomes from day two

Thanks to the organisers of this workshop for hosting another well-run event. I’m only sorry I had to miss the delicious-looking dinner at Cotto in Cambridge (and entertaining company) on the last day… Hope to see you again next year.


Improving the OBO Foundry Principles

The Open Biomedical Ontologies (OBO) are a set of reference ontologies for describing all kinds of biomedical data, see [1-5] for examples. Every year, users and developers of these ontologies gather from around the globe for a workshop at the EBI near Cambridge, UK. Following on from the first workshop last year, the 2nd OBO workshop 2009 is fast approaching.

In preparation, I’ve been revisiting the OBO Foundry documentation, part of which establishes a set of principles for ontology development. I’m wondering how they could be improved, because these principles are fundamental to the whole effort. We’ve been using one of the OBO ontologies, Chemical Entities of Biological Interest (ChEBI), in the REFINE project to mine data from the PubMed database. OBO ontologies like ChEBI and the Gene Ontology are crucial to making sense of the massive data sets which are now common in biology and medicine, so this is stuff that matters.

The OBO Foundry Principles, a sort of Ten Commandments of Ontology (or Obology if you prefer), currently look something like this (copied directly from obofoundry.org/crit.shtml):

  1. The ontology must be open and available to be used by all without any constraint other than (a) its origin must be acknowledged and (b) it is not to be altered and subsequently redistributed under the original name or with the same identifiers. The OBO ontologies are for sharing and are resources for the entire community. For this reason, they must be available to all without any constraint or license on their use or redistribution. However, it is proper that their original source is always credited and that after any external alterations, they must never be redistributed under the same name or with the same identifiers.
  2. The ontology is in, or can be expressed in, a common shared syntax. This may be either the OBO syntax, extensions of this syntax, or OWL. The reason for this is that the same tools can then be usefully applied. This facilitates shared software implementations. This criterion is not met in all of the ontologies currently listed, but we are working with the ontology developers to have them available in a common OBO syntax.
  3. The ontologies possesses a unique identifier space within the OBO Foundry. The source of a term (i.e. class) from any ontology can be immediately identified by the prefix of the identifier of each term. It is, therefore, important that this prefix be unique.
  4. The ontology provider has procedures for identifying distinct successive versions.
  5. The ontology has a clearly specified and clearly delineated content. The ontology must be orthogonal to other ontologies already lodged within OBO. The major reason for this principle is to allow two different ontologies, for example anatomy and process, to be combined through additional relationships. These relationships could then be used to constrain when terms could be jointly applied to describe complementary (but distinguishable) perspectives on the same biological or medical entity. As a corollary to this, we would strive for community acceptance of a single ontology for one domain, rather than encouraging rivalry between ontologies.
  6. The ontologies include textual definitions for all terms. Many biological and medical terms may be ambiguous, so terms should be defined so that their precise meaning within the context of a particular ontology is clear to a human reader.
  7. The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.
  8. The ontology is well documented.
  9. The ontology has a plurality of independent users.
  10. The ontology will be developed collaboratively with other OBO Foundry members.
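
Principle 3 is easy to show in code. The sketch below assumes identifiers are written as a prefix, a colon and a local part, as in CHEBI:2679. The function name `split_obo_id` and the `SOURCES` lookup table are hypothetical and not part of any OBO tool.

```python
def split_obo_id(term_id):
    """Split an identifier such as 'CHEBI:2679' into (prefix, local_id)."""
    prefix, sep, local_id = term_id.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a prefixed identifier: {term_id!r}")
    return prefix, local_id


# Hypothetical lookup table from identifier prefix to source ontology.
SOURCES = {
    "CHEBI": "Chemical Entities of Biological Interest",
    "GO": "Gene Ontology",
}

prefix, local_id = split_obo_id("CHEBI:2679")
print(SOURCES[prefix], local_id)
```

Because the prefix alone decides which ontology a term came from, two ontologies sharing a prefix would make this lookup ambiguous, which is why the principle insists that each prefix be unique.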

I’ve been asking all my frolleagues what they think of these principles and have got some lively responses, including some here from Allyson Lister, Mélanie Courtot, Michel Dumontier and Frank Gibson. So what do you think? How could these guidelines be improved? Do you have any specific (and preferably constructive) criticisms of these ambitious (and worthy) goals? Be bold, be brave and be polite. If you have anything controversial or “off the record”, you can email it to me… I’m all ears.

Embracing Registries of Web Services

If you travel back in time, to around 2002, it isn’t difficult to find people claiming that Web services were going to be the new silver bullet technology to create world peace, eradicate global poverty and finally make some sense of all the data produced by the human genome project. Overhyped? Just a bit. One of the many reasons none of these things happened is that it turned out to be much harder than anticipated to build centralised registries, where people could go to find Web services to perform a given task. Can service registries ever be built? Critics like Tim Bray at Sun Microsystems, for example, have suggested that (quote) “registries are a fantasy”, but some already exist and there are more in the pipeline. This article briefly introduces some of them: Seekda, BioMOBY, the Embrace service registry and the BioCatalogue project.

PhD studentships at EMBL-EBI, UK

Any budding biomedical scientists out there, interested in doing a PhD, take note: The European Molecular Biology Laboratory (EMBL) – European Bioinformatics Institute (EBI) is having an open day on Monday 3rd November 2008. According to their website the EBI is “happy to welcome all Master students to this day”. Some talks at this open day include:

The EMBL-EBI lies in the 55 acres of landscaped parkland in rural Cambridgeshire that make up the Wellcome Trust Genome Campus. The Campus also houses the Wellcome Trust Sanger Institute, making it one of the world’s largest concentrations of expertise in genomics and bioinformatics. See also PhD Studies in Bioinformatics at the EBI. If you are interested in attending, sign up at the registration page before the 20th October.

See also PhD Opportunities at the Wellcome Trust Sanger Institute, Cambridge.

ChEBI, Oh ChEBI, Oh Baby!

With sincere apologies to Jamaican reggae singer-songwriter Eric Donaldson, “ChEBI, Oh ChEBI, Oh Baby, don’t you know I’m in need of thee”?

Chemical Entities of Biological Interest (ChEBI) is a dictionary, controlled vocabulary, database and ontology of small (low molecular-weight) chemical entities that are considered to be biologically interesting, such as amphetamine (CHEBI:2679). After a couple of recent meetings, ChEBI is going through some serious revision, to make it more efficient to maintain and use. Here are some brief notes on the changes, mostly for my own benefit and to collate links, but perhaps others are interested too.

Janna Hastings has created a wiki for the New ChEBI Ontology project which includes links to the ChEBI ontology remediation notes and meeting notes from 9th July 2008.

You Know OBO? Let’s GO!

According to their website “The Open Biomedical Ontologies (OBO) Foundry is a collaborative experiment involving developers of science-based ontologies who are establishing a set of principles for ontology development with the goal of creating a suite of orthogonal interoperable reference ontologies in the biomedical domain”. This week they are having a workshop in Cambridge. To bring myself up to speed, here is a quick name check of some of the people involved.

First ChEBI workshop, Day one

Some notes from day one of the first ChEBI workshop, 19th May 2008. There were four talks from Colin Batchelor (Royal Society of Chemistry), Ulrike Witting (EML Research GmbH, Heidelberg), Giles Weaver (Unilever) and Paula de Matos (EBI). Christoph Steinbeck has already written some ChEBI notes; these just add a little more detail.

BBC: Building a Better ChEBI

Chemical Entities of Biological Interest, ChEBI, is a freely available dictionary [1] of molecular entities, especially small chemical compounds. Like all big dictionaries and ontologies, it has its own unique challenges. Fortunately, those nice people at the EBI are holding a workshop to discuss future developments in ChEBI. In preparation for the workshop, here are some brief notes on how ChEBI could be made better. [Disclaimer: I’m fairly new to ChEBI and “thinking out loud” here, so add comments below if I’ve said anything stupid or wrong]


