O'Really?

February 12, 2010

The 3rd OBO Foundry Workshop 2010, Cambridge, UK

The Open Biomedical Ontologies (OBO) [1] are a set of reference ontologies for describing all kinds of biomedical data, shared in a centralised repository called The OBO Foundry. Every year, users and developers of these ontologies gather from around the globe for a workshop at the EBI near Cambridge, UK. Following on from the first workshop two years ago and the second workshop last year, it’s already time for the third workshop on February 15th-16th. All the details and agenda are here if you’re interested. This workshop is possible thanks to sponsorship from the BBSRC funds for Workshop on Data Standards and the EU ELIXIR ‘Data Integration & Interoperability’ Package 7.

[Update: outcomes from the workshop are available here, along with a summary of discussion from Monday and a summary of discussion from Tuesday.]

References

  1. Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg, L., Eilbeck, K., Ireland, A., Mungall, C., Leontis, N., Rocca-Serra, P., Ruttenberg, A., Sansone, S., Scheuermann, R., Shah, N., Whetzel, P., & Lewis, S. (2007). The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration Nature Biotechnology, 25 (11), 1251-1255 DOI: 10.1038/nbt1346

[Ultrawide panoramic picture of the Wellcome Trust Genome Campus by Tim Nugent, as featured on the cover of the EMBL-EBI Annual Scientific Report 2009. Making those pictures looks like a lot of fun.]

February 5, 2010

Classic paper: Montagues and Capulets in Science

In preparation for a joint seminar I’ll be doing with Midori Harris here at the EBI, here’s a classic paper [1,2] on the social problems of building biomedical ontologies. This paper is worth reading (or re-reading) because it makes lots of relevant points about the use and abuse of research and how people misunderstand each other [3]. It’s funny (and available Open Access too). Plus, how many papers do you read with an abstract written in the style of Big Bard Bill Shakespeare?

ABSTRACT:

Two households, both alike in dignity,
In fair Genomics, where we lay our scene,
(One, comforted by its logic’s rigour,
Claims ontology for the realm of pure,
The other, with blessed scientist’s vigour,
Acts hastily on models that endure),
From ancient grudge break to new mutiny,
When ‘being’ drives a fly-man to blaspheme.
From forth the fatal loins of these two foes,
Researchers to unlock the book of life;
Whole misadventured piteous overthrows,
Can with their work bury their clans’ strife.
The fruitful passage of their GO-mark’d love,
And the continuance of their studies sage,
Which, united, yield ontologies undreamed-of,
Is now the hour’s traffic of our stage;
The which if you with patient ears attend,
What here shall miss, our toil shall strive to mend.

So if you read the paper, you have to ask yourself, are you a Montague or a Capulet?

References

  1. Carole Goble and Chris Wroe (2004). The Montagues and the Capulets Comparative and Functional Genomics, 5 (8), 623-632 DOI: 10.1002/cfg.442
  2. Carole Goble (2004) The Capulets and Montagues: A plague on both your houses?, SOFG: Standards and Ontologies for Functional Genomics
  3. William Shakespeare (1596) Romeo and Juliet

[Romeo and Juliet picture via Happy Hippo Snacks]

January 21, 2010

Blogging a Book about Bio-Ontologies

If you wanted to write a guide to Biomedical and Biological Ontologies [1], especially the what, why, when, how, where and who, there are at least three choices for publishing your work:

  1. Journal publishing in your favourite scientific journal.
  2. Book publishing with your favourite academic or technical publisher.
  3. Self publishing on a web blog with your favourite blogging software.

Each of these has its own unique problems:

  • The trouble with journals is that they typically don’t publish “how to” guides, although you might be able to publish some kind of review.
  • The trouble with books, and academic books in particular, is that people (and machines) often don’t read them. Also, academic books can be prohibitively expensive to buy, and this can make the data inside them less visible and accessible to the widest audience. Unfortunately all that lovely knowledge gets locked up behind publishers’ paywalls. To add insult to injury, most academic books take a very long time to publish, often several years. By the time of printing, the content of many academic books is very dated.
  • The trouble with blogs is that they aren’t peer-reviewed in the traditional way and they tend to be written by a single person from a not very neutral point of view. Or as Dave once put it, “vanity publishing for arrogant people with an inflated ego”. Ouch.

So the people behind the Ontogenesis network (Robert Stevens and Phillip Lord with funding from the EPSRC grant ref: EP/E021352/1) had an idea. Why not blog a book about Ontology? As a publishing experiment, it might just work by combining the merits of books and blogs to overcome their shortcomings. This will involve getting a small group of about twenty people (mostly bio-ontologists) together, and writing about what an ontology is, why you would want to use a biomedical ontology, how to build one and so on. We will be doing some of the peer-review online too.

As part of an ongoing experiment, we are posting all this information on a blog at http://ontogenesis.knowledgeblog.org. If you’d like to follow along, subscribe to the feed and read the manifesto.

References

  1. Yu, A. (2006). Methods in biomedical ontology Journal of Biomedical Informatics, 39 (3), 252-266 DOI: 10.1016/j.jbi.2005.11.006

[Ultrawide panoramic picture of Waterloo station by Tim Nugent]

November 24, 2009

Semantic Web Applications and Tools for the Life Sciences (SWAT4LS) 2009, Amsterdam

Last Friday, the Centrum Wiskunde & Informatica (CWI) in Amsterdam hosted a workshop called Semantic Web Applications and Tools for the Life Sciences (SWAT4LS) 2009.

Following on from last year [1], the workshop proceedings will be published at ceur-ws.org and in a special issue of the Journal of Biomedical Semantics, but if you want to find out what happened in the meantime, take a look at the #swat4ls2009 hashtag on Twitter. Twitter makes bloggers lazy (they blog less but tweet more), but thankfully Nico Adams has blogged the workshop very extensively.

Disruptive Technologies Director (cool job title!) Anita de Waard from Elsevier asked what the conclusions of the workshop were. So here is an incomplete summary: roughly speaking, people agreed to disagree (again). Keynote speaker Barend Mons argued in his entertaining but controversial talk that redundant data should be eliminated through the use of “nano-publications” and micro-attribution. Some people in the audience disagreed with this. Greg Tyrelle thinks that redundancy is a feature, not a bug, in the Web and we have to deal with it. Alan Ruttenberg argued that semantic web reasoners are required to clean up and sanity check all the messy and noisy biological data, but emphasised the importance of Computer Scientists learning to speak Biologists’ language.

The good thing about this workshop is its size: small, friendly but internationally attended. Thanks to M. Scott Marshall, Albert Burger, Adrian Paschke, Paolo Romano and Andrea Splendiani for organising another good workshop. Hope to see you again next year (if not before).

References

  1. Burger, A., Romano, P., Paschke, A., & Splendiani, A. (2009). Semantic Web Applications and Tools for Life Sciences, 2008 – Introduction BMC Bioinformatics, 10 (Suppl 10) DOI: 10.1186/1471-2105-10-S10-S1 part of the special issue on SWAT4LS 2008

[CC-licensed picture of Amsterdam in the snow by Bas van Gaalen]

June 16, 2009

OBO Foundry workshop outcomes 2009

Filed under: conferences — Duncan Hull @ 4:28 pm

Well, I was going to blog about last week’s Open Biomedical Ontologies workshop, but Susanna-Assunta Sansone at the EBI has already done it via some very detailed minutes. See her notes for the:

  1. Overview
  2. Outcomes from day one
  3. Outcomes from day two

Thanks to the organisers of this workshop for hosting another well-run event. I’m only sorry I had to miss the delicious-looking dinner at Cotto in Cambridge (and entertaining company) on the last day… Hope to see you again next year.

References

  1. Schober, D., Smith, B., Lewis, S., Kusnierczyk, W., Lomax, J., Mungall, C., Taylor, C., Rocca-Serra, P., & Sansone, S. (2009). Survey-based naming conventions for use in OBO Foundry ontology development BMC Bioinformatics, 10 (1) DOI: 10.1186/1471-2105-10-125

[CC-licensed Picture of Haystack OWL by dullhunk].

June 4, 2009

Improving the OBO Foundry Principles

The Open Biomedical Ontologies (OBO) are a set of reference ontologies for describing all kinds of biomedical data; see [1-5] for examples. Every year, users and developers of these ontologies gather from around the globe for a workshop at the EBI near Cambridge, UK. Following on from the first workshop last year, the 2nd OBO workshop 2009 is fast approaching.

In preparation, I’ve been revisiting the OBO Foundry documentation, part of which establishes a set of principles for ontology development. I’m wondering how they could be improved because these principles are fundamental to the whole effort. We’ve been using one of the OBO ontologies (called Chemical Entities of Biological Interest (ChEBI)) in the REFINE project to mine data from the PubMed database. OBO Ontologies like ChEBI and the Gene Ontology are really crucial to making sense of the massive data which are now common in biology and medicine – so this is stuff that matters.

The OBO Foundry Principles, a sort of Ten Commandments of Ontology (or Obology if you prefer) currently look something like this (copied directly from obofoundry.org/crit.shtml):

  1. The ontology must be open and available to be used by all without any constraint other than (a) its origin must be acknowledged and (b) it is not to be altered and subsequently redistributed under the original name or with the same identifiers. The OBO ontologies are for sharing and are resources for the entire community. For this reason, they must be available to all without any constraint or license on their use or redistribution. However, it is proper that their original source is always credited and that after any external alterations, they must never be redistributed under the same name or with the same identifiers.
  2. The ontology is in, or can be expressed in, a common shared syntax. This may be either the OBO syntax, extensions of this syntax, or OWL. The reason for this is that the same tools can then be usefully applied. This facilitates shared software implementations. This criterion is not met in all of the ontologies currently listed, but we are working with the ontology developers to have them available in a common OBO syntax.
  3. The ontologies possesses a unique identifier space within the OBO Foundry. The source of a term (i.e. class) from any ontology can be immediately identified by the prefix of the identifier of each term. It is, therefore, important that this prefix be unique.
  4. The ontology provider has procedures for identifying distinct successive versions.
  5. The ontology has a clearly specified and clearly delineated content. The ontology must be orthogonal to other ontologies already lodged within OBO. The major reason for this principle is to allow two different ontologies, for example anatomy and process, to be combined through additional relationships. These relationships could then be used to constrain when terms could be jointly applied to describe complementary (but distinguishable) perspectives on the same biological or medical entity. As a corollary to this, we would strive for community acceptance of a single ontology for one domain, rather than encouraging rivalry between ontologies.
  6. The ontologies include textual definitions for all terms. Many biological and medical terms may be ambiguous, so terms should be defined so that their precise meaning within the context of a particular ontology is clear to a human reader.
  7. The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.
  8. The ontology is well documented.
  9. The ontology has a plurality of independent users.
  10. The ontology will be developed collaboratively with other OBO Foundry members.
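Principle 3 is the easiest of these to try out in code. Here is a minimal Python sketch, assuming that term identifiers look like a prefix, a colon and a numeric local ID (for example GO:0008150 or CHEBI:15377). The regular expression and the function names `split_obo_id` and `group_by_prefix` are my own illustrations, not part of any official OBO tooling or identifier policy.

```python
import re

# Assumed identifier shape: PREFIX:DIGITS. This pattern is an illustration only,
# not the official OBO Foundry identifier policy.
OBO_ID = re.compile(r"^(?P<prefix>[A-Za-z][A-Za-z_]*):(?P<local_id>\d+)$")


def split_obo_id(term_id):
    """Split an identifier such as 'GO:0008150' into (prefix, local_id)."""
    match = OBO_ID.match(term_id)
    if match is None:
        raise ValueError(f"not an OBO-style identifier: {term_id!r}")
    return match.group("prefix"), match.group("local_id")


def group_by_prefix(term_ids):
    """Group identifiers by prefix, i.e. by the ontology they come from."""
    groups = {}
    for term_id in term_ids:
        prefix, _ = split_obo_id(term_id)
        groups.setdefault(prefix, []).append(term_id)
    return groups


if __name__ == "__main__":
    mixed = ["GO:0008150", "CHEBI:15377", "GO:0003674"]
    for prefix, ids in group_by_prefix(mixed).items():
        print(prefix, ids)
```

As long as every ontology keeps its prefix unique, a mixed bag of annotations like the one above can be sorted back to its source ontologies without looking anything up.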

I’ve been asking all my frolleagues what they think of these principles and have got some lively responses, including some here from Allyson Lister, Mélanie Courtot, Michel Dumontier and Frank Gibson. So what do you think? How could these guidelines be improved? Do you have any specific (and preferably constructive) criticisms of these ambitious (and worthy) goals? Be bold, be brave and be polite. You can email me anything controversial or “off the record”… I’m all ears.

CC-licensed picture above of the Old Smithy (pub) by Loop Oh. Inspired by Michael Ashburner’s standing OBO joke (Ontolojoke) which goes something like this: Because Barry Smith is one of the leaders of OBO, should the project be called the OBO Smithy or the OBO Foundry? 🙂

References

  1. Noy, N., Shah, N., Whetzel, P., Dai, B., Dorf, M., Griffith, N., Jonquet, C., Rubin, D., Storey, M., Chute, C., & Musen, M. (2009). BioPortal: ontologies and integrated data resources at the click of a mouse Nucleic Acids Research DOI: 10.1093/nar/gkp440
  2. Côté, R., Jones, P., Apweiler, R., & Hermjakob, H. (2006). The Ontology Lookup Service, a lightweight cross-platform tool for controlled vocabulary queries BMC Bioinformatics, 7 (1) DOI: 10.1186/1471-2105-7-97
  3. Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg, L., Eilbeck, K., Ireland, A., Mungall, C., Leontis, N., Rocca-Serra, P., Ruttenberg, A., Sansone, S., Scheuermann, R., Shah, N., Whetzel, P., & Lewis, S. (2007). The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration Nature Biotechnology, 25 (11), 1251-1255 DOI: 10.1038/nbt1346
  4. Smith, B., Ceusters, W., Klagges, B., Köhler, J., Kumar, A., Lomax, J., Mungall, C., Neuhaus, F., Rector, A., & Rosse, C. (2005). Relations in biomedical ontologies Genome Biology, 6 (5) DOI: 10.1186/gb-2005-6-5-r46
  5. Bada, M., & Hunter, L. (2008). Identification of OBO nonalignments and its implications for OBO enrichment Bioinformatics, 24 (12), 1448-1455 DOI: 10.1093/bioinformatics/btn194

April 17, 2009

The Unreasonable Effectiveness of Google

Via the Official Google Research Blog at the University of Google, Alon Halevy, Peter Norvig and Fernando Pereira have published an interesting expert opinion piece in the March/April 2009 edition of IEEE Intelligent Systems: computer.org/intelligent. The paper talks about embracing complexity and making use of “the unreasonable effectiveness of data” [1], drawing analogies with the “unreasonable effectiveness of mathematics” [2]. There is plenty to agree and disagree with in this provocative article, which makes it an entertaining read. So what can we learn from those expert Googlers in the Googleplex?

December 2, 2008

SWAT4LS: The Semantic Web in Scotland

Last Friday, the UK National e-Science Centre in Edinburgh hosted a workshop, Semantic Web Applications and Tools for the Life Sciences (see SWAT4LS.org for the full details). Here are some incomplete and abbreviated notes from the workshop where there were some interesting people, paperware and software.

[Picture: James Clerk Maxwell]

People and Paperware

In total, 70 people registered to attend SWAT4LS: many familiar names and faces, plus some new people I’ve never met before.

July 25, 2008

How to spend a £400 million Science budget

A thought experiment with lots of money

The Biotechnology and Biological Sciences Research Council (BBSRC) is the United Kingdom’s funding agency for academic research and training in the non-clinical life sciences. It supports a total of around 1600 scientists and 2000 research students in universities and institutes in the UK. The head of our laboratory, Douglas Kell, has recently been appointed Chief Executive of the BBSRC [1]. Congratulations Doug, we wish you the very best in your new job. Now, according to bbsrc.ac.uk, their annual budget is a cool £400 million (just short of $800 million or €500 million). This has left me wondering, how would you spend a £400 million Science budget for the life sciences? For the purposes of this article, imagine it was you that had been put in charge of said budget, and Prime Minister Gordon Brown (texture like sun) had given you, yes YOU, a big bag of cash to distribute as you see fit. A mouth-watering prospect, I think you’ll agree. Here is my personal opinion of how, in my dreams, I would spend the money.

[Picture: The Queens Ahead by canonsnapper]

July 6, 2008

You Know OBO? Let’s GO!

According to their website, “The Open Biomedical Ontologies (OBO) Foundry is a collaborative experiment involving developers of science-based ontologies who are establishing a set of principles for ontology development with the goal of creating a suite of orthogonal interoperable reference ontologies in the biomedical domain”. This week they are having a workshop in Cambridge. To bring myself up to speed, here is a quick name check of some of the people involved.

[Picture: Oboe mechanics by starrise]
