OBO | O'Really?

February 12, 2010

The 3rd OBO Foundry Workshop 2010, Cambridge, UK

Filed under: ChEBI — Duncan Hull @ 4:12 pm
Tags: Alan Ruttenberg, Barry Smith, bbsrc, Chris Mungall, elixir, European Bioinformatics Institute, Gene Ontology, Michael Ashburner, OBO, OBO Foundry, ontology, Open Biomedical Ontologies, owl, Philippe Rocca-Serra, Richard Scheuermann, Susanna-Assunta Sansone, Suzanna Lewis, Tim Nugent, ultrawide

The Open Biomedical Ontologies (OBO) [1] are a set of reference ontologies for describing all kinds of biomedical data, shared in a centralised repository called The OBO Foundry. Every year, users and developers of these ontologies gather from around the globe for a workshop at the EBI near Cambridge, UK. Following on from the first workshop two years ago and the second workshop last year, it’s already time for the third workshop on February 15th-16th. All the details and agenda are here if you’re interested. This workshop is possible thanks to sponsorship from the BBSRC funds for Workshop on Data Standards and the EU ELIXIR ‘Data Integration & Interoperability’ Package 7.

[Update: outcomes from the workshop are available here, along with a summary of discussion from Monday and a summary of discussion from Tuesday.]

References

Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg, L., Eilbeck, K., Ireland, A., Mungall, C., Leontis, N., Rocca-Serra, P., Ruttenberg, A., Sansone, S., Scheuermann, R., Shah, N., Whetzel, P., & Lewis, S. (2007). The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nature Biotechnology, 25 (11), 1251-1255. DOI: 10.1038/nbt1346

[Ultrawide panoramic picture of the Wellcome Trust Genome Campus by Tim Nugent, as featured on the cover of the EMBL-EBI Annual Scientific Report 2009. Making those pictures looks like a lot of fun.]

February 5, 2010

Classic paper: Montagues and Capulets in Science

Filed under: seminars — Duncan Hull @ 12:53 pm
Tags: Baz Luhrmann, big bard bill, capulets, Carole Goble, ChEBI, chris wroe, Claire Danes, Gene Ontology, genomics, leonardo dicaprio, Michael Ashburner, Midori Harris, montagues, OBO, ontology, owl, romeo and juliet, shakespeare, SOFG, william shakespeare

In preparation for a joint seminar I’ll be doing with Midori Harris here at the EBI, here’s a classic paper [1,2] on the social problems of building biomedical ontologies. This paper is worth reading (or re-reading) because it makes lots of relevant points about the use and abuse of research and how people misunderstand each other [3]. It’s funny (and available Open Access too). Plus, how many papers do you read with an abstract written in the style of Big Bard Bill Shakespeare?

ABSTRACT:
Two households, both alike in dignity,
In fair Genomics, where we lay our scene,
(One, comforted by its logic’s rigour,
Claims ontology for the realm of pure,
The other, with blessed scientist’s vigour,
Acts hastily on models that endure),
From ancient grudge break to new mutiny,
When ‘being’ drives a fly-man to blaspheme.
From forth the fatal loins of these two foes,
Researchers to unlock the book of life;
Whole misadventured piteous overthrows,
Can with their work bury their clans’ strife.
The fruitful passage of their GO-mark’d love,
And the continuance of their studies sage,
Which, united, yield ontologies undreamed-of,
Is now the hour’s traffic of our stage;
The which if you with patient ears attend,
What here shall miss, our toil shall strive to mend.

So if you read the paper, you have to ask yourself, are you a Montague or a Capulet?

References

Carole Goble and Chris Wroe (2004). The Montagues and the Capulets. Comparative and Functional Genomics, 5 (8), 623-632. DOI: 10.1002/cfg.442
Carole Goble (2004). The Capulets and Montagues: A plague on both your houses? SOFG: Standards and Ontologies for Functional Genomics
William Shakespeare (1596). Romeo and Juliet

[Romeo and Juliet picture via Happy Hippo Snacks]

January 21, 2010

Blogging a Book about Bio-Ontologies

Filed under: ChEBI,publishing — Duncan Hull @ 11:05 am
Tags: Alexander Yu, bio-ontology, blogging a book, EP/E021352/1, epsrc, knowledgeblog, OBO, ontogenesis, Open Access, owl, Phillip Lord, Protégé, Robert Stevens, Tim Nugent, ultrawide

If you wanted to write a guide to Biomedical and Biological Ontologies [1], especially the what, why, when, how, where and who, there are at least three choices for publishing your work:

Journal publishing in your favourite scientific journal.
Book publishing with your favourite academic or technical publisher.
Self publishing on a web blog with your favourite blogging software.

Each of these has its own unique problems:

The trouble with journals is that they typically don’t publish “how to” guides, although you might be able to publish some kind of review.
The trouble with books, and academic books in particular, is that people (and machines) often don’t read them. Also, academic books can be prohibitively expensive to buy and this can make the data inside them less visible and accessible to the widest audience. Unfortunately all that lovely knowledge gets locked up behind publishers’ paywalls. To add insult to injury, most academic books take a very long time to publish, often several years. By the time of printing, the content of many academic books is often very dated.
The trouble with blogs is that they aren’t peer-reviewed in the traditional way and they tend to be written by a single person from a not very neutral point of view. Or as Dave once put it, “vanity publishing for arrogant people with an inflated ego”. Ouch.

So the people behind the Ontogenesis network (Robert Stevens and Phillip Lord with funding from the EPSRC grant ref: EP/E021352/1) had an idea. Why not blog a book about Ontology? As a publishing experiment, it might just work by combining the merits of books and blogs in order to overcome their shortcomings. This will involve getting a small group of about twenty people (mostly bio-ontologists) together, and writing about what an ontology is, why you would want a biomedical ontology, how to build one and so on. We will be doing some of the peer-review online too.

As part of an ongoing experiment, we are posting all this information on a blog at http://ontogenesis.knowledgeblog.org. If you’d like to follow along, subscribe to the feed and read the manifesto.

References

Yu, A. (2006). Methods in biomedical ontology. Journal of Biomedical Informatics, 39 (3), 252-266. DOI: 10.1016/j.jbi.2005.11.006

[Ultrawide panoramic picture of Waterloo station by Tim Nugent]

November 24, 2009

Semantic Web Applications and Tools for the Life Sciences (SWAT4LS) 2009, Amsterdam

Filed under: conferences,semweb — Duncan Hull @ 12:30 am
Tags: Adrian Paschke, Alan Ruttenberg, Albert Burger, amsterdam, Andrea Splendiani, Anita de Waard, barend mons, Journal of Biomedical Semantics, Michael Schroeder, OBO, ontology, owl, Paolo Romano, rdf, semantic web, swat4ls, swat4ls proceedings

Last Friday, the Centrum Wiskunde & Informatica (CWI) in Amsterdam hosted a workshop called Semantic Web Applications and Tools for the Life Sciences (SWAT4LS) 2009.

Following on from last year [1], the workshop proceedings will be published at ceur-ws.org and in a special issue of the Journal of Biomedical Semantics, but if you want to find out what happened in the meantime, take a look at the #swat4ls2009 hashtag on twitter. Twitter makes bloggers lazy (they blog less but tweet more), but thankfully Nico Adams has studiously blogged the workshop very extensively.

Disruptive Technologies Director (cool job title!) Anita de Waard from Elsevier was asking what the conclusions of the workshop were. So here is an incomplete summary: roughly speaking, people agreed to disagree (again). Keynote speaker Barend Mons argued that redundant data should be eliminated through the use of “nano-publications” and micro-attribution in his entertaining but controversial keynote. Some people in the audience disagreed with this. Greg Tyrelle thinks that redundancy is a feature, not a bug, in the Web and we have to deal with it. Alan Ruttenberg argued that semantic web reasoners are required to clean up and sanity check all the messy and noisy biological data, but emphasised the importance of Computer Scientists learning to speak Biologists’ language.

The good thing about this workshop is its size: small, friendly but internationally attended. Thanks to M. Scott Marshall, Albert Burger, Adrian Paschke, Paolo Romano and Andrea Splendiani for organising another good workshop, hope to see you again next year (if not before).

References

Burger, A., Romano, P., Paschke, A., & Splendiani, A. (2009). Semantic Web Applications and Tools for Life Sciences, 2008 – Introduction. BMC Bioinformatics, 10 (Suppl 10). DOI: 10.1186/1471-2105-10-S10-S1 (part of the special issue on SWAT4LS 2008)

[CC-licensed picture of Amsterdam in the snow by Bas van Gaalen]

June 16, 2009

OBO Foundry workshop outcomes 2009

Filed under: conferences — Duncan Hull @ 4:28 pm
Tags: Cotto, ebi, OBO, OBO Foundry, Susanna-Assunta Sansone

Well, I was going to blog about last week’s Open Biomedical Ontologies workshop, but Susanna-Assunta Sansone at the EBI has already done it via some very detailed minutes. See her notes for the:

Thanks to the organisers of this workshop for hosting another well run event, I’m only sorry I had to miss the delicious looking dinner at Cotto in Cambridge (and entertaining company) on the last day… Hope to see you again next year.

References

Schober, D., Smith, B., Lewis, S., Kusnierczyk, W., Lomax, J., Mungall, C., Taylor, C., Rocca-Serra, P., & Sansone, S. (2009). Survey-based naming conventions for use in OBO Foundry ontology development. BMC Bioinformatics, 10 (1). DOI: 10.1186/1471-2105-10-125

[CC-licensed Picture of Haystack OWL by dullhunk].

June 4, 2009

Improving the OBO Foundry Principles

Filed under: biocuration,data mining,informatics,semweb — Duncan Hull @ 1:48 pm
Tags: Alan Ruttenberg, Allyson Lister, Barry Smith, bbsrc, Bioportal, ChEBI, Chris Mungall, ebi, Frank Gibson, frolleague, Gene Ontology, Mark Musen, Melanie Courtot, Michael Ashburner, Michel Dumontier, nactem, OBO, OBO Foundry, OBO Smithy, OBO Workshop, obology, old smithy, ontology, ontolojoke, owl, principles, pubmed, REFINE, Richard Scheuermann, sbml, Suzi Lewis, ten commandments, workshop

The Open Biomedical Ontologies (OBO) are a set of reference ontologies for describing all kinds of biomedical data, see [1-5] for examples. Every year, users and developers of these ontologies gather from around the globe for a workshop at the EBI near Cambridge, UK. Following on from the first workshop last year, the 2nd OBO workshop 2009 is fast approaching.

In preparation, I’ve been revisiting the OBO Foundry documentation, part of which establishes a set of principles for ontology development. I’m wondering how they could be improved because these principles are fundamental to the whole effort. We’ve been using one of the OBO ontologies (called Chemical Entities of Biological Interest (ChEBI)) in the REFINE project to mine data from the PubMed database. OBO Ontologies like ChEBI and the Gene Ontology are really crucial to making sense of the massive data which are now common in biology and medicine – so this is stuff that matters.

The OBO Foundry Principles, a sort of Ten Commandments of Ontology (or Obology if you prefer) currently look something like this (copied directly from obofoundry.org/crit.shtml):

The ontology must be open and available to be used by all without any constraint other than (a) its origin must be acknowledged and (b) it is not to be altered and subsequently redistributed under the original name or with the same identifiers. The OBO ontologies are for sharing and are resources for the entire community. For this reason, they must be available to all without any constraint or license on their use or redistribution. However, it is proper that their original source is always credited and that after any external alterations, they must never be redistributed under the same name or with the same identifiers.
The ontology is in, or can be expressed in, a common shared syntax. This may be either the OBO syntax, extensions of this syntax, or OWL. The reason for this is that the same tools can then be usefully applied. This facilitates shared software implementations. This criterion is not met in all of the ontologies currently listed, but we are working with the ontology developers to have them available in a common OBO syntax.
The ontologies possesses a unique identifier space within the OBO Foundry. The source of a term (i.e. class) from any ontology can be immediately identified by the prefix of the identifier of each term. It is, therefore, important that this prefix be unique.
The ontology provider has procedures for identifying distinct successive versions.
The ontology has a clearly specified and clearly delineated content. The ontology must be orthogonal to other ontologies already lodged within OBO. The major reason for this principle is to allow two different ontologies, for example anatomy and process, to be combined through additional relationships. These relationships could then be used to constrain when terms could be jointly applied to describe complementary (but distinguishable) perspectives on the same biological or medical entity. As a corollary to this, we would strive for community acceptance of a single ontology for one domain, rather than encouraging rivalry between ontologies.
The ontologies include textual definitions for all terms. Many biological and medical terms may be ambiguous, so terms should be defined so that their precise meaning within the context of a particular ontology is clear to a human reader.
The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.
The ontology is well documented.
The ontology has a plurality of independent users.
The ontology will be developed collaboratively with other OBO Foundry members.
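
To make the identifier-space principle a bit more concrete, here is a minimal Python sketch. It assumes identifiers written as PREFIX:LOCAL_ID (for example GO:0008150 or CHEBI:15377); the function names are made up for illustration and are not part of any OBO tool.

```python
from collections import defaultdict


def split_term_id(term_id):
    """Split an identifier such as 'GO:0008150' into (prefix, local_id).

    Assumes the PREFIX:LOCAL_ID convention; anything else raises ValueError.
    """
    prefix, sep, local_id = term_id.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a prefixed identifier: {term_id!r}")
    return prefix, local_id


def group_by_prefix(term_ids):
    """Group term identifiers by prefix, i.e. by their source ontology."""
    groups = defaultdict(list)
    for term_id in term_ids:
        prefix, _ = split_term_id(term_id)
        groups[prefix].append(term_id)
    return dict(groups)


# Illustrative identifiers only: two Gene Ontology terms and one ChEBI term.
example_ids = ["GO:0008150", "CHEBI:15377", "GO:0003674"]
grouped = group_by_prefix(example_ids)
for prefix, ids in grouped.items():
    print(prefix, ids)
```

Because every prefix is meant to be unique within the Foundry, grouping by prefix is enough to tell which ontology each term came from, without looking anything up.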

I’ve been asking all my frolleagues what they think of these principles and have got some lively responses, including some here from Allyson Lister, Mélanie Courtot, Michel Dumontier and Frank Gibson. So what do you think? How could these guidelines be improved? Do you have any specific (and preferably constructive) criticisms of these ambitious (and worthy) goals? Be bold, be brave and be polite. Anything controversial or “off the record” you can email it to me… I’m all ears.

CC-licensed picture above of the Old Smithy (pub) by Loop Oh. Inspired by Michael Ashburner’s standing OBO joke (Ontolojoke) which goes something like this: Because Barry Smith is one of the leaders of OBO, should the project be called the OBO Smithy or the OBO Foundry? 🙂

References

Noy, N., Shah, N., Whetzel, P., Dai, B., Dorf, M., Griffith, N., Jonquet, C., Rubin, D., Storey, M., Chute, C., & Musen, M. (2009). BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research. DOI: 10.1093/nar/gkp440
Côté, R., Jones, P., Apweiler, R., & Hermjakob, H. (2006). The Ontology Lookup Service, a lightweight cross-platform tool for controlled vocabulary queries. BMC Bioinformatics, 7 (1). DOI: 10.1186/1471-2105-7-97
Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg, L., Eilbeck, K., Ireland, A., Mungall, C., Leontis, N., Rocca-Serra, P., Ruttenberg, A., Sansone, S., Scheuermann, R., Shah, N., Whetzel, P., & Lewis, S. (2007). The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nature Biotechnology, 25 (11), 1251-1255. DOI: 10.1038/nbt1346
Smith, B., Ceusters, W., Klagges, B., Köhler, J., Kumar, A., Lomax, J., Mungall, C., Neuhaus, F., Rector, A., & Rosse, C. (2005). Relations in biomedical ontologies. Genome Biology, 6 (5). DOI: 10.1186/gb-2005-6-5-r46
Bada, M., & Hunter, L. (2008). Identification of OBO nonalignments and its implications for OBO enrichment. Bioinformatics, 24 (12), 1448-1455. DOI: 10.1093/bioinformatics/btn194

April 17, 2009

The Unreasonable Effectiveness of Google

Filed under: Googleology — Duncan Hull @ 4:00 pm
Tags: Adam Kilgarriff, Alistair Miles, Allyson Lister, Alon Halevy, Andrew Clegg, Artificial Intelligence, bioformats, Biomodels, bootstrep, ChEBI, David Shotton, Dietrich Rebholz-Schuhmann, Eugene Wigner, Fernando Pereira, Frank van Harmelen, Gene Ontology, Googleology, Googleplex, Jim Hendler, Larry Page, Michael Uschold, Nicolas le Novère, OBO, Opinion, Ora Lassila, Peter Norvig, provocative, pubmed, PubMedCentral, reasoner, Reasoning, sbml, scifoo, Sergey Brin, Steffano Mazzocchi, Tim Berners-Lee, unreasonable

Via the Official Google Research Blog at the University of Google, Alon Halevy, Peter Norvig and Fernando Pereira have published an interesting expert opinion piece in the March/April 2009 edition of IEEE Intelligent Systems: computer.org/intelligent. The paper talks about embracing complexity and making use of “the unreasonable effectiveness of data” [1], drawing analogies with the “unreasonable effectiveness of mathematics” [2]. There is plenty to agree and disagree with in this provocative article, which makes it an entertaining read. So what can we learn from those expert Googlers in the Googleplex? (more…)

December 2, 2008

SWAT4LS: The Semantic Web in Scotland

Filed under: semweb — Duncan Hull @ 11:57 am
Tags: Albert Burger, Alistair Miles, Annotate, BioDAS, Christopher Baker, DAS, e-science, Edinburgh, GoWeb, Gudmundur Thorisson, Heiko Dietze, James Procter, Nadia Anwar, NESC, Nico Adams, Norman Paskin, OBO, owl, Phil Lord, Sconny Botland, Scotland, SKOS, swat4ls, textensor

Last Friday, the UK National e-Science Centre in Edinburgh hosted a workshop, Semantic Web Applications and Tools for the Life Sciences (see SWAT4LS.org for the full details). Here are some incomplete and abbreviated notes from the workshop where there were some interesting people, paperware and software.

People and Paperware

Nadia Anwar presented a paper on Semantic Data Integration for Francisella tularensis Proteomic and Genomic Data. This described experiences of developing a system for converting proteomics data in Excel to RDF, and what the benefits were.
Heiko Dietze presented a paper on GoWeb: A semantic search engine for the life science web. Have a look at GoWeb query for the hexokinase enzyme for an example of the current capabilities of GoWeb.
Simon Jupp presented a paper on Knowledge Representation for Web Navigation which described why the Simple Knowledge Organisation System (SKOS) is sometimes a more suitable language for modelling knowledge than OWL and RDF – see also SKOS in the context of Semantic Web Deployment by Alistair Miles.

70 people registered to attend SWAT4LS in total, many familiar names and faces, plus some new people I’ve never met before: (more…)

July 25, 2008

How to spend a £400 million Science budget

Filed under: biotech,funding,politics — Duncan Hull @ 11:45 am
Tags: Alan Sugar, bbsrc, bioinformatics, biotech, Bob Geldof, cheminformatics, DIUS, DNA mania, Douglas Kell, fantasy, fellowship, Gordon Brown, Ian Pearson, John Denham, NIH, OBO, Open Access, Peter Suber, PubMedCentral, sciblog, Science, Stevan Harnad, systems biology, The Apprentice, thought experiment

A thought experiment with lots of money

The Biotechnology and Biological Sciences Research Council (BBSRC) is the United Kingdom’s funding agency for academic research and training in the non-clinical life sciences. It supports a total of around 1600 scientists and 2000 research students in universities and institutes in the UK. The head of our laboratory, Douglas Kell, has recently been appointed Chief Executive of the BBSRC [1]. Congratulations Doug, we wish you the very best in your new job. Now, according to bbsrc.ac.uk, their annual budget is a cool £400 million (just short of $800 million or €500 million). This has left me wondering, how would you spend a £400 million Science budget for the life sciences? For the purposes of this article, imagine it was you that had been put in charge of said budget, and Prime Minister Gordon Brown (texture like sun) had given you, yes YOU, a big bag of cash to distribute as you see fit. A mouth-watering prospect, I think you’ll agree. Here is my personal opinion of how, in my dreams, I would spend the money. (more…)

July 6, 2008

You Know OBO? Let’s GO!

Filed under: informatics — Duncan Hull @ 11:20 am
Tags: Alan Ruttenberg, Barry Smith, biomedical, ChEBI, Chris Mungall, Dawn Field, ebi, Gene Ontology, GO, MIBBI, Michael Ashburner, OBO, OBO Workshop, Oboe, ontology, sbo, Suzanna Lewis

According to their website “The Open Biomedical Ontologies (OBO) Foundry is a collaborative experiment involving developers of science-based ontologies who are establishing a set of principles for ontology development with the goal of creating a suite of orthogonal interoperable reference ontologies in the biomedical domain”. This week they are having a workshop in Cambridge. To bring myself up to speed, here is a quick name check of some of the people involved.

Michael Ashburner, University of Cambridge
Erick Antezana, University of Ghent
Colin Batchelor, Royal Society of Chemistry

(more…)
