biocuration | O'Really?

April 30, 2010

Daniel Cohen on The Social Life of Digital Libraries

Filed under: biocuration,data mining,publishing — Duncan Hull @ 7:12 am
Tags: Arcadia, Cambridge, citeulike, Clare College, connotea, dancohen, Daniel Cohen, defrosting the digital library, digital library, Firefox, First Monday, George Mason University, GMU, John Naughton, Mekentosj, Mendeley, refworks, scholarometer, Zotero

Daniel Cohen is giving a talk in Cambridge today on The Social Life of Digital Libraries, abstract below:

The digitization of libraries had a clear initial goal: to permit anyone to read the contents of collections anywhere and anytime. But universal access is only the beginning of what may happen to libraries and researchers in the digital age. Because machines as well as humans have access to the same online collections, a complex web of interactions is emerging. Digital libraries are now engaging in online relationships with other libraries, with scholars, and with software, often without the knowledge of those who maintain the libraries, and in unexpected ways. These digital relationships open new avenues for discovery, analysis, and collaboration.

Daniel J. Cohen is an Associate Professor at George Mason University and has been involved in the development of the Zotero extension for the Firefox browser that enables users to manage bibliographic data while doing online research. Zotero [1] is one of many new tools [2] that are attempting to add a social dimension to scholarly information on the Web, so this should be an interesting talk.

If you’d like to come, the talk starts at 6pm in Clare College, Cambridge. You need to RSVP by email via the talks.cam.ac.uk page.

References

1. Cohen, D.J. (2008). Creating scholarly tools and resources for the digital ecosystem: Building connections in the Zotero project. First Monday, 13 (8)
2. Hull, D., Pettifer, S., & Kell, D. (2008). Defrosting the Digital Library: Bibliographic Tools for the Next Generation Web. PLoS Computational Biology, 4 (10) DOI: 10.1371/journal.pcbi.1000204

November 5, 2009

Artemether: Entity of the Month

Filed under: biocuration — Duncan Hull @ 1:27 pm
Tags: antimalarial, artemether, artemisinin, bioinformatics, cancer, ChEBI, chembl, dihydroartemisinin, entity of the month, european bioinformatics institute, Gene Ontology, hemozoin, John Overington, Lipinski, lumefantrine, malaria, Plasmodium falciparum, Wendy Warr

November’s entity of the month at ChEBI is the antimalarial drug Artemether. This accompanies release 62 of ChEBI, which is not just another incremental release but a more than twentyfold increase in the number of entities in ChEBI, thanks to the merging of data between an updated ChEBI [1] and ChEMBL [2]. ChEBI now (as of release 62) has over 455,000 total entities, compared to just under 19,000 in the previous version (release 61); see ChEBI news for details. The text below on Artemether is reproduced from the ChEBI website, where content is available under a Creative Commons license:

Artemether (CHEBI:195280) is a lipid-soluble antimalarial for the treatment of multi-drug resistant strains of Plasmodium falciparum malaria. First prepared in 1979 [3], it is a methyl ether of the naturally occurring sesquiterpene lactone (+)-artemisinin, which is isolated from the leaves of Artemisia annua L. (sweet wormwood), the traditional Chinese medicinal herb known as Qinghao. However, because of artemether’s extremely rapid mode of action (it has an elimination half-life of only 2 hours, being metabolized to dihydroartemisinin which then undergoes rapid clearance), it is used in combination with other, longer-acting, drugs. One such combination, licensed in April of this year by the WHO, is Coartem in which the artemether is mixed with lumefantrine – a racemic mixture of a synthetic fluorene derivative known formerly as benflumetol – which has a much longer and pharmacologically complementary terminal half-life of 3–6 days, allowing the two drugs to act synergistically against Plasmodium.

The molecule of artemether is interesting because of its extreme rigidity, with very few rotational bonds. Unlike quinine class antimalarial drugs, it has no nitrogen atom in its skeleton. However, an important chemical feature (and unique in drugs) is the presence of an O–O endoperoxide bridge which is essential for its antimalarial activity, as it is this bridge which is split in an interaction with heme, blocking the conversion into hemozoin and thus releasing into the parasite heme and a host of free radicals which attack the cell membrane.

Artemether is fully Rule-of-Five compliant and has recently also been under investigation as a possible candidate for cancer treatment [4,5].
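As a rough illustration of what "Rule-of-Five compliant" means, the sketch below checks Lipinski's four thresholds (molecular weight at most 500 Da, log P at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). It is not taken from ChEBI: the class and function names are hypothetical, the molecular weight is computed from the formula C16H26O5, the donor and acceptor counts are simple counts of OH/NH groups and N/O atoms, and the log P value is an assumed approximate figure used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Descriptors:
    """Hypothetical container for the four Rule-of-Five descriptors."""
    mol_weight: float      # in daltons
    log_p: float           # octanol-water partition coefficient
    h_bond_donors: int     # count of OH and NH groups
    h_bond_acceptors: int  # count of N and O atoms


def rule_of_five_violations(d: Descriptors) -> List[str]:
    """Return a list of the Lipinski thresholds that are exceeded."""
    violations = []
    if d.mol_weight > 500:
        violations.append("molecular weight > 500 Da")
    if d.log_p > 5:
        violations.append("log P > 5")
    if d.h_bond_donors > 5:
        violations.append("more than 5 H-bond donors")
    if d.h_bond_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    return violations


# Standard atomic masses, rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Artemether has the molecular formula C16H26O5.
ARTEMETHER_FORMULA = {"C": 16, "H": 26, "O": 5}

artemether = Descriptors(
    mol_weight=sum(ATOMIC_MASS[el] * n for el, n in ARTEMETHER_FORMULA.items()),
    log_p=3.5,           # ASSUMPTION: approximate value, for illustration only
    h_bond_donors=0,     # no OH or NH groups in the molecule
    h_bond_acceptors=5,  # five oxygen atoms, no nitrogen
)

if __name__ == "__main__":
    problems = rule_of_five_violations(artemether)
    print("violations:", problems if problems else "none")
```

With these inputs no threshold is exceeded, which is consistent with the statement above; a real analysis would take the descriptors from a cheminformatics toolkit rather than hard-coding them.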

GO ChEBI!

References

1. de Matos, P., Alcantara, R., Dekker, A., Ennis, M., Hastings, J., Haug, K., Spiteri, I., Turner, S., & Steinbeck, C. (2009). Chemical Entities of Biological Interest: an update. Nucleic Acids Research DOI: 10.1093/nar/gkp886
2. Warr, W. (2009). ChEMBL. An interview with John Overington, team leader, chemogenomics at the European Bioinformatics Institute Outstation of the European Molecular Biology Laboratory (EMBL-EBI). Journal of Computer-Aided Molecular Design, 23 (4), 195-198 DOI: 10.1007/s10822-009-9260-9
3. Li, Y. et al. (1979) K’o Hsueh T’ung Pao, 24, 667 [Chem. Abstr., 91, 211376u].
4. Singh, N., & Panwar, V. (2006). Case Report of a Pituitary Macroadenoma Treated With Artemether. Integrative Cancer Therapies, 5 (4), 391-394 DOI: 10.1177/1534735406295311
5. Wu, Z., Gao, C., Wu, Y., Zhu, Q., Chen, Y., Liu, X., & Liu, C. (2009). Inhibitive Effect of Artemether on Tumor Growth and Angiogenesis in the Rat C6 Orthotopic Brain Gliomas Model. Integrative Cancer Therapies, 8 (1), 88-92 DOI: 10.1177/1534735408330714

June 4, 2009

Improving the OBO Foundry Principles

Filed under: biocuration,data mining,informatics,semweb — Duncan Hull @ 1:48 pm
Tags: Alan Ruttenberg, Allyson Lister, Barry Smith, bbsrc, Bioportal, ChEBI, Chris Mungall, ebi, Frank Gibson, frolleague, Gene Ontology, Mark Musen, Melanie Courtot, Michael Ashburner, Michel Dumontier, nactem, OBO, OBO Foundry, OBO Smithy, OBO Workshop, obology, old smithy, ontology, ontolojoke, owl, principles, pubmed, REFINE, Richard Scheuermann, sbml, Suzi Lewis, ten commandments, workshop

The Open Biomedical Ontologies (OBO) are a set of reference ontologies for describing all kinds of biomedical data, see [1-5] for examples. Every year, users and developers of these ontologies gather from around the globe for a workshop at the EBI near Cambridge, UK. Following on from the first workshop last year, the 2nd OBO workshop 2009 is fast approaching.

In preparation, I’ve been revisiting the OBO Foundry documentation, part of which establishes a set of principles for ontology development. I’m wondering how they could be improved because these principles are fundamental to the whole effort. We’ve been using one of the OBO ontologies (called Chemical Entities of Biological Interest (ChEBI)) in the REFINE project to mine data from the PubMed database. OBO Ontologies like ChEBI and the Gene Ontology are really crucial to making sense of the massive data which are now common in biology and medicine – so this is stuff that matters.

The OBO Foundry Principles, a sort of Ten Commandments of Ontology (or Obology if you prefer), currently look something like this (copied directly from obofoundry.org/crit.shtml):

1. The ontology must be open and available to be used by all without any constraint other than (a) its origin must be acknowledged and (b) it is not to be altered and subsequently redistributed under the original name or with the same identifiers. The OBO ontologies are for sharing and are resources for the entire community. For this reason, they must be available to all without any constraint or license on their use or redistribution. However, it is proper that their original source is always credited and that after any external alterations, they must never be redistributed under the same name or with the same identifiers.
2. The ontology is in, or can be expressed in, a common shared syntax. This may be either the OBO syntax, extensions of this syntax, or OWL. The reason for this is that the same tools can then be usefully applied. This facilitates shared software implementations. This criterion is not met in all of the ontologies currently listed, but we are working with the ontology developers to have them available in a common OBO syntax.
3. The ontology possesses a unique identifier space within the OBO Foundry. The source of a term (i.e. class) from any ontology can be immediately identified by the prefix of the identifier of each term. It is, therefore, important that this prefix be unique.
4. The ontology provider has procedures for identifying distinct successive versions.
5. The ontology has a clearly specified and clearly delineated content. The ontology must be orthogonal to other ontologies already lodged within OBO. The major reason for this principle is to allow two different ontologies, for example anatomy and process, to be combined through additional relationships. These relationships could then be used to constrain when terms could be jointly applied to describe complementary (but distinguishable) perspectives on the same biological or medical entity. As a corollary to this, we would strive for community acceptance of a single ontology for one domain, rather than encouraging rivalry between ontologies.
6. The ontologies include textual definitions for all terms. Many biological and medical terms may be ambiguous, so terms should be defined so that their precise meaning within the context of a particular ontology is clear to a human reader.
7. The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.
8. The ontology is well documented.
9. The ontology has a plurality of independent users.
10. The ontology will be developed collaboratively with other OBO Foundry members.
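The third principle, that the source ontology of a term can be read straight off the identifier prefix, is easy to demonstrate. The following is a minimal sketch, assuming identifiers of the form PREFIX:LOCAL_ID such as CHEBI:195280; the function names are hypothetical and this is not part of any OBO tooling.

```python
from typing import Dict, Iterable, List, Tuple


def split_obo_id(term_id: str) -> Tuple[str, str]:
    """Split an identifier such as 'CHEBI:195280' into (prefix, local id)."""
    prefix, sep, local_id = term_id.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a PREFIX:LOCAL_ID identifier: {term_id!r}")
    return prefix, local_id


def group_by_source(term_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Group term identifiers by the ontology prefix they carry."""
    groups: Dict[str, List[str]] = {}
    for term_id in term_ids:
        prefix, _ = split_obo_id(term_id)
        groups.setdefault(prefix, []).append(term_id)
    return groups


if __name__ == "__main__":
    # CHEBI:195280 is artemether; GO:0008150 is the Gene Ontology
    # root term biological_process. Both are used purely as examples.
    terms = ["CHEBI:195280", "GO:0008150"]
    for source, ids in group_by_source(terms).items():
        print(source, ids)
```

The grouping only works if no two ontologies share a prefix, which is exactly why the principle insists that the prefix be unique.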

I’ve been asking all my frolleagues what they think of these principles and have got some lively responses, including some here from Allyson Lister, Mélanie Courtot, Michel Dumontier and Frank Gibson. So what do you think? How could these guidelines be improved? Do you have any specific (and preferably constructive) criticisms of these ambitious (and worthy) goals? Be bold, be brave and be polite. If you have anything controversial or “off the record”, you can email it to me… I’m all ears.

CC-licensed picture above of the Old Smithy (pub) by Loop Oh. Inspired by Michael Ashburner’s standing OBO joke (Ontolojoke) which goes something like this: Because Barry Smith is one of the leaders of OBO, should the project be called the OBO Smithy or the OBO Foundry? 🙂

References

1. Noy, N., Shah, N., Whetzel, P., Dai, B., Dorf, M., Griffith, N., Jonquet, C., Rubin, D., Storey, M., Chute, C., & Musen, M. (2009). BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research DOI: 10.1093/nar/gkp440
2. Côté, R., Jones, P., Apweiler, R., & Hermjakob, H. (2006). The Ontology Lookup Service, a lightweight cross-platform tool for controlled vocabulary queries. BMC Bioinformatics, 7 (1) DOI: 10.1186/1471-2105-7-97
3. Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg, L., Eilbeck, K., Ireland, A., Mungall, C., Leontis, N., Rocca-Serra, P., Ruttenberg, A., Sansone, S., Scheuermann, R., Shah, N., Whetzel, P., & Lewis, S. (2007). The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nature Biotechnology, 25 (11), 1251-1255 DOI: 10.1038/nbt1346
4. Smith, B., Ceusters, W., Klagges, B., Köhler, J., Kumar, A., Lomax, J., Mungall, C., Neuhaus, F., Rector, A., & Rosse, C. (2005). Relations in biomedical ontologies. Genome Biology, 6 (5) DOI: 10.1186/gb-2005-6-5-r46
5. Bada, M., & Hunter, L. (2008). Identification of OBO nonalignments and its implications for OBO enrichment. Bioinformatics, 24 (12), 1448-1455 DOI: 10.1093/bioinformatics/btn194

June 1, 2009

Scott Marshall on Interoperability

Filed under: biocuration,seminars,semweb — Duncan Hull @ 9:36 am
Tags: Bio2RDF, caBIG, CDISC, Concept Web Alliance, HCLS, hclsig, HL7, myexperiment, nactem, NCBO, neurocommons, ontology, Ontotext, PRISM, Scott Marshall, w3c, world wide web consortium

Scott Marshall is visiting Manchester this week and will be giving a seminar on Friday 5th June. Here are some details for anyone who is interested in attending:

Speaker: Dr. M. Scott Marshall, The University of Amsterdam

Date/Time: 5th June 2009, 11:00

Location: Room MLG.001 (Lecture Theatre), MIB building (number 16 on campus map)

Title: Standards Enabled Interoperability: W3C Semantic Web for Health Care and Life Sciences Interest Group

Abstract: The W3C Semantic Web for Health Care and Life Sciences Interest Group (HCLS) has the mission of developing, advocating for, and supporting the use of Semantic Web technologies for biological science, translational medicine and health care. HCLS covers hot topics including data integration and federation, bridging commonly used domain standards such as CDISC and HL7, and the applications of medical terminologies. This talk will introduce the HCLS, as well as provide an overview of the activities that are currently ongoing within the task forces, as well as new developments and the recent Face2Face meeting. The role of information extraction and the current interest in Shared Identifiers will also be discussed.

References

Ruttenberg, A., Rees, J., Samwald, M., & Marshall, M. (2009). Life sciences on the Semantic Web: the Neurocommons and beyond Briefings in Bioinformatics, 10 (2), 193-204 DOI: 10.1093/bib/bbp004

May 6, 2009

Michel Dumontier on Representing Biochemistry

Filed under: biocuration,informatics,seminars,semweb — Duncan Hull @ 9:13 am
Tags: big data, biochemistry, biocurator, bioinformatics, ChEBI, ChemAxiom, cheminformatics, curation, database, drug discovery, Gene Ontology, ISWC, Michel Dumontier, ontology, owl, semantic web

Michel Dumontier is visiting Manchester this week and will be giving a seminar on Monday 11th of May. Here are some details for anyone who is interested in attending:

Title: Increasingly Accurate Representation of Biochemistry

Speaker: Michel Dumontier, dumontierlab.com

Time: 14.00, Monday 11th May 2009
Venue: Atlas 1, Kilburn Building, University of Manchester, number 39 on the Google Campus Map

Abstract: Biochemical ontologies aim to capture and represent biochemical entities and the relations that exist between them in an accurate manner. A fundamental starting point is biochemical identity, but our current approach for generating identifiers is haphazard and consequently integrating data is error-prone. I will discuss plausible structure-based strategies for biochemical identity whether it be at molecular level or some part thereof (e.g. residues, collection of residues, atoms, collection of atoms, functional groups) such that identifiers may be generated in an automatic and curator/database independent manner. With structure-based identifiers in hand, we will be in a position to more accurately capture context-specific biochemical knowledge, such as how a set of residues in a binding site are involved in a chemical reaction including the fact that a key nitrogen atom must first be de-protonated. Thus, our current representation of biochemical knowledge may improve such that manual and automatic methods of biocuration are substantially more accurate.

Update: Slides are now available via SlideShare.

[Creative Commons licensed picture of Michel in action at ISWC 2008 from Tom Heath]

References

Michel Dumontier and Natalia Villanueva-Rosales (2009) Towards pharmacogenomics knowledge discovery with the semantic web Briefings in Bioinformatics DOI:10.1093/bib/bbn056
Doug Howe et al (2008) Big data: The future of biocuration Nature 455, 47-50 doi:10.1038/455047a
