O'Really?

September 4, 2009

XML training in Oxford

The XML Summer School returns this year at St. Edmund Hall, Oxford from 20th-25th September 2009. As always, it’s packed with high-quality technical training for every level of expertise, from the Hands-on Introduction for beginners through to special classes devoted to XQuery and XSLT, Semantic Technologies, Open Source Applications, Web 2.0, Web Services and Identity. The Summer School is also a rare opportunity to experience what life is like as a student in one of the world’s oldest university cities while enjoying a range of social events that are a part of the unique summer school experience.

This year, classes and sessions are taught and chaired by:

The Extensible Markup Language (XML) has been around for just over ten years, quickly and quietly finding its niche in many different areas of science and technology. It has been used in everything from modelling biochemical networks in systems biology [1], to electronic health records [2], scientific publishing, the provision of the PubMed service (which talks XML) [3] and many other areas. As a crude measure of its importance in biomedical science, PubMed currently has no fewer than 800 peer-reviewed publications on XML. It’s hard to imagine life without it. So whether you’re a complete novice looking to learn more about XML or a seasoned veteran wanting to improve your knowledge, register your place and find out more by visiting xmlsummerschool.com. I hope to see you there…
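To give a flavour of what “talks XML” means in practice, here is a minimal sketch that reads a search result with Python’s standard library. The element names (eSearchResult, Count, IdList, Id) are my understanding of what a PubMed search reply looks like and should be treated as an assumption; the count and IDs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only: shaped like a PubMed search reply,
# but the count and the IDs below are invented for this sketch.
SAMPLE = """<eSearchResult>
  <Count>3</Count>
  <IdList>
    <Id>11111111</Id>
    <Id>22222222</Id>
    <Id>33333333</Id>
  </IdList>
</eSearchResult>"""


def summarise(xml_text):
    """Return (reported count, list of IDs) from a search-result document."""
    root = ET.fromstring(xml_text)
    count = int(root.findtext("Count"))
    ids = [node.text for node in root.findall("./IdList/Id")]
    return count, ids


count, ids = summarise(SAMPLE)
print(count, ids)
```

The same few lines work for any well-formed XML, which is a large part of why the format has spread so quietly.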

References

  1. Hucka, M. (2003). The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models Bioinformatics, 19 (4), 524-531 DOI: 10.1093/bioinformatics/btg015
  2. Bunduchi R, Williams R, Graham I, & Smart A (2006). XML-based clinical data standardisation in the National Health Service Scotland. Informatics in primary care, 14 (4) PMID: 17504574
  3. Sayers, E., Barrett, T., Benson, D., Bryant, S., Canese, K., Chetvernin, V., Church, D., DiCuccio, M., Edgar, R., Federhen, S., Feolo, M., Geer, L., Helmberg, W., Kapustin, Y., Landsman, D., Lipman, D., Madden, T., Maglott, D., Miller, V., Mizrachi, I., Ostell, J., Pruitt, K., Schuler, G., Sequeira, E., Sherry, S., Shumway, M., Sirotkin, K., Souvorov, A., Starchenko, G., Tatusova, T., Wagner, L., Yaschenko, E., & Ye, J. (2009). Database resources of the National Center for Biotechnology Information Nucleic Acids Research, 37 (Database) DOI: 10.1093/nar/gkn741

June 23, 2009

Impact Factor Boxing 2009

[This post is part of an ongoing series about impact factors]

The latest results from the annual impact factor boxing world championship contest are out. This is a combat sport where scientific journals are scored according to their supposed influence and impact in Science. This year’s competition rankings include the first-ever update to the newly introduced Five Year Impact Factor and Eigenfactor™ Metrics [1,2] in Journal Citation Reports (JCR) on the Web (see www.isiknowledge.com/JCR; warning: clunky website requires subscription*), presumably in response to widespread criticism of impact factors. The Eigenfactor™ seems to correlate quite closely with the impact factor scores; both work at the level of the journal, although they use different methods for measuring a given journal’s impact. However, what many authors are often more interested in is the impact of an individual article, not the journal where it was published. So it would be interesting to see how the figures below tally with Google Scholar; see also comments by Abhishek Tiwari. I’ve included a table below of bioinformatics impact factors, updated for June 2009. Of course, when I say 2009 (today), I mean 2008 (these are the latest figures available based on data from 2007) – so this shiny new information published this week is already out of date [3] and flawed [4,5] but here is a selection of the data anyway: [update: see figures published in June 2010.]

2008 data from isiknowledge.com/JCR; the last two columns are the Eigenfactor™ Metrics.

| Journal Title | Total Cites | Impact Factor | 5-Year Impact Factor | Immediacy Index | Articles | Cited Half-life | Eigenfactor™ Score | Article Influence™ Score |
|---|---|---|---|---|---|---|---|---|
| BMC Bioinformatics | 8141 | 3.781 | 4.246 | 0.664 | 607 | 2.8 | 0.06649 | 1.730 |
| OUP Bioinformatics | 30344 | 4.328 | 6.481 | 0.566 | 643 | 4.8 | 0.18204 | 2.593 |
| Briefings in Bioinformatics | 2908 | 4.627 | | 1.273 | 44 | 4.5 | 0.02188 | |
| PLoS Computational Biology | 2730 | 5.895 | 6.144 | 0.826 | 253 | 2.1 | 0.03063 | 3.370 |
| Genome Biology | 9875 | 6.153 | 7.812 | 0.961 | 229 | 4.4 | 0.07930 | 3.858 |
| Nucleic Acids Research | 86787 | 6.878 | 6.968 | 1.635 | 1070 | 6.5 | 0.37108 | 2.963 |
| PNAS | 416018 | 9.380 | 10.228 | 1.635 | 3508 | 7.4 | 1.69893 | 4.847 |
| Science | 409290 | 28.103 | 30.268 | 6.261 | 862 | 8.4 | 1.58344 | 16.283 |
| Nature | 443967 | 31.434 | 31.210 | 8.194 | 899 | 8.5 | 1.76407 | 17.278 |
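The claim that the Eigenfactor™ metrics track the impact factor can be roughly checked against the table above. This is a small sketch, not a proper bibliometric analysis: it computes a plain Pearson correlation over the eight journals with complete rows (Briefings in Bioinformatics is left out because two of its values are missing), and the choice of Pearson is mine.

```python
from math import sqrt

# (impact factor, Eigenfactor score, Article Influence score), copied from the
# table above. Briefings in Bioinformatics is omitted: its row is incomplete.
JOURNALS = {
    "BMC Bioinformatics": (3.781, 0.06649, 1.730),
    "OUP Bioinformatics": (4.328, 0.18204, 2.593),
    "PLoS Computational Biology": (5.895, 0.03063, 3.370),
    "Genome Biology": (6.153, 0.07930, 3.858),
    "Nucleic Acids Research": (6.878, 0.37108, 2.963),
    "PNAS": (9.380, 1.69893, 4.847),
    "Science": (28.103, 1.58344, 16.283),
    "Nature": (31.434, 1.76407, 17.278),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


impact = [row[0] for row in JOURNALS.values()]
eigenfactor = [row[1] for row in JOURNALS.values()]
influence = [row[2] for row in JOURNALS.values()]

r_eigenfactor = pearson(impact, eigenfactor)
r_influence = pearson(impact, influence)
print(f"impact factor vs Eigenfactor score:       r = {r_eigenfactor:.3f}")
print(f"impact factor vs Article Influence score: r = {r_influence:.3f}")
```

With only eight journals, two of them far ahead of the rest, any correlation here mostly reflects the gap between the heavyweights and everyone else, so treat the numbers as a sanity check and nothing more.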

The internet is radically changing the way we communicate and this includes scientific publishing. As media mogul Rupert Murdoch once pointed out, “big will not beat small any more – it will be the fast beating the slow”. An interesting question for publishers and scientists is: how can the Web help the faster flyweight and featherweight boxers (smaller journals) compete and punch above their weight with the reigning world champion heavyweights (Nature, Science and PNAS)? Will the heavyweight publishers always have the killer knockout punches? If you’ve got access to the internet, then you already have a ringside seat from which to watch all the action. This fight should be entertaining viewing and there is an awful lot of money riding on the outcome [6-11].

Seconds away, round two…

References

  1. Fersht, A. (2009). The most influential journals: Impact Factor and Eigenfactor Proceedings of the National Academy of Sciences, 106 (17), 6883-6884 DOI: 10.1073/pnas.0903307106
  2. Bergstrom, C., & West, J. (2008). Assessing citations with the Eigenfactor Metrics Neurology, 71 (23), 1850-1851 DOI: 10.1212/01.wnl.0000338904.37585.66
  3. Cockerill, M. (2004). Delayed impact: ISI’s citation tracking choices are keeping scientists in the dark. BMC Bioinformatics, 5 (1) DOI: 10.1186/1471-2105-5-93
  4. Allen, L., Jones, C., Dolby, K., Lynn, D., & Walport, M. (2009). Looking for Landmarks: The Role of Expert Review and Bibliometric Analysis in Evaluating Scientific Publication Outputs PLoS ONE, 4 (6) DOI: 10.1371/journal.pone.0005910
  5. Grant, R.P. (2009) On article-level metrics and other animals Nature Network
  6. Corbyn, Z. (2009) Do academic journals pose a threat to the advancement of Science? Times Higher Education
  7. Fenner, M. (2009) PLoS ONE: Interview with Peter Binfield Gobbledygook blog at Nature Network
  8. Hoyt, J. (2009) Who is killing science on the Web? Publishers or Scientists? Mendeley Blog
  9. Hull, D. (2009) Escape from the Impact Factor: The Great Escape? O’Really? blog
  10. Murray-Rust, P. (2009) THE article: Do academic journals pose a threat to the advancement of science? Peter Murray-Rust’s blog: A Scientist and the Web
  11. Wu, S. (2009) The evolution of Scientific Impact shirleywho.wordpress.com

* This important data should be freely available (e.g. no subscription), since crucial decisions about the allocation of public money depend on it, but that’s another story.

[More commentary on this post over at friendfeed. CC-licensed Fight Night Punch Test by djclear904]

June 17, 2009

Nettab 2009 Day Two: Wikis ‘n’ Workflows

This is a brief report and some links from the second day of Network Applications and Tools in Biology (NETTAB 2009) in Catania, Sicily. There were two keynotes, on the RNA WikiProject [1] by Alex Bateman and on myExperiment [2] (by me), as well as presentations by (I think, but I wasn’t concentrating enough) Dietlind Gerloff, Giuliano Armano, Frédéric Cadier and Leandro Ciuffo.

Alex Bateman (Wikipedia user:Alexbateman) gave an entertaining talk on the RNA WikiProject: Community annotation of RNA families, where they have taken data from the Rfam database [3] and put it all into regular Wikipedia. This project got quite a lot of media attention back in February. In this case, the primary advantages of “letting go of data” by giving it to Wikipedia are that it is read by everyone who uses Google (where Wikipedia pages are frequently the top search result) and that Wikipedia gets lots more traffic than biological databases like rfam.sanger.ac.uk do. Thanks to wikirank, which tells you what is popular on Wikipedia, it is also possible to quickly compare the popularity of pages; see RNA vs. Ribosomal RNA vs. Micro RNA vs. SnoRNA for an example. The Rfam project have some interesting stats on who makes the most edits to the Rfam pages: it isn’t always the scientists who make important contributions, but anonymous users and machines (e.g. Rfambot, Smackbot and Citation bot) who are often doing most of the hard work. There is a very long tail of contributors who make small contributions, which supports the rule that “90% of users in on-line communities are lurkers who never contribute” and is reminiscent of Citizen Science and Muggles. I wanted to put the slides from this talk on slideshare, but they contain some unpublished data. You can, however, subscribe to the feed of the Rfam and Pfam blog at xfam.wordpress.com, if you’d like to keep up to date on developments in this area.

After the keynote there were presentations by Dietlind Gerloff on Open Knowledge (a new agent-based infrastructure for bioinformatics experimentation – nice pictorial intro using lego here) and Giuliano Armano (?) on ProDaMa-C – a collaborative web application to generate specialised protein structure datasets.

The next keynote was on myexperiment.org, “Where Experimental Work Flows” – my slides on Who are you, Managing collaborative digital identities in bioinformatics with myexperiment are embedded below.

I followed this presentation with a live 30 minute demonstration and discussion of myexperiment. The most interesting question people asked was Why use OpenID instead of full blown Public Key Infrastructure? (answer: OpenID is currently a lot easier and provides good-enough security). The rest of the day is a bit of a blur, I’m with Tim Bray in enjoying the monster adrenaline high of public speaking, but with all that ChEBI:28918 coursing through my veins it can be difficult to think straight (immediately before, during or after a talk)… so you’ll have to take a look at the proceedings for the full details of what happened in the afternoon – but they included Make Histri (great name!), SBMM: Systems Biology Metabolic Modeling Assistant [4] by Ismael Navas-Delgado and Biomedical Applications of the EELA-2 project.

In the evening, there was some Opera dei Pupi (traditional Sicilian puppet theatre), a trip to Acireale and a delicious Italian feast in a ristorante (the name of which I can’t remember) to round off an enjoyable day.

References

  1. Daub, J., Gardner, P., Tate, J., Ramskold, D., Manske, M., Scott, W., Weinberg, Z., Griffiths-Jones, S., & Bateman, A. (2008). The RNA WikiProject: Community annotation of RNA families RNA, 14 (12), 2462-2464 DOI: 10.1261/rna.1200508
  2. De Roure, D., & Goble, C. (2009). Software Design for Empowering Scientists IEEE Software, 26 (1), 88-95 DOI: 10.1109/MS.2009.22
  3. Gardner, P., Daub, J., Tate, J., Nawrocki, E., Kolbe, D., Lindgreen, S., Wilkinson, A., Finn, R., Griffiths-Jones, S., Eddy, S., & Bateman, A. (2009). Rfam: updates to the RNA families database Nucleic Acids Research, 37 (Database) DOI: 10.1093/nar/gkn766
  4. Reyes-Palomares, A., Montanez, R., Real-Chicharro, A., Chniber, O., Kerzazi, A., Navas-Delgado, I., Medina, M., Aldana-Montes, J., & Sanchez-Jimenez, F. (2009). Systems biology metabolic modeling assistant: an ontology-based tool for the integration of metabolic data in kinetic modeling Bioinformatics, 25 (6), 834-835 DOI: 10.1093/bioinformatics/btp061

June 10, 2009

Kenjiro Taura on Parallel Workflows

Kenjiro Taura is visiting Manchester next week from the Department of Information and Communication Engineering at the University of Tokyo. He will be giving a seminar, the details of which are below:

Title: Large scale text processing made simple by GXP make: A Unixish way to parallel workflow processing

Date-time: Monday, 15 June 2009 at 11:00 AM

Location: Room MLG.001, mib.ac.uk

In the first part of this talk, I will introduce a simple tool called GXP make. GXP is a general purpose parallel shell (a process launcher) for multicore machines, unmanaged clusters accessed via SSH, clusters or supercomputers managed by batch scheduler, distributed machines, or any mixture thereof. GXP make is a ‘make’ execution engine that executes regular UNIX makefiles in parallel. Make, though typically used for software builds, is in fact a general framework to concisely describe workflows consisting of sequential commands. Installation of GXP requires no root privileges and needs to be done only on the user’s home machine. GXP make easily scales to more than 1,000 CPU cores. The net result is that GXP make allows an easy migration of workflows from serial environments to clusters and to distributed environments. In the second part, I will talk about our experiences of running a complex text processing workflow developed by Natural Language Processing (NLP) experts. It is an entire workflow that processes MEDLINE abstracts with deep NLP tools (e.g., Enju parser [1]) to generate search indices of MEDIE, a semantic retrieval engine for MEDLINE. It was originally described in a Makefile without any particular provision for parallel processing, yet GXP make was able to run it on clusters with almost no changes to the original Makefile. Time for processing abstracts published in a single day was reduced from approximately eight hours (with a single machine) to twenty minutes with a trivial amount of effort. A larger scale experiment of processing all abstracts published so far and remaining challenges will also be presented.
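The idea that a makefile is really a workflow description can be sketched in a few lines of Python. To be clear, this is not GXP make and says nothing about how it is implemented: the target names and the build function are hypothetical, and the scheduling uses only the standard library. Each target lists its prerequisites, and a target starts as soon as everything it depends on has finished.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter

# Hypothetical text pipeline: each target maps to the targets it depends on,
# like a makefile rule "target: prerequisites".
RULES = {
    "a.parsed": [],
    "b.parsed": [],
    "c.parsed": [],
    "index": ["a.parsed", "b.parsed", "c.parsed"],
}


def build(target):
    """Stand-in for the shell command a makefile rule would run."""
    return f"built {target}"


def run_parallel(rules, jobs=4):
    """Build every target once, starting each when its prerequisites are done."""
    completed = []
    sorter = TopologicalSorter(rules)
    sorter.prepare()
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        running = {}
        while sorter.is_active():
            # Submit everything whose prerequisites have all finished.
            for target in sorter.get_ready():
                running[pool.submit(build, target)] = target
            finished, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in finished:
                future.result()  # re-raise any failure from the worker
                target = running.pop(future)
                completed.append(target)
                sorter.done(target)
    return completed


order = run_parallel(RULES)
print(order)
```

The three parsing targets can run in any order or at the same time, but the index is always built last, which is exactly the guarantee the dependency lines in a makefile give you. For scale, the figures in the abstract (eight hours down to twenty minutes) work out at roughly a 24-fold speed-up.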

References

  1. Miyao, Y., Sagae, K., Saetre, R., Matsuzaki, T., & Tsujii, J. (2008). Evaluating contributions of natural language parsers to protein-protein interaction extraction Bioinformatics, 25 (3), 394-400 DOI: 10.1093/bioinformatics/btn631

June 4, 2009

Improving the OBO Foundry Principles

The Open Biomedical Ontologies (OBO) are a set of reference ontologies for describing all kinds of biomedical data, see [1-5] for examples. Every year, users and developers of these ontologies gather from around the globe for a workshop at the EBI near Cambridge, UK. Following on from the first workshop last year, the 2nd OBO workshop 2009 is fast approaching.

In preparation, I’ve been revisiting the OBO Foundry documentation, part of which establishes a set of principles for ontology development. I’m wondering how they could be improved because these principles are fundamental to the whole effort. We’ve been using one of the OBO ontologies (called Chemical Entities of Biological Interest (ChEBI)) in the REFINE project to mine data from the PubMed database. OBO Ontologies like ChEBI and the Gene Ontology are really crucial to making sense of the massive data which are now common in biology and medicine – so this is stuff that matters.

The OBO Foundry Principles, a sort of Ten Commandments of Ontology (or Obology if you prefer) currently look something like this (copied directly from obofoundry.org/crit.shtml):

  1. The ontology must be open and available to be used by all without any constraint other than (a) its origin must be acknowledged and (b) it is not to be altered and subsequently redistributed under the original name or with the same identifiers. The OBO ontologies are for sharing and are resources for the entire community. For this reason, they must be available to all without any constraint or license on their use or redistribution. However, it is proper that their original source is always credited and that after any external alterations, they must never be redistributed under the same name or with the same identifiers.
  2. The ontology is in, or can be expressed in, a common shared syntax. This may be either the OBO syntax, extensions of this syntax, or OWL. The reason for this is that the same tools can then be usefully applied. This facilitates shared software implementations. This criterion is not met in all of the ontologies currently listed, but we are working with the ontology developers to have them available in a common OBO syntax.
  3. The ontologies possesses a unique identifier space within the OBO Foundry. The source of a term (i.e. class) from any ontology can be immediately identified by the prefix of the identifier of each term. It is, therefore, important that this prefix be unique.
  4. The ontology provider has procedures for identifying distinct successive versions.
  5. The ontology has a clearly specified and clearly delineated content. The ontology must be orthogonal to other ontologies already lodged within OBO. The major reason for this principle is to allow two different ontologies, for example anatomy and process, to be combined through additional relationships. These relationships could then be used to constrain when terms could be jointly applied to describe complementary (but distinguishable) perspectives on the same biological or medical entity. As a corollary to this, we would strive for community acceptance of a single ontology for one domain, rather than encouraging rivalry between ontologies.
  6. The ontologies include textual definitions for all terms. Many biological and medical terms may be ambiguous, so terms should be defined so that their precise meaning within the context of a particular ontology is clear to a human reader.
  7. The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.
  8. The ontology is well documented.
  9. The ontology has a plurality of independent users.
  10. The ontology will be developed collaboratively with other OBO Foundry members.
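Principle 3 is the easiest one to show in code. The sketch below assumes identifiers of the form PREFIX:LOCAL, such as CHEBI:28918 or GO:0008150; the two-entry prefix table is made up for this example and a real one would come from the OBO Foundry’s own listing.

```python
# Prefix-to-ontology table for this sketch only (an assumption, not an
# official registry).
KNOWN_PREFIXES = {
    "CHEBI": "Chemical Entities of Biological Interest",
    "GO": "Gene Ontology",
}


def split_term_id(term_id):
    """Split 'PREFIX:LOCAL' into its two parts, normalising the prefix case."""
    prefix, sep, local_id = term_id.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a PREFIX:LOCAL identifier: {term_id!r}")
    return prefix.upper(), local_id


def source_ontology(term_id):
    """Name the ontology a term comes from, using nothing but its prefix."""
    prefix, _ = split_term_id(term_id)
    return KNOWN_PREFIXES.get(prefix, "unknown")


print(source_ontology("CHEBI:28918"))  # Chemical Entities of Biological Interest
print(source_ontology("GO:0008150"))   # Gene Ontology
```

The lookup only works because no two ontologies share a prefix, which is the whole point of the principle.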

I’ve been asking all my frolleagues what they think of these principles and have got some lively responses, including some here from Allyson Lister, Mélanie Courtot, Michel Dumontier and Frank Gibson. So what do you think? How could these guidelines be improved? Do you have any specific (and preferably constructive) criticisms of these ambitious (and worthy) goals? Be bold, be brave and be polite. If you have anything controversial or “off the record”, you can email it to me… I’m all ears.

CC-licensed picture above of the Old Smithy (pub) by Loop Oh. Inspired by Michael Ashburner‘s standing OBO joke (Ontolojoke) which goes something like this: Because Barry Smith is one of the leaders of OBO, should the project be called the OBO Smithy or the OBO Foundry? 🙂

References

  1. Noy, N., Shah, N., Whetzel, P., Dai, B., Dorf, M., Griffith, N., Jonquet, C., Rubin, D., Storey, M., Chute, C., & Musen, M. (2009). BioPortal: ontologies and integrated data resources at the click of a mouse Nucleic Acids Research DOI: 10.1093/nar/gkp440
  2. Côté, R., Jones, P., Apweiler, R., & Hermjakob, H. (2006). The Ontology Lookup Service, a lightweight cross-platform tool for controlled vocabulary queries BMC Bioinformatics, 7 (1) DOI: 10.1186/1471-2105-7-97
  3. Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg, L., Eilbeck, K., Ireland, A., Mungall, C., Leontis, N., Rocca-Serra, P., Ruttenberg, A., Sansone, S., Scheuermann, R., Shah, N., Whetzel, P., & Lewis, S. (2007). The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration Nature Biotechnology, 25 (11), 1251-1255 DOI: 10.1038/nbt1346
  4. Smith, B., Ceusters, W., Klagges, B., Köhler, J., Kumar, A., Lomax, J., Mungall, C., Neuhaus, F., Rector, A., & Rosse, C. (2005). Relations in biomedical ontologies Genome Biology, 6 (5) DOI: 10.1186/gb-2005-6-5-r46
  5. Bada, M., & Hunter, L. (2008). Identification of OBO nonalignments and its implications for OBO enrichment Bioinformatics, 24 (12), 1448-1455 DOI: 10.1093/bioinformatics/btn194

May 6, 2009

Michel Dumontier on Representing Biochemistry

Michel Dumontier is visiting Manchester this week. He will be giving a seminar on Monday 11th of May; here are some details for anyone who is interested in attending:

Title: Increasingly Accurate Representation of Biochemistry

Speaker: Michel Dumontier, dumontierlab.com

Time: 14.00, Monday 11th May 2009
Venue: Atlas 1, Kilburn Building, University of Manchester, number 39 on the Google Campus Map

Abstract: Biochemical ontologies aim to capture and represent biochemical entities and the relations that exist between them in an accurate manner. A fundamental starting point is biochemical identity, but our current approach for generating identifiers is haphazard and consequently integrating data is error-prone. I will discuss plausible structure-based strategies for biochemical identity whether it be at molecular level or some part thereof (e.g. residues, collection of residues, atoms, collection of atoms, functional groups) such that identifiers may be generated in an automatic and curator/database independent manner. With structure-based identifiers in hand, we will be in a position to more accurately capture context-specific biochemical knowledge, such as how a set of residues in a binding site are involved in a chemical reaction including the fact that a key nitrogen atom must first be de-protonated. Thus, our current representation of biochemical knowledge may improve such that manual and automatic methods of biocuration are substantially more accurate.
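One way to picture an identifier generated “in an automatic and curator/database independent manner” is to derive it from the structure description itself. The sketch below is only an illustration of that idea and is not the approach described in the talk: it assumes a canonical structure string is already available (InChI strings are used here), and both the hashing scheme and the struct: prefix are hypothetical.

```python
import hashlib


def structure_id(canonical_structure, length=16):
    """Derive an identifier from the structure description alone.

    Hypothetical scheme: a truncated SHA-256 digest of the canonical string.
    """
    digest = hashlib.sha256(canonical_structure.encode("utf-8")).hexdigest()
    return f"struct:{digest[:length]}"


WATER = "InChI=1S/H2O/h1H2"
METHANE = "InChI=1S/CH4/h1H4"

# Two databases that agree on the structure get the same identifier
# without ever talking to each other.
id_from_database_a = structure_id(WATER)
id_from_database_b = structure_id(WATER)
print(id_from_database_a == id_from_database_b)  # True
print(structure_id(WATER) == structure_id(METHANE))  # False
```

The hard part, of course, is agreeing on the canonical description in the first place, especially for parts of molecules such as residues or functional groups.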

Update: Slides are now available via SlideShare.

[Creative Commons licensed picture of Michel in action at ISWC 2008 from Tom Heath]

References

  1. Michel Dumontier and Natalia Villanueva-Rosales (2009) Towards pharmacogenomics knowledge discovery with the semantic web Briefings in Bioinformatics DOI:10.1093/bib/bbn056
  2. Doug Howe et al (2008) Big data: The future of biocuration Nature 455, 47-50 doi:10.1038/455047a

April 9, 2009

Upcoming Gig: The Scholarly Communication Landscape

Here are the details of an upcoming gig: The Scholarly Communication Landscape, in Manchester on the 23rd of April 2009. If you are interested in coming, you need to register by Monday the 13th April at the official symposium pages.

Why? To help University staff and researchers understand some of the more complex issues embedded in the developments in digital scholarly communication, and to launch Manchester eScholar, the University of Manchester’s new Institutional Repository.

How? Information will be presented by invited speakers, and views and experience exchanged via plenary sessions.

Who For? University researchers (staff and students), research support staff, librarians, research managers, and anyone with an active interest in the field will find this symposium helpful to their developing use and provision of research digital formats. The programme for the symposium currently looks like this:

Welcome and Introduction by Jan Wilkinson, University Librarian and Director of The John Rylands Library.

Session I Chaired by Jan Wilkinson

  • Is the Knowledge Society a ‘social’ Network? Robin Hunt, CIBER, University College London
  • National Perspectives, Costs and Benefits Michael Jubb, Director, Research Information Network
  • The Economics of Scholarly Communication – how open access is changing the landscape Deborah Kahn, Acting Editorial Director Biology, BioMed Central

Session II Chaired by Dr Stella Butler

  • Information wants to be free. So … ? Dr David Booton, School of Law, University of Manchester
  • Putting Repositories in Their Place – the changing landscape of scholarly communication Bill Hubbard, SHERPA, University of Nottingham
  • The Year of Blogging Dangerously – lessons from the blogosphere, by Dr Duncan Hull (errr, that’s me!), mib.ac.uk. This talk will describe how to build an institutional repository using free (or cheap) web-based and blogging tools including flickr.com, slideshare.net, citeulike.org, wordpress.com, myexperiment.org and friendfeed.com. We will discuss some strengths and limitations of these tools and what Institutional Repositories can learn from them.

Session III Chaired by Professor Simon Gaskell

Summary and close by Professor Simon Gaskell, Vice-President for Research

March 16, 2009

February 20, 2009

Mistaken Identity: Google thinks I’m Maurice Wilkins

In a curious case of mistaken identity, Google seems to think I’m Maurice Wilkins. Here is how. If you Google the words DNA and mania (google.com/search?q=dna+mania), one of the first results is a tongue-in-cheek article I wrote two years ago about our obsession with Deoxyribonucleic Acid. Now Google (or more precisely Googlebot) seems to think this article is written by one M Wilkins. That’s M Wilkins as in the physicist Maurice Wilkins, the third man of the double helix (after Watson and Crick) and Nobel prize winner back in ’62. How could such a silly (but amusing) mistake be made? Because the article is about what Wilkins once said, but not actually by Wilkins, and computers can’t tell the difference between these two things. It has been known for some time that Google Scholar has many other mistaken identities for authors like this. Scholar even thinks there is an author called Professor Forgotten Password (a prolific author who has been widely cited in many fields)!

The other curiosity is this: the original post on nodalpoint.org is also counted as a citation in Google Scholar. It’s a bit of a mystery how Scholar actually works, what it includes (and excludes) and how big it is, but you’ll find the article counted as a proper citation for a book about genes. Scientific spammers must be licking their lips at the opportunity to influence results and citation counts with humble blog posts, rather than more kosher articles in peer-reviewed scientific journals.

So what does all this curious interweb mischief tell us?

  1. Identifying people on the web is a tricky business, more complex than most people think
  2. Googlebot needs to have its algowithms tweaked by those Google Scholars at the Googleplex. Not really surprising, what else did you expect from Beta software? (P.S. Googlebot, when you read this, I’m not Maurice Wilkins, that’s not my name. I haven’t won a Nobel prize either.  I’m sort of flattered that you’ve mistaken me for such a distinguished scientist, so I’ll enjoy my alternative identity while it lasts.)
  3. Blogs are increasingly part of the scientific conversation, counted in various bibliometrics, will Google Scholar (and the rest) start indexing other blogs too? Where will this trend leave more conventional bibliometrics like the impact factor?

(Note: These search results were correct at the time of writing, but may change over time, results preserved for posterity on flickr)

References

  1. Maurice Wilkins (2003) The Third Man of the Double Helix: The Autobiography of Maurice Wilkins isbn:0198606656
  2. Péter Jacsó (2008) Savvy searching – Google Scholar revisited. Online Information Review 32: 102-11 DOI:10.1108/14684520810866010 (see also Defrosting the Digital Library)
  3. Douglas Kell (2008) What’s in a name? Guest, ghost and indeed quite imaginary authorships BBSRC blogs
  4. Neil R. Smalheiser and Vetle I. Torvik Author Name Disambiguation (This is a preprint version of a chapter published in Volume 43 (2009) of the Annual Review of Information Science and Technology (ARIST) (B. Cronin, Ed.), which is available from the publisher, Information Today, Inc: http://books.infotoday.com/asist/#arist)
  5. Duncan Hull (2007) DNA mania. Nodalpoint.org
  6. Jules De Martino and Katie White (2008) That’s not my name (video)

November 24, 2008

Embracing Registries of Web Services

Filed under: informatics,web of science — Duncan Hull @ 2:00 pm

If you travel back in time, to around 2002, it isn’t difficult to find people claiming that Web services were going to be the new silver bullet technology to create world peace, eradicate global poverty and finally make some sense of all the data produced by the human genome project. Overhyped? Just a bit. One of the many reasons none of these things happened is that it turned out to be much harder than anticipated to build centralised registries, where people could go to find Web services to perform a given task. Can service registries ever be built? Critics like Tim Bray at Sun Microsystems, for example, have suggested that (quote) “registries are a fantasy”, but some already exist and there are more in the pipeline. This article briefly introduces some of them: Seekda, BioMOBY, the Embrace service registry and the Biocatalogue project.

[Picture: Embracing by tanakwho]
