January 5, 2007

NAR Database Issue 2007: Not Waving But Drowning?

The 14th annual Nucleic Acids Research (NAR) database issue has just been published, open access. This year’s issue is the largest yet (again), with 968 molecular biology databases listed, 110 more than in the previous one (see figure below). In the world of biological databases, are we waving or drowning?

NAR Database Growth 2007

Nine hundred and sixty-eight is a lot of databases, and even that mind-boggling number is not an exhaustive tally. But is counting all these databases waving or drowning [1]? Will we ever stop stamp-collecting the databases and tools we have in molecular biology? What prompted this question is that an employee of The Boeing Company once told me they have given up counting their databases because there were just too many. Just think of all the databases of design and technical documentation that accompany the myriad different aircraft that Boeing manufactures, like the iconic 747 jumbo jet. Now combine that with all the supply chain, customer and employee information and you can begin to imagine the data deluge that a large multinational corporation has to handle.

Like Boeing, in biology we’ve clearly got more data than we know what to do with [2,3]. It won’t be news to bioinformaticians and it’s been said many times before, but it’s worth repeating here:

  • We know how many databases we have, but we don’t know what a lot of the data in these databases means: think of all those mystery proteins of unknown function. It will obviously take time until we understand it all…
  • Most of the data only begins to make sense when it is integrated or mashed-up with other data. However, we still don’t know how to integrate all these databases, or as Lincoln Stein puts it, “so far their integration has proved problematic” [4], which is a bit of an understatement. Many grandiose schemes for the “integration” of biological databases have been proposed over the years, but unfortunately none have been practical to the point of implementation [5].

Despite this, it is still useful to know how many molecular biology databases there are. At least we know how many databases we are drowning in. Thankfully, unlike at Boeing, most biological data, algorithms and tools are open-source, and more literature is becoming open access, which will hopefully make progress more rapid. But biology is more complicated than a Boeing 747, so we’ve got a long-haul flight ahead of us. OK, I’ve managed to completely overstretch that aerospace analogy now, so I’ll stop there.

Whatever databases you’ll be using in 2007, have a Happy New Year mining, exploring and understanding the data they contain, not drowning in it.


  1. Stevie Smith (1957) Not waving but drowning
  2. Michael Galperin (2007) The Molecular Biology Database Collection: 2007 update. Nucleic Acids Research, Vol. 35, Database issue. DOI:10.1093/nar/gkl1008
  3. Alex Bateman (2007) Editorial: What makes a good database? Nucleic Acids Research, Vol. 35, Database issue. DOI:10.1093/nar/gkl1051
  4. Lincoln Stein (2003) Biological Database Integration. Nature Reviews Genetics, 4 (5), 337-45. DOI:10.1038/nrg1065
  5. Michael Ashburner (2006) Keynote at the Pacific Symposium on Biocomputing (PSB2006) in Hawaii. See also: Aloha: Biocomputing in Hawaii
  6. This post was originally published on nodalpoint, with comments

December 1, 2006

NAR Web Server Issue: Walking in a Webby Wonderland

Have you recently built a bioinformatics web application useful to the wider community that you’d like to tell the world about? Are you also looking to score brownie points for a rigorously peer-reviewed publication that stands a reasonable chance of being well cited? If that’s you, then you have one month from today (December 1st) to sort your code out, and get your abstract in, for the fifth annual Nucleic Acids Research (NAR) Web Server issue, to be published by Oxford University Press (OUP) in 2007. All articles in this issue are published under an open access model.

As regular visitors to nodalpoint will already know, every year NAR publishes two special issues: one on databases (annually in January since 1993) and the other on web servers (annually in July since 2003). Authors interested in pre-submitting abstracts for the 2007 Web Server Issue should read the Instructions to Authors for Web Server papers in NAR and send an abstract to Gary Benson at Boston University before December 31st 2006. The deadline for final submission of full articles is January 31st 2007. Gary Benson has taken over this year from the previous Web Server issue editor, Nobel laureate and Ig Nobel participant, Richard Roberts [1].

One advantage of publishing your application paper in NAR, instead of alternative open access journals like Source Code for Biology and Medicine (SCFBM), is a listing in the bioinformatics links directory [2] and a bigger impact factor [3] of 7.6, if you care about these things. There are, of course, disadvantages of publishing with OUP in NAR, like the expensive open access publishing fees of $1185 to $2370 per article, which are of debatable value for money. If you’re living in a ‘List A’ developing country these charges are waived, which makes it tempting to set up a laboratory in Malawi to evade payment…

Anyway, does anyone out there know how OUP prices compare with the complicated BioMed Central membership fees, which are presumably required for publication in SCFBM? Another leading open access publisher, the Public Library of Science (PLOS), currently charges from $2000 to $2500 for open access publication. Maybe I’m missing something, but aren’t these charges a lot of money to pay an administrator to shuffle a few bits of paper around and run a web server? Don’t let that put you off submitting your paper though, because in science and academia you will either publish or perish. This is where the web is your friend, because free online availability substantially increases a paper’s impact.

On a lighter note, and now that the festive season is upon us, I’ll hand over to the Christmas crooner Perry Como to sign off:

♫ Sleigh bells ring, are you listening? In the lane, snow is glistening. A beautiful sight, We’re happy tonight, Walking in a webby wonderland. ♫

