O'Really?

January 22, 2007

DNA mania

Filed under: bio — Duncan Hull @ 10:29 pm
Tags: bioinformatics, dna, DNA mania, protein, rna

What does DNA do when it’s not being transcribed into RNA? It causes DNA mania…

“DNA, you know, is Midas’ gold. Everyone who touches it goes mad.”

Read the rest in [1,2]

Do you or your colleagues ever suffer from DNA mania [3,4]? A biochemist friend of mine once semi-jokingly remarked that people’s manic obsession with DNA is a bit like buying some food and being more interested in the bar-code on the packaging than in the food inside. In his particular area of research, DNA is about as exciting as bar-codes, because it doesn’t even leave the nucleus of the cell, at least in eukaryotes. I wonder what readers of nodalpoint think of this analogy? Anyway, as a result of this philosophy, most of his community have developed an unhealthy and manic interest in proteins rather than DNA. You could call this particular obsessive-compulsive disorder “protein mania”.

Depending on the scientific obsession(s) of your particular community, you might need to substitute Protein or RNA for DNA in the above quote, as appropriate. And if that is all too molecular for you, substitute any other of your favourite bioinformatics buzzwords.

References

1. Horace Freeland Judson (1996) The Eighth Day of Creation: Makers of the Revolution in Biology
2. John Sulston (2006) Won for All: How the Drosophila Genome was sequenced: a book by Michael Ashburner
3. André Pichot (1999) Histoire de la notion de gène (one of the first documented uses of the phrase “DNA mania”)
4. Denis Noble (2006) The Music of Life: Biology Beyond the Genome (an antidote to DNA mania and the Dawkinian gene-centric view of Life)
5. DNA Photograph taken by Unapersona in Ciutat de les Arts i les Ciències, Calatrava building, Valencia, Spain.

January 5, 2007

NAR Database Issue 2007: Not Waving But Drowning?

Filed under: Uncategorized — Duncan Hull @ 10:43 pm
Tags: bioinformatics, data tombs, database, Lincoln Stein, Michael Galperin, NAR, Not waving but drowning, Open Access, OUP, Stevie Smith

The 14th annual Nucleic Acids Research (NAR) database issue 2007 has just been published, open-access. This year is the largest yet (again) with 968 molecular biology databases listed, 110 more than the previous one (see figure below). In the world of biological databases, are we waving or drowning?
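As a quick sanity check on those figures, the arithmetic can be sketched in a few lines of Python. The two input numbers come from the paragraph above; the variable names are purely illustrative, and the growth percentage is derived here rather than quoted from the NAR issue.

```python
# Figures quoted in the post: 968 databases in 2007, 110 more than the previous issue.
databases_2007 = 968
increase = 110

# Derived values (not quoted from NAR): last issue's tally and the relative growth.
databases_previous = databases_2007 - increase
growth_percent = 100 * increase / databases_previous

print(f"Previous issue: {databases_previous} databases")
print(f"Growth: {growth_percent:.1f}%")
```

On these numbers the previous issue would have listed 858 databases, so the collection grew by roughly 12.8% in a year.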

Nine hundred and sixty-eight is a lot of databases, and even that mind-boggling number is not an exhaustive or comprehensive tally. But is counting all these databases waving or drowning [1]? Will we ever stop stamp-collecting the databases and tools we have in molecular biology? What prompted this question is that an employee of The Boeing Company once told me they have given up counting their databases because there were just too many. Just think of all the databases of design and technical documentation that accompany the myriad of different aircraft that Boeing manufacture, like the iconic 747 jumbo jet. Now, combine that with all the supply chain, customer and employee information and you can begin to imagine the data deluge that a large multi-national corporation has to handle.

Like Boeing, in biology we’ve clearly got more data than we know what to do with [2,3]. It won’t be news to bioinformaticians and it’s been said many times before, but it’s worth repeating here:

We know how many databases we have, but we don’t know what a lot of the data in these databases means: think of all those mystery proteins of unknown function. It will obviously take time until we understand it all…
Most of the data only begins to make sense when it is integrated or mashed-up with other data. However, we still don’t know how to integrate all these databases, or as Lincoln Stein puts it, “so far their integration has proved problematic” [4], which is a bit of an understatement. Many grandiose schemes for the “integration” of biological databases have been proposed over the years, but unfortunately none have been practical to the point of implementation [5].

Despite this, it is still useful to know how many molecular biology databases there are. At least we know how many databases we are drowning in. Thankfully, unlike at Boeing, most biological data, algorithms and tools are open-source, and more literature is becoming open access, which will hopefully make progress more rapid. But biology is more complicated than a Boeing 747, so we’ve got a long-haul flight ahead of us. OK, I’ve managed to completely overstretch that aerospace analogy now so I’ll stop there.

Whatever databases you’ll be using in 2007, have a Happy New Year mining, exploring and understanding the data they contain, not drowning in it.

References

1. Stevie Smith (1957) Not waving but drowning
2. Michael Galperin (2007) The Molecular Biology Database Collection: 2007 update Nucleic Acids Research, Vol. 35, Database issue. DOI:10.1093/nar/gkl1008
3. Alex Bateman (2007) Editorial: What makes a good database? Nucleic Acids Research, Vol. 35, Database issue. DOI:10.1093/nar/gkl1051
4. Lincoln Stein (2003) Biological Database Integration Nature Reviews Genetics. 4 (5), 337-45. DOI:10.1038/nrg1065
5. Michael Ashburner (2006) Keynote at the Pacific Symposium on Biocomputing (PSB2006) in Hawaii seeAlso Aloha: Biocomputing in Hawaii
6. This post originally published on nodalpoint with comments

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.

December 19, 2006

Taverna 1.5.0

Filed under: Uncategorized — Duncan Hull @ 8:26 pm
Tags: bioinformatics, biomart, feta, myGrid, semantic web, taverna, workflow

Happy Christmas from the myGrid team, who are pleased to announce the release of version 1.5.0 of the Open Source Taverna bioinformatics workflow toolkit [1]. This is now available for download on the Sourceforge site and includes some substantial changes to version 1.4.

Taverna 1.5.0 is a small download, but when first run it downloads and installs the required packages, which can take some time on slow networks. In the near future there will be a mechanism for downloading a bundle of core packages. There are some significant changes in the underlying architecture of Taverna and how it handles core packages and optional plugins, using a system called Raven (see the release notes below).

The documentation is currently being updated and the user documentation should be complete very soon, with the technical documentation following shortly afterwards. The reason for this is to allow the software to be released with some time to spare before the Christmas holidays.

Release notes:

There have been a number of substantial changes in the underlying architecture of Taverna since the previous release. These include:

An overhaul of the User Interface (UI), replacing the unpopular Multiple Document Interface with a cleaner and simpler single document UI which can be customised using Perspectives. There are built in perspectives to allow the design and enactment of workflows, and plugins can integrate with the UI by providing perspectives of their own. Together with this, users are able to create their own layouts built from individual components.
Taverna now allows for multiple workflows to be open and enacted at the same time.
Support for the new BioMart data management system version 0.5, together with backward compatibility for old workflows that used BioMart 0.4.
Better provenance generation and browsing support, through a plugin now known as LogBook.
Better support for semantic service discovery through the Feta plugin [2].
Modularisation of the Taverna code base.
Development and integration of an underlying architecture known as Raven. This allows for Apache Maven-like declaration of dependencies, which are discovered and incorporated into the Taverna system at runtime. Together with the modularisation of the Taverna code base, Raven gives the benefit that updates can be provided dynamically and incrementally, without the need for monolithic releases as in the past. This allows bug fixes and new features to be provided within a very short timescale if necessary. It also provides plugin developers with a greater degree of autonomy and independence from the core Taverna code base.
Improved and more advanced plugin management with the ability to provide immediate updates, and for plugin providers to publish their plugins via XML descriptions.
Numerous bug-fixes including the removal of a number of memory leaks.

JIRA generated release notes and bug status reports can be found here and here


December 12, 2006

Semantic Web for Life Sciences Book

Filed under: semweb — Duncan Hull @ 4:57 pm
Tags: public-semweb-lifesci, w3c

All I want for Christmas is a book about the semantic web, written by people who are actually building and using it, rather than “visionaries” who don’t have to. Maybe this year I’ll be lucky…

A group of semantic webheads (aka HCLSIG the Health Care and Life Sciences Interest Group) led by Christopher J. Baker and Kei-Hoi Cheung and gathered together on public-semweb-lifesci@w3.org have written a book about the semantic web for life sciences.

I haven’t seen the final printed version of this book yet, but if you want to add it to your Christmas Amazon wishlist, it’s called Semantic Web: Revolutionizing Knowledge Discovery in the Life Sciences (ISBN:0387484361). The table of contents for the book (DOI:10.1007/978-0-387-48438-9) has more details if you are interested.

So what about other readers, what bioinformatics presents (not just books) would they like to find under the Christmas tree this year? If you don’t celebrate Christmas, what Solstice wishes do you have?

(see original post at nodalpoint for comments)

Buggotea: Redundant Links in Connotea

Filed under: informatics — Duncan Hull @ 10:09 am
Tags: bibliome, bioinformatics, BioPERL, bug, buggotea, citeulike, connotea, database, Ian Mulvany, Jason Stajich, normalization, PERL, redundancy

Dear Santa, all I want for Christmas* is a better version of Connotea. Please can you sort out its duplicated, redundant links? In my book this particular bug is “buggotea” number one. Here is the problem… [update: buggotea is partially fixed, see comments from Ian Mulvany at the nodalpoint link in the references below]

There is this handy bioinformatics web application called Connotea which I like to use, built by those nice people in the web team at Nature Publishing Group. Most readers of nodalpoint probably already know about it, but because you’re Santa and you’ve been busy lately, let me explain. Connotea can help scientists (not just bioinformaticians) to organise and share their bibliographic references, whilst discovering what other people with similar interests are reading. It’s good, but it has some bugs in it. Since it’s open-source software, anyone with the time, inclination and skills can get hold of the Connotea source code and improve it. There is, however, one particularly nasty redundancy bug in Connotea that is bugging me [1]. I think it should be fixable, and that doing so would make Connotea a significantly better application than it already is. Let’s illustrate this bug with a little story…

(more…)

December 1, 2006

NAR Web Server Issue: Walking in a Webby Wonderland

Filed under: Uncategorized — Duncan Hull @ 3:18 pm
Tags: bioinformatics, data tombs, Gary Benson, NAR, OUP, publish or perish, web, Wonderland

Have you recently built a bioinformatics web application useful to the wider community that you’d like to tell the world about? Are you also looking to score brownie points for a rigorously peer-reviewed publication that stands a reasonable chance of being well cited? If that’s you, then you have one month from today (December 1st) to sort your code out, and get your abstract in, for the fifth annual Nucleic Acids Research (NAR) Web Server issue published by Oxford University Press (OUP) in 2007. All articles in this issue are published under an open access model.

As regular visitors to nodalpoint will already know, every year NAR publishes two special issues: one on databases (annually in January since 1993) and the other on web servers (annually in July since 2003). Authors interested in pre-submitting abstracts for the 2007 Web Server Issue should read the Instructions to Authors for Web Server papers in NAR and send an abstract to Gary Benson at Boston University before December 31st 2006. The deadline for final submission of full articles is January 31st 2007. Gary Benson has taken over this year from the previous web server issue editor, Nobel laureate and Ig Nobel participant, Richard Roberts [1].

One advantage of publishing your application paper in NAR, instead of alternative open access journals like Source Code for Biology and Medicine (SCFBM), is a listing in the bioinformatics links directory [2] and a bigger impact factor [3] of 7.6, if you care about these things. There are, of course, disadvantages to publishing with OUP in NAR, like the expensive open access publishing fees of $1185 to $2370 per article, which are debatable value for money. If you’re living in a ‘List A’ developing country these charges are waived, which makes it tempting to set up a laboratory in Malawi to evade payment…

Anyway, does anyone out there know how OUP prices compare with the complicated BioMed Central membership fees which are presumably required for publication in SCFBM? Another leading open access publisher, the Public Library of Science (PLOS), currently charges from $2000 to $2500 for open access publication. Maybe I’m missing something, but aren’t these charges a lot of money to pay an administrator to shuffle a few bits of paper around and run a web server? Don’t let that put you off submitting your paper though, because in science and academia you will either publish or perish. This is where the web is your friend, because free online availability substantially increases a paper’s impact.

On a lighter note, and now that the festive season is upon us, I’ll hand over to the Christmas crooner Perry Como to sign off:

♫ Sleigh bells ring, are you listening? In the lane, snow is glistening. A beautiful sight, We’re happy tonight, Walking in a webby wonderland. ♫

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.

November 28, 2006

Postdoc Hell: Should I Stay Or Should I Go?

Filed under: bio — Duncan Hull @ 9:50 pm
Tags: careers, hell, Iddo Friedberg, Joe Strummer, nodalpoint, phd, Phillip Bourne, PostDoc, postdoc-hell, PostDoctoral, Ten Simple Rules, The Clash

Sometimes, being a PostDoctoral researcher is a tough life. Thankfully, help is at hand in Philip Bourne and Iddo Friedberg’s guide Ten Simple Rules for Selecting a Post-Doctoral Position, published in PLOS Computational Biology. This article is part of a series of editorials [1,2,3] which discuss various aspects of the weird and wonderful world of scientific research. They are worth reading if you’re at an early stage of your career, although you may not always agree with all the advice given. For example, the article advises PostDocs to:

Think very carefully before extending your graduate work into a postdoc in the same laboratory where you are now – to some professionals this raises a red flag when they look at your resume. Almost never does it maximise your gain of knowledge and experience, but that can be offset by rapid and important publications.

Do any experienced postdocs (or post-postdocs) out there have any opinions on the importance of moving labs after a PhD? What if you’re already in a great lab and like where you work? To what extent is it important to move, just to get new experience and skills? Or as The Clash once put it [4]:

♫ If I go there will be trouble, if I stay it will be double.
So come on and let me know, should I cool it or should I blow? ♫

References

1. Philip Bourne (2006) Ten Simple Rules for Getting Published PLOS Computational Biology
2. Philip Bourne and Leo Chalupa (2006) Ten Simple Rules for Getting Grants PLOS Computational Biology
3. Philip Bourne and Alon Korngreen (2006) Ten Simple Rules for Reviewers PLOS Computational Biology
4. Joe Strummer and Mick Jones (1981) Should I stay or should I go?
5. Jawahar Swaminathan (2006) A ten step plan for PostDoc training nodalpoint.org
6. This post originally on nodalpoint with comments
7. Postdoc Hell, a collection of articles describing the plight of the postdoctoral researcher on citeulike

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.

Update: Mysteriously, Nature jobs used The Clash as the theme for their careers supplement, two weeks after this post was published. See How to ask yourself questions about major career decisions and Should I Stay Or Should I Go?. Coincidence? I wonder if they read nodalpoint?

New, Improved SEMANTIC Web: Now with added meaning

Filed under: funny — Duncan Hull @ 5:59 pm
Tags: Eric Schmidt, Mark Butler, meaning, Rachel Murphy, semantic web

This amusing picture-parody of the semantic web, worth a thousand words, was conceived by Mark Butler for a presentation [1] and drawn by Rachel Murphy of Rude Girl Designs.

References

1. Mark Butler (2003) Is the semantic web hype? Hewlett Packard laboratories presentation at MMU, 2003-03-12
2. Tim Berners-Lee (2006) Welcome to the Semantic Web The Economist: The World in 2007
3. Eric Schmidt (2006) Why the web will win by Eric Schmidt, CEO of Google The Economist: The World in 2007
4. The Romantic Web: Peter Norvig of Google vs Tim Berners-Lee of the Dubya-3-C
5. Burn semantic web, burn!

November 7, 2006

People 2.0: Pioneers of the next generation Web

Filed under: informatics,web — Duncan Hull @ 10:06 pm
Tags: Ewan Birney, Grauniad, Guardian, Ian Holmes, Jimmy Wales, Matt Mullenweg, technorati, Web 2.0, wikipedia, wordpress

UK news-rag The Grauniad has a series of interviews with some of the people behind the next generation web, so-called Web 2.0. After reading these interviews, I can’t help wondering, who are the equivalent pioneers in bioinformatics?

The interviews include…

…and several others too. Most of the interviews are worth reading. I particularly enjoyed Mullenweg’s, which contains a wonderful quote:

Q: What is your big idea?

A: I don’t have big ideas. I sometimes have small ideas, which seem to work out.

So who is currently pioneering the “Web of Science”, Bioinformatics 2.0 if you like? Ensemblian Ewan Birney? Ian Holmes at Berkeley? Or somebody else?

[Image credit: Picture from Steve Jurvetson, this post originally published on nodalpoint with comments]

November 1, 2006

Bioinformatics Impact Factors

Filed under: web — Duncan Hull @ 10:14 pm
Tags: bibliometrics, bioinformatics, H-index, impact factor, JCR, Journal Citation Reports, Lincoln Stein, nodalpoint

There are all sorts of flaws with using impact factors for judging the quality of biomedical research. Love them or hate them, just getting hold of impact factors for journals in bioinformatics and related fields is much harder than it should be, so I thought I’d reproduce some statistics I gathered here. The rankings, which you should use with caution [1,2], are correct as of June 2006 (and apply to citations in 2005) courtesy of Journal Citation Reports®, part of Thomson ISI Web of Knowledge. JCR has a pretty horrible, clunky web interface when compared to some of its rivals [3,4]; maybe one day they’ll make it better. Anyway, this is not a comprehensive list, just a fairly random selection of bioinformatics and computer science journals that publish articles I’ve been reading over the last few years.

Journal	ISI impact factor
Science	30.927
Nature Reviews Molecular Cell Biology	29.852
Cell	29.431
Nature	29.273
Nature Genetics	25.797
Nature Biotechnology	22.378
Nature Reviews Drug Discovery	18.775
PLOS Biology	14.672
PNAS	10.231
Genome Research	10.139
Genome Biology	9.712
Drug Discovery Today	7.755
Nucleic Acids Research	7.552
Bioessays	6.787
Plant Physiology	6.114
Bioinformatics (OUP)	6.019
BMC Bioinformatics	4.958
Proteins: structure, function and bioinformatics	4.684
BMC Genomics	4.092
IEEE Intelligent Systems	2.560
Journal of Computational Biology	2.446
Journal of Biomedical Informatics	2.388
IEEE Internet Computing	2.304
Artificial Intelligence in Medicine	1.882
Comparative and Functional Genomics	0.992
Concurrency and Computation: Practice and experience	0.535
Briefings in Bioinformatics (OUP)	not listed
PLOS Computational Biology	not listed
Journal of Web Semantics	not listed

One point of interest: cheeky young upstart BioMed Central Bioinformatics (going since 2000) seems to be catching up on traditional old-school favourite OUP Bioinformatics (going since 1985), which, as mentioned on nodalpoint, has been publishing some dodgy parser papers lately.
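For anyone who wants to play with these numbers, here is a minimal Python sketch that ranks a handful of rows copied from the table and measures the gap between the two Bioinformatics journals. The dictionary, the function names `ranked` and `gap`, and the use of `None` for “not listed” are all illustrative choices of mine, not anything provided by JCR.

```python
# A few rows copied from the table; None stands in for "not listed".
impact_factors = {
    "Nucleic Acids Research": 7.552,
    "Bioinformatics (OUP)": 6.019,
    "BMC Bioinformatics": 4.958,
    "Journal of Computational Biology": 2.446,
    "Briefings in Bioinformatics (OUP)": None,
}


def ranked(factors):
    """Return (journal, factor) pairs that have a listed factor, highest first."""
    listed = [(name, value) for name, value in factors.items() if value is not None]
    return sorted(listed, key=lambda pair: pair[1], reverse=True)


def gap(factors, leader, chaser):
    """Difference in impact factor between two journals, rounded to 3 places."""
    return round(factors[leader] - factors[chaser], 3)


for name, value in ranked(impact_factors):
    print(f"{value:6.3f}  {name}")

print("Gap:", gap(impact_factors, "Bioinformatics (OUP)", "BMC Bioinformatics"))
```

On the 2005 figures the gap comes out at 1.061 points. That is a single snapshot, so it cannot by itself show whether the gap is closing.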

References

1. PLoS Medicine Editors (2006) The Impact Factor Game: It is time to find a better way to assess the scientific literature PLoS Medicine 3 (6), 291 (6 Jun 2006)
2. Anon (2005) Not-so-deep impact. Research assessment rests too heavily on the inflated status of the impact factor. Nature. 435 (7045), 1003-4 (23 Jun 2005)
3. Jim Giles (2005) Comparison of Google Scholar, Thomson ISI Web of Science and Scopus Citation Database from Elsevier Nature. 438 (7068), 554-5 (01 Dec 2005)
4. Maksim V Plikus, Zina Zhang and Cheng-Ming Chuong (2006) PubFocus: semantic MEDLINE/PubMed citations analytics through integration of controlled biomedical dictionaries and ranking algorithm BMC Bioinformatics 7:424
5. Neil Saunders (2005) Impact factors discussion on nodalpoint
6. This post originally published on nodalpoint with comments
