O'Really?

May 26, 2006

BioGrids: From Tim Bray to Jim Gray (via Seymour Cray)

Filed under: biotech — Duncan Hull @ 11:30 pm

[Image: Recycle, or the Globus Toolkit?]

Grid Computing already plays an important role in the life sciences, and will probably continue to do so for the foreseeable future. BioGrid (Japan), myGrid (UK) and CoreGrid (Europe) are just three current examples; there are many more Grid and Super Duper Computer projects in the life sciences. So, is there an accessible Hitch Hiker’s Guide to the Grid for newbies, especially bioinformaticians?

Unfortunately, much of the Grid Computing literature is esoteric and inaccessible, liberally sprinkled with abstract and woolly concepts like “Virtual Organisations” and served with a large side-order of acronym soup. This makes it difficult for the everyday bioinformatician to understand, or to care. Thankfully, Tim Bray from Sun Microsystems has written an accessible review of the area, “Grids for dummies” if you like. It’s worth a read if you’re a bioinformatician with a need for more heavyweight distributed computing than the web currently provides, but you find that Grid-speak is usually impenetrable nonsense.

One of the things Tim discusses in his review is Microsoftie Jim Gray, who is partly responsible for the 2020 computing initiative mentioned on nodalpoint earlier. Tim describes Jim’s article Distributed Computing Economics, in which Jim uses a wide variety of examples to illustrate the current economics of grids, from “Megaservices” like Google, Yahoo! and Hotmail to the bioinformatician’s favourites, BLAST and FASTA (there’s a back-of-the-envelope version of Jim’s sums further down). So how might Grids affect the average bioinformatician? There are many different applications of Grid computing, but two areas spring to mind:

  1. Running your in silico experiments (genome annotation, sequence analysis, protein interactions etc.) using someone else’s memory, disk space and processors on the Grid. This could mean you can do your experiments more quickly and reliably than on the plain ol’ Web (see the sketch after this list for a modest taste).
  2. Executing high-throughput and long-running experiments, e.g. you’ve got a ton of microarray data and it takes hours or possibly days to analyse computationally.
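
As a modest taste of the first point, here’s a minimal Python sketch that uses Biopython to run a BLAST search on the NCBI’s servers, i.e. somebody else’s processors and somebody else’s copy of the database. The toy query sequence is invented for illustration, and you’d obviously swap in your own sequence, program and database:

    # A minimal sketch, assuming you have Biopython installed.
    from Bio.Blast import NCBIWWW, NCBIXML

    query = "AGTACACTGGTATCGGATCCGCTAAGCGGATTACCGT"  # made-up toy sequence

    # Ship the tiny query over the network; NCBI's machines do the heavy
    # lifting against their copy of the nt database.
    handle = NCBIWWW.qblast("blastn", "nt", query)

    # Parse the XML that comes back and print the top few hits.
    record = NCBIXML.read(handle)
    for alignment in record.alignments[:5]:
        print(alignment.title)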

So if you deal with microarray data daily, you probably know all this stuff already, but Tim’s overview and Jim’s commentary are both accessible pieces to pass on to your colleagues in the lab. If this kind of stuff pushes your buttons, you might also be interested in the eProtein Scientific Meeting and Workshop Proceedings.
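
If you want a feel for the sums behind Jim’s commentary before passing it on, here’s a back-of-the-envelope sketch in Python. The price points are the rough 2003 figures from Distributed Computing Economics as I read them, so treat the exact numbers as assumptions rather than gospel:

    # Back-of-the-envelope version of Jim Gray's grid economics.
    # Price points are rough 2003 figures from his paper; adjust them
    # and the conclusion shifts accordingly.
    WAN_GB_PER_DOLLAR = 1.0     # ~$1 ships one gigabyte across the WAN
    CPU_OPS_PER_DOLLAR = 1e13   # ~$1 buys roughly ten tera-instructions

    # A job is only worth shipping across the network if it does enough
    # computing per byte moved to justify the bandwidth bill.
    bytes_per_dollar = WAN_GB_PER_DOLLAR * 1e9
    break_even = CPU_OPS_PER_DOLLAR / bytes_per_dollar
    print(f"Break-even: ~{break_even:,.0f} CPU instructions per byte moved")
    # => ~10,000 instructions per byte. CPU-hungry jobs like BLAST clear
    #    the bar easily, which is why they suit the Grid; data-heavy but
    #    compute-light analyses are cheaper to run next to the data.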

[This post was originally published on nodalpoint with comments.]
