O'Really?

September 5, 2007

WWW2007: Workflows on the Web

Don't Panic

The hitch-hiking novelist Douglas Noel Adams (DNA) once remarked that the World Wide Web (WWW) is the only thing whose shortened form – ‘double-you double-you double-you-dot’ – takes three times longer to say than what it’s “short” for [1]. If he were still with us today, there would be plenty at the 16th International World Wide Web conference (WWW2007), currently underway in Banff, to interest him. Here are some brief notes on a couple of interesting papers at this year’s conference. They are relevant to bioinformatics and worth reading, whichever type of DNA you’re most interested in.

One full paper [2] by Daniel Goodman describes a scientific workflow language called Martlet. The motivating example is taken from climateprediction.net but I suspect some of the points it makes about scientific workflows are relevant to bioinformatics too. Just like the recent post by Boscoh about functional programming, the paper discusses an inspired-by-Haskell functional approach to building and running workflows. Comparisons with other workflow systems like Taverna / SCUFL are drawn. Despite what the paper says, Taverna already uses a functional model (not an imperative one); it just hasn’t been published yet. The paper also draws comparisons between Martlet and other functional systems, like Google’s Map-Reduce. It concludes that the (allegedly) new Martlet programming model “raises the interesting possibility of a whole set of new algorithms just waiting to be discovered once people start to think about programming in this new way”. Which is an exciting possibility.
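For readers who haven't met the functional style before, here is a minimal Python sketch of a map-reduce computation over partitioned data. It is not Martlet syntax and it is not taken from the paper. The partitions, the numbers and the function names are all invented for illustration. The point is only that each partition is summarised independently (the map step) and the summaries are then merged with an associative function (the reduce step), so the work could in principle be spread over many machines.

```python
from functools import reduce

# Hypothetical data: readings that happen to be split across three workers.
partitions = [[14.0, 15.0], [13.0], [16.0, 17.0, 15.0]]


def summarise(chunk):
    """Map step: boil one partition down to a (sum, count) pair."""
    return (sum(chunk), len(chunk))


def combine(left, right):
    """Reduce step: merge two (sum, count) pairs.

    The merge is associative, so partial results can be combined in any grouping.
    """
    return (left[0] + right[0], left[1] + right[1])


def distributed_mean(chunks):
    """Mean over all partitions without ever concatenating them.

    Assumes at least one reading overall; an empty input would divide by zero.
    """
    total, count = reduce(combine, map(summarise, chunks), (0.0, 0))
    return total / count


print(distributed_mean(partitions))
```

Averaging the per-partition means directly would give the wrong answer when partitions differ in size, which is why the sketch carries the counts along with the sums.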

Another paper, this time a position paper [3] (warning: position paper = arm waving) by Anupriya Ankolekar et al., argues that the Semantic Web and Web-Two-Point-Oh are complementary, rather than competing. Their motivating examples are a bit lame (Blogging a movie? Can’t they think of something more original?) but they make some interesting (and obvious) points. The authors think that aggregators like Yahoo! Pipes will play an important role in the emerging Semantic Web. Currently, there don’t seem to be too many bioinformaticians using Yahoo! Pipes. Perhaps they just don’t share their pipes / workflows yet?

Running in parallel to all of the above is the Health Care and Life Sciences Data Integration for the Semantic Web workshop, where more detailed discussion on the bio semweb is underway. As it’s a workshop, there are no full or position papers, but take a look at The State of the Nation in Life Science Data Integration to get a flavour of what is going on.

Whether functional, semantic, Web-enabled or just buzzword-friendly, there is plenty of action in the scientific workflow field right now. If you’re interested in the webby stuff, next year’s conference, WWW2008, is in Beijing, China. I wonder if they will mark the 10th anniversary of the publication of that Google paper at WWW7 back in 1998? The deadline for papers at WWW2008 will probably be sometime in November 2007, but around 90% of submitted papers will be rejected if previous years are anything to go by. If you’re thinking of doing a paper, DON’T PANIC about those intimidating statistics, because bioinformatics is bursting full of interesting and hard problems that challenge the state-of-the-art. The kind of stuff that will go down well at Dubya Dubya Dubya.

(Photo credit: Fire Monkey Fish)

References

  1. Douglas Adams (1999) Beyond the Brochure: Build it and we will come
  2. Daniel Goodman (2007) Introduction and Evaluation of Martlet, a Scientific Workflow Language for Abstracted Parallelisation doi:10.1145/1242572.1242705
  3. Anupriya Ankolekar, Markus Krötzsch, Thanh Tran and Denny Vrandečić (2007) The Two Cultures: Mashing up Web 2.0 and the Semantic Web doi:10.1145/1242572.1242684


Creative Commons License

This work is licensed under a
Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.


Semantic Biomedical Mashups with Connotea


Mashup or Shutup

The Journal of Biomedical Informatics (JBI) will soon be publishing its special issue on Semantic Biomedical Mashups (can you fit any more buzzwords into a Call For Papers?!). Ben Good and friends have submitted a paper on their Entity Describer, which extends Connotea using some Semantic Web goodness. They’d appreciate your comments on their submitted manuscript over at i9606. As Ben says, their pre-publication turns out to be an interesting experiment “figuring out how blogging might fit into the academic publishing landscape”. If this interests you, get commenting now!

Update: Just spotted this interesting graphic of the Elsevier / Evilsevier logo (snigger), who are the publishers of JBI…

August 7, 2007

Scifoo: Geek Out! Le Geek, C’est Chic…

Deepak Singh and Euan Adie

As well as big famous superstars at Science Foo Camp (scifoo), there is a chance to meet and “geek out” with younger engineers and scientists like Vince Smith, Aaron Swartz and Vaughan Bell.

Aaron Swartz and the Open Library project

On Sunday at scifoo, Aaron (of archive.org) gave a quick demo of the Open Library. Currently this project is taking books that are out of print and not in other book catalogues like Amazon, and making them available online. They are intending to move into archiving scientific journals, so watch that space. I’ve always wondered how the Internet Archive survived financially, and managed all its interesting projects (like the Open Library). It’s all funded by some bloke called Brewster Kahle. They provide some great services, like hosting digital artifacts for free, see http://www.archive.org/create/.

Vince Smith, Museums and Drupal

Vince Smith is a “cyber-taxonomist” at the Natural History Museum in London. He’s a world expert on parasitic lice, and uses a multi-site installation of Drupal, see vsmith.info (Hmmm, that Drupal skin looks familiar…). Vince uses a Drupal module for bibliographic citations, called biblio, which looks handy. It’d be nice to have it on nodalpoint? Anyway, any time spent looking around Vince’s site is time well spent.

Vaughan Bell, Mind Hacker

Vaughan Bell is a clinical psychologist. We chatted about Wikipedia and science, as demonstrated by the Schizophrenia article. He’s also a contributor to the book Mind Hacks and blogs at mindhacks.com. My suitcase is full of free O’Reilly book-schwag I filled my boots with on Friday, one of which is Vaughan’s book. Looks like it will be a good read on the plane home, because my brain is in need of some serious “optimisation”.

(Two more geeks, pictured right, but regular nodalpoint readers will know all about them already, Deepak Singh and Euan Adie.)

There’s plenty more I could blog about scifoo, but I’m all foo-ked up, geeked out and mashed-up. It’s time to go home. For more scifoo blogging see www.technorati.com/tags/scifoo, www.nature.com/scifoo and network.nature.com/blogs/tag/scifoo.

References

  1. Aaaaah: Freak Out! Le Freak, C’est Chic…



August 6, 2007

Scifoo day three: Genome Voyeurism with Lincoln Stein

On day three of Science Foo Camp (scifoo) biologist Lincoln Stein (pictured right) gave a presentation on what he calls “genome voyeurism”, using Jim Watson’s genome as an example. This session demonstrated the current and future possibilities of individuals having their own DNA sequenced, what has been called “personal genomics”.

Unlike the session on genomics yesterday on day two, where George Church, Eric Lander, 23andme, Sergey and Larry (and even Sergey’s pet dog) were all present, today they are conspicuously absent.

Lincoln’s presentation starts with a video (see YouTube video below) of Jim Watson receiving his genome on a disk from Baylor College of Medicine, Houston. Lincoln tells how Jim puts his genome (stored on a hard drive) next to his Nobel prize medallion in his office. After all the press publicity, Jim deposits the data in GenBank, and it becomes available worldwide. (more…)

Scifoo day two: Good Morning Mashup

Vince Smith, Brian Berman, Paul Ginsparg, Linda Miller, John Santini

Some of the most interesting conversations you have at Science Foo Camp (scifoo) are in the corridors, foo bars and even the bus that shuttles between the Googleplex and the hotel… On Saturday, for example, I ride the bus with David Hawkins, who is a lawyer working in the area of climate change. He tells me all about the legal issues, how climate modelling works and a little on Bjørn Lomborg, who is also here. I tell him about workflows on the web and bioinformatics. We work in completely different areas, and we’d never normally meet. But in a short conversation, we manage to learn a little from each other and find connections. The problems that climateprediction.net faces turn out to be quite similar to the problems that genomics faces in integrating data on the web. When we arrive at the Googleplex, it’s time for Open Science… (more…)

August 4, 2007

Scifoo day 1: Turn up, tune in, drop out

Filed under: google — Duncan Hull @ 9:38 pm

Scifoo campers

My boss, Douglas Kell, who has kindly allowed and paid for me to attend Science Foo Camp (scifoo), says to me “tell me what you get up to”. So here goes. Scifoo day 1: a chance to meet around 250 engineers, scientists, philosophers and other odd people from all over the world.

Shortly after arriving at the Googleplex, California and being fed by gourmet chefs, it all starts. There is a quick round of introductions from everyone in the room, then the conference schedule gets put up on a big board and interactively edited like a wiki. Sounds chaotic, but it actually works.

The introductions are followed by some lightning talks by selected people, chaired by Tim O’Reilly and Timo Hannay.

  1. Drew Endy from OpenWetWare talked about biotechnology. He drew analogies between civil engineering and bio-engineering. Today we can build wonderful bridges like the Viaduc de Millau in France. But it hasn’t always been that way. In the stone age, we used rocks as they were to build the likes of Stonehenge. Then we moved to quarrying rock more systematically, so we could build simple bridges. For biotechnology to succeed in the same way as civil engineering, we need to synthesise DNA in the same way as we synthesise concrete to make bridges. But currently, biotechnology is still in its stone age.
  2. Charles Simonyi gave a talk about his recent trip as a space tourist. I’ve never met an astronaut before, and never wondered what it smells like or what the quality of your sleep is like in space. You can find out more about Charles in Space.
  3. Felice Frankel: Visualisation, visualisation, visualisation! (although she doesn’t like that word)

After all this, there’s some time for “corridor conversations” with other delegates, which is where most of the interesting stuff goes on. It’s difficult to pull out a narrative, because there are all kinds of people here: some people I managed to speak to (note form, sorry!):

In his introduction, Tim O’Reilly described scifoo as “making new synapses in the global brain”. You take a load of people from different disciplines, stick them together, and they find all sorts of interesting connections that they might not otherwise have found. It might sound pretentious, but I think it’s true. Unlike larger conferences, scifoo is small and intimate enough to be able to talk to lots of different people, which is one thing that makes it special. This year, they’ve lifted the blogging ban, so everything is public unless stated otherwise. Which means you’ll be hearing lots more about it from bloggers like me at the conference.

Day two will be fun: there are lots of demos, and more people to meet. Martin Rees, how do we survive the twenty-first century given that we’re all going to die?… Must try and pluck up the courage to talk to Sergey but I’m completely starstruck. Brian Cox, Hello, I’ve seen you on the telly… Esther “always make new mistakes” Dyson, Anne Wojcicki, George Church, Eric Lander, Paul Z. Myers… There’s a tonne of bio-people here… So many people, so little time!

[this post originally published on nodalpoint]


May 31, 2007

Google Metabolic Maps

Google in the Palm of my Hand

These days, new Google products and code seem to appear on a weekly basis. Take, for example, Google Gears which takes advantage of SQLite, mentioned on nodalpoint recently. They certainly don’t hang about at the Googleplex in Mountain View, California. Wouldn’t it be great if Google applied some of that engineering expertise and agility to science and bioinformatics? Just imagine: we could have Google Metabolic Maps, a virtual globe of the cell for scientists everywhere…

Scientists have been drawing metabolic maps for a very long time, but unfortunately when it comes to charting and understanding metabolic pathways, we’re still at the “here be dragons” stage of bio-cartography. I’m obviously not the first person to dream of this, but imagine if maps of metabolic pathways looked more like Google Earth or Google Maps than the old-fashioned style of map many life scientists will be familiar with. Now imagine just a little more: that these maps weren’t just available on conventional screens, but were given the Minority Report treatment, courtesy of Mr Bill Gates and his wizzy surface magic at Microsoft. Wouldn’t that be great? Metabolic maps on an interactive tabletop computer. Just like Tom Cruise in the movies, we’d be able to effortlessly swish around metabolism (or the metabolome / proteome / genome / [insert-your-favourite]ome). Imagine if it was all open-source too, no boundaries, no passports…

Now, you may say that I’m a dreamer, but I’m not the only one [1,2,3].

References

  1. Zhenjun Hu, Joe Mellor, Jie Wu, Minoru Kanehisa, Joshua M. Stuart and Charles DeLisi (2007) Towards zoomable multidimensional maps of the cell Nature biotechnology 25 (5), 547-54. DOI:10.1038/nbt1304
  2. Hiroaki Kitano, Akira Funahashi, Yukiko Matuoka and Kanae Oda (2005) Using process diagrams for the graphical representation of biological networks Nature biotechnology 23 (8), 961-6. DOI:10.1038/nbt1111
  3. John Lennon and Yoko Ono (1971) Imagine
  4. this post originally published on nodalpoint with comments



April 13, 2007

Collaboration, collaboration, collaboration!

Geldof Blair collaboration

What should your three main priorities be as a Scientist? Collaboration, collaboration, collaboration. Quentin Vicens and Phil Bourne have just published Ten Simple Rules for a Successful Collaboration [1] to help you do just that, as part of a continuing series [2,3,4,5].

Tony Bliar once said “Ask me my three main priorities for government, and I tell you: education, education, education.” In Science, it’s not so much about education as collaboration, collaboration, collaboration. The advice in Ten Simple Rules is all useful stuff, but what caught my eye is the fact that collaboration is on the rise, at least according to the number of co-authors on papers published in PNAS. The average number of co-authors has risen from 3.9 in 1981 to 8.4 in 2001. So before you publish or perish, it seems likely that you’ll also need to collaborate or commiserate… less laboratory, more collaboratory!

Photo credit Garret Keogh

References

  1. Quentin Vicens and Philip Bourne (2007) Ten Simple Rules for a Successful Collaboration PLOS Computational Biology
  2. Philip Bourne (2006) Ten Simple Rules for Getting Published PLOS Computational Biology
  3. Philip Bourne and Iddo Friedberg (2006) Ten Simple Rules for Selecting a Postdoctoral Position PLOS Computational Biology
  4. Philip Bourne and Leo Chalupa (2006) Ten Simple Rules for Getting Grants PLOS Computational Biology
  5. Philip Bourne and Alon Korngreen (2006) Ten Simple Rules for Reviewers PLOS Computational Biology
  6. This post originally published on nodalpoint with comments


March 30, 2007

This month’s molecule is…

Filed under: biotech — Duncan Hull @ 10:10 pm

Space-filling and backbone model of 1HRY

There are a number of “Molecule of the Month” style mini-reviews on the web, which highlight one particular molecule (usually a protein) every month, in an accessible style. Two of my personal favourites are Protein Spotlight: one month, one protein, written by Vivienne Baillie Gerritsen of the Swiss-Prot team, and Molecule of the Month at the Protein Data Bank (PDB), edited by David Goodsell. Both these features are worth a quick read because they can help bio-literate and bio-curious users to increase and reinforce their knowledge relatively quickly.

Part of what makes the PDB one worth reading is the colourful visualisations and short descriptions that go with it. For March 2007, the PDB’s molecule of the month is Zinc Fingers. Meanwhile, over at Swiss-Prot, the molecule is Sex-determining region Y protein (Sry), used to illustrate the tenuous nature of sex.

[This post originally published on nodalpoint with comments]

February 22, 2007

NSPNAS: Nature, Science or PNAS?

Filed under: publishing,Uncategorized — Duncan Hull @ 10:19 pm

A crude score for benchmarking scientists

TIM

Have you ever wanted to compare different scientists by their publication record? It’s not always an easy task, but here is a crude and handy way to benchmark people by their journal publications in Nature, Science or PNAS using PubMed. Let’s call it the NSPNAS score. It’s not the h-index and it’s far from perfect, but it can be useful.

Imagine these scenarios:

  1. You’re a young scientist contemplating who to do an undergraduate project, Masters degree or PhD with.
  2. You’ve finished your PhD and are wondering which lab could be your Stairway to PostDoc Heaven [1].
  3. You’re lucky enough to have landed a faculty position and you want to check the credibility of your new colleagues.
  4. You want to do some industrial espionage on your competitors in different labs around the world.
  5. You’re a Scientist dammit, and naturally you’re a curious person who just likes to measure things.

In any of these situations, you’ll probably want to look up the people concerned using Google Scholar which will give you a good idea of their research history. But you’re not interested in publications in the Journal of Few Subscribers or the Proceedings of the Boring Incomprehensible Nonsense Society (BINS), even if Google Scholar lists hundreds of their citations. Instead, you care about counting the Big Bang impact publications they have in the über-journals: Nature, Science and PNAS. You can find these publications in PubMed with this simple query:

Surname+Initials[au]+(nature[journal] or science[journal] or Proc Natl Acad Sci U S A[journal])

…and you can obviously modify this query to include popular journals from your own field as appropriate.
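If you want to score a whole list of people, the query is easy to build programmatically. The Python sketch below is one way to do it, under some loudly stated assumptions: the function names are made up for this example, the explicit AND and upper-case OR are my own spelling of the query (the version above relies on spaces and lower-case "or"), and the URL points at NCBI's E-utilities esearch service rather than the PubMed search box used in this post. It only builds the query string and the URL; it does not fetch anything.

```python
from urllib.parse import urlencode

# Journals counted by the score; swap in the big titles from your own field.
DEFAULT_JOURNALS = ("nature", "science", "Proc Natl Acad Sci U S A")

# NCBI E-utilities search endpoint (an assumption: the post itself only
# describes typing the query into PubMed by hand).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def nspnas_query(surname, initials, journals=DEFAULT_JOURNALS):
    """Build the PubMed search term for one author across the given journals."""
    author = f"{surname} {initials}[au]"
    journal_clause = " OR ".join(f"{name}[journal]" for name in journals)
    return f"{author} AND ({journal_clause})"


def nspnas_url(surname, initials, journals=DEFAULT_JOURNALS):
    """Return an esearch URL that asks only for the number of hits."""
    params = {
        "db": "pubmed",
        "term": nspnas_query(surname, initials, journals),
        "rettype": "count",
    }
    # urlencode turns spaces into '+' and escapes the brackets for us.
    return f"{ESEARCH_URL}?{urlencode(params)}"


print(nspnas_query("Kell", "D"))
print(nspnas_url("Kell", "D"))
```

Passing a different tuple as `journals` covers the "popular journals from your own field" case without touching the rest of the code.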

Where NSPNAS works

Note: NSPNAS scores were correct at the time of writing in 2007, but will change over time.

When you substitute an author’s name and initials into the beginning of that query, you get their NSPNAS score. So Systems Biologist Douglas Kell for example, surname and initials “Kell+D[au]”, has an NSPNAS score of 6.

If the person in question has a unique or unusual surname and initials, it’s fairly easy to find their score: nodalpointer Chris Mungall has an NSPNAS score of two while nodalpointer Jason Stajich has an NSPNAS score of three. These results suggest a positive correlation between Californian sunshine and NSPNAS. Meanwhile, back in rainy old Britain, Ensemblian Ewan Birney scores a formidable sixteen, which is just scary for a bloke in his thirties.

Where NSPNAS doesn’t work

Unfortunately, authors with common names like John Smith (who has more than 340 hits) can’t be easily benchmarked with this type of query, without trawling through hundreds of false positives. More importantly, some influential scientists score very low or zero, despite the fact that their work has been important in the world of biomedical science and beyond. This is especially true for Computer Scientists, Mathematicians and Informaticians, for example:

Many important members of the Dead Scientists Society also have low NSPNAS scores…

Conclusions

All these statistics remind us that many important ideas, techniques and results are not published in Nature, Science or PNAS, and others are excluded from the PubMed index completely. They also confirm what we already know about peer-reviewed Journal publications not being the be-all and end-all of Engineering, Science or Medicine [3]. But NSPNAS still has its uses, provided the people you’re benchmarking have a rare name and didn’t snuff it before the PubMed index starts.

What is your NSPNAS score? If, like me, you score a spectacular “nul points”, console yourself with the fact that you’re in good company with that score and, given time, maybe you can change it.

References

  1. Jimmy Page and Robert Plant (1971) Stairway to Heaven
  2. Most of the Clay Mathematics Institute Millennium Prizes are still up for grabs if you get disillusioned with bioinformatics and fancy some fame and a million dollar fortune!
  3. Michael Seringhaus and Mark Gerstein (2007) Publishing perishing? Towards tomorrow’s information architecture BMC Bioinformatics 2007, 8:17 DOI:10.1186/1471-2105-8-17
  4. This post originally on nodalpoint, with comments


