O'Really?

April 1, 2010

Καλό Πάσχα: Happy Easter: Frohe Ostern

Whatever your inclination, it’s difficult to ignore that, sandwiched between the vernal equinox and Beltane, it’s Easter time already. So Happy Easter, Frohe Ostern or Καλό Πάσχα, as they say down south, to all readers of this O’Really? blog.

If you’re gorging yourself on chocolate (see picture right), you might like to consider the food science behind it all thanks to a book [1] by Stephen Beckett published by the Royal Society of Chemistry. Ever wondered why melted chocolate that is put back in the fridge doesn’t quite taste the same? This book will tell you, and a whole lot more besides.

As for the chocolate bunnies, my language skills (German and Greek) are a load of old arsch, but the cartoon over on the right roughly translates as follows.

  • Chocolate Bunny #1, with missing posterior: “Mein arsch tut weh!!” (“My arse hurts!!”)
  • Chocolate Bunny #2, with missing ears: “Was?” (“What?”)

Happy Chocolatey Holidays wherever you are…

References

  1. Stephen T. Beckett (2000). The Science Of Chocolate Royal Society of Chemistry Publishing DOI: 10.1039/9781847552143
  2. Anon (2010). Book Review of The Science of Chocolate. The Science of Chocolate. By Stephen T. Beckett (Nestle Product Technology Centre). Royal Society of Chemistry, Cambridge. 2008. xii + 240 pp. 6 × 9 in. £24.95. ISBN 978-0-85404-970-7. Journal of Natural Products DOI: 10.1021/np100172s

March 17, 2010

Hunkin’s Hypothesis: Technology Is What Makes Us Human

Cartoonist and engineer Tim Hunkin is probably best known for his exhibits at the Science Museum in London and his Under The Pier Show, “a mad arcade of home-made slot machines & simulator rides on Southwold Pier, Suffolk”. His website is a treasure trove of weird and wonderful things.

Tim has an interesting proposition, let’s call it Hunkin’s Hypothesis [1], that technology is what makes us human:

“Technology isn’t just something outside ourselves, it’s an innate part of human nature, like sex, sleeping or eating, and that its been a major driving force in evolution. Tool using, along with language and bipedalism, is essentially what makes us human. The complicated theories used to explain why we first stood up are largely unnecessary. Our hands simply became too useful for holding tools to waste them on walking.”

He bases this idea on a paper published by Frances Evans [2] about the creative engineering mind. This idea has at least two important implications:

  1. Engineering is a creative and intellectual process that humans do instinctively, not an obsolete and dying skill practiced by dinosaurs.
  2. Engineering is an essential part of education that needs to be taught more in schools and universities. Tim encourages his grandchildren to use spot-welders, glue-guns and soldering irons at every opportunity! In UK schools, health and safety regulations, plus the fear of being sued, often make this tricky.

I’m not sure what to make of Hunkin’s Hypothesis yet, but it’s an intriguing idea that deserves investigation.

References

  1. Tim Hunkin (2006). Technology is what makes us human. timhunkin.com
  2. Frances Evans (1998). Two legs, thing using and talking: The origins of the creative engineering mind AI & Society, 12 (3), 185-213 DOI: 10.1007/BF01206195
  3. Tim Hunkin – The Seaside Inventor, Southwold Pier, Suffolk

[Picture of Tim Hunkin taken from his talk at Cambridge Science Festival, 2010.]

March 16, 2010

DNA, Diversity and You at Cambridge Science Festival

As part of Cambridge Science Festival last weekend, I joined a group of about 40 volunteers from The Sanger and EBI at an event, “DNA, diversity and you”. This was a series of education and outreach events designed to explore how differences in your genetic code make you different from other individuals, and what makes humans different from other living things, with a bit of computational biology thrown in for good measure. Here are some notes on a selection of the activities, in case you ever find yourself trying to explain biology, computer science or bioinformatics to anyone aged 4-18 and beyond. These resources are all tried, tested and fun to work with, for students and teachers alike:

  1. DNA origami: create your own origami DNA molecule, a hands-on way of learning about the double helix structure of DNA.
  2. DNA sequence bracelets (see picture right). Thread coloured beads according to sequence sections from a range of organisms including trout, chimpanzee, butterfly, a flesh-eating microbe and rotting corpse flower.
  3. Yummy gummy DNA (under 5’s) build your own DNA helix out of sweets and cocktail sticks. Then scoff it all afterwards.
  4. What’s my name in DNA? find out what your name is in DNA, and what the corresponding (hypothetical) protein is using software from deCODE.
  5. Function Finders translate DNA into a sequence of amino acids using wooden translator blocks, then find out which organism the amino acid sequence is from.
  6. Genome sizes (with seatbelts) Rank organisms (inc. human, zebrafish, mosquito, sugar cane and yeast) and find out if they are in the right order. Results are often not what you would expect.
  7. Play your genes right. A card-based guessing game which compares the number of genes in the human genome with the number of genes from a range of different organisms including the flu virus, E. coli bacteria, armadillo, rice plant and others.
  8. Genome Jigsaws for illustrating the process of finishing supposedly “finished” genomes, by putting together a square sequence jigsaw following base pairing rules to end up with a complete finished square.
  9. DNA Time Team examines aspects of ancestry and evolution. The activity encourages people to work out the sequence of a common ancestor by filling in the gaps on a simple evolutionary tree.
  10. Spot the difference with proteins. Comparing Heat Shock Protein (HSP) in human and other organisms to illustrate how different regions of the protein vary between different organisms and how this affects function.
  11. Ready, steady sort: a sorting network that demonstrates one technique that computers use to sort through large amounts of information like sequence data. This comes straight from Computer Science Unplugged by Tim Bell, Mike Fellows and Ian Witten. This activity can be done either as a smaller board game, or as a larger floor game. Either way, it’s a lot of fun, especially if you time people for an added competitive element (see video below).
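
For anyone who prefers a keyboard to wooden blocks, the translation step in the Function Finders activity can be sketched in a few lines of Python. This is only an illustration using the standard genetic code; the function name translate and the example sequences are made up here and are not part of the festival materials.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, codons listed in TCAG order.
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string into one-letter amino acids, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    # Step through the sequence three bases at a time; leftover bases are ignored.
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTGA"))  # ATG -> M, GCC -> A, TGA -> stop
```

The sketch reads from the first base only and raises a KeyError on anything other than A, C, G or T, which is fine for a classroom demo but not for real sequence data.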

There were a whole bunch of new activities at the festival this year; maybe these will appear on the Your Genome website in the future. Anyway, it was great fun to get involved. There is nothing quite like the challenge of explaining parallel computing to young kids, teenagers and their parents, which is actually much easier than you’d think if you’ve got access to great teaching materials.
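
The parallel computing idea behind the “Ready, steady sort” game can also be shown in code. The sketch below is a minimal Python illustration that assumes an odd-even transposition network, which is a simple, well-known sorting network and not necessarily the layout drawn on the Computer Science Unplugged board. The function names are invented for this example.

```python
def transposition_network(n):
    """Return n layers of comparators (i, i + 1) for an odd-even transposition network."""
    return [
        [(i, i + 1) for i in range(layer % 2, n - 1, 2)]
        for layer in range(n)
    ]


def run_network(values, layers):
    """Push a list of values through the network and return the sorted copy."""
    values = list(values)
    for layer in layers:
        # Comparators in one layer touch disjoint positions, so on the floor
        # game (or on parallel hardware) they could all happen at the same time.
        for i, j in layer:
            if values[i] > values[j]:
                values[i], values[j] = values[j], values[i]
    return values


layers = transposition_network(6)
print(run_network([6, 5, 4, 3, 2, 1], layers))
```

Each layer plays the role of one row of meeting points on the floor: six people need only six rounds of simultaneous comparisons, however badly shuffled they start out.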

Thanks to Francesca Gale and Louisa Wright for all the hard work that went into organising this fun and successful event.

March 8, 2010

Cambridge Science Festival, 8th-21st March 2010

Madder than the Mad March Hare, more entertaining and surreal than Alice down-a-rabbit-hole in Wonderland: today marks the start of this year’s Cambridge Science Festival:

“Delve into the diversity of science at the Cambridge Science Festival 2010! All aspects of science, technology, engineering and mathematics will be available to visitors of all ages at more than 150 mostly free events over two weeks. This year is the International Year of Biodiversity and the Festival is celebrating this by inviting you to learn more about the colourful creatures on the land and beneath the waves at the many events on offer in University departments and museums.

This year, a Schools Zone has been added into the programme of events, where pupils from local schools will be showcasing their work with interactive exhibits at the University Centre on the 20th March.

Also look out for scientists from the BBSRC in the Grafton Centre during the Festival, who will be on hand to answer your tricky science questions. Watch out for video and audio coverage before and during the Festival on the Guardian website.”

A team of scientists and engineers from the Wellcome Trust Sanger Institute and The EBI will be participating on Saturday 13th March with a session on DNA, diversity and you, and also tackling the thorny issue of Who Owns Science? on Friday 19th March. So if you’re in or near Cambridge over the next couple of weeks, come and say hello, and check out the details in the full programme.

March 4, 2010

Sildenafil citrate: Entity of the Month

Release 66 of Chemical Entities of Biological Interest (ChEBI) is now available, containing 534,521 total entities, of which 20,151 are annotated entities and 698 were submitted via the ChEBI submission tool. This month’s entity of the month is Viagra, also known as sildenafil citrate. (Text below reproduced from the ChEBI website.)

Few chemical compounds are better known to the general public than sildenafil citrate (CHEBI:58987), traded under the name of “Viagra”. The compound was first synthesised by chemists working at Pfizer, with a view to using it for the treatment of hypertension and angina pectoris. Whilst having been found to be ineffective against angina in clinical trials, it has been observed to induce penile erections and was therefore marketed by Pfizer as a drug for the treatment of erectile dysfunction.

A number of synthetic routes for the preparation of the parent sildenafil have been reported [1]. A common industrial synthetic route is through reaction of 4-amino-1-methyl-3-N-propylpyrazole-5-carboxamide and 2-ethoxy-5-(4-methylpiperazin-1-yl)sulfonylbenzoic acid followed by subsequent cyclisation to sildenafil through heating under acidic conditions.

Sildenafil has been shown to be an inhibitor of cyclic guanosine monophosphate specific phosphodiesterase type 5, an enzyme which is responsible for the degradation of 3′,5′-cyclic GMP (cyclic guanosine monophosphate, cGMP) in the corpus cavernosum. This leads to the presence of increased levels of cGMP, which, in turn causes vasodilation of the helicine arteries and thus increased blood flow into the spongy tissue of the penis [2].

Apart from the treatment of sexual dysfunction, sildenafil is also used in the treatment of pulmonary arterial hypertension and works again through relaxation of the arterial wall, which leads to a decrease in arterial resistance [3]. Furthermore – and arguably most interestingly – sildenafil has been found to decrease the time necessary for the re-entrainment of circadian rhythms after phase advances in the light–dark cycle (such as occur on transmeridian eastbound flights) in members of the Cricetidae family [4]*. The discovery was rewarded with the award of an Ig Nobel Prize in Aviation in 2007.

* or as Wikipedia puts it… “Viagra aids jet lag recovery in hamsters” …that’s an interesting side effect.

References

  1. Dunn, P. (2005). Synthesis of Commercial Phosphodiesterase(V) Inhibitors Organic Process Research & Development, 9 (1), 88-97 DOI: 10.1021/op040019c
  2. Webb DJ, Freestone S, Allen MJ, & Muirhead GJ (1999). Sildenafil citrate and blood-pressure-lowering drugs: results of drug interaction studies with an organic nitrate and a calcium antagonist. The American journal of cardiology, 83 (5A) PMID: 10078539
  3. Richalet, J. (2004). Sildenafil Inhibits Altitude-induced Hypoxemia and Pulmonary Hypertension American Journal of Respiratory and Critical Care Medicine, 171 (3), 275-281 DOI: 10.1164/rccm.200406-804OC
  4. Agostino, P., Plano, S., & Golombek, D. (2007). Sildenafil accelerates reentrainment of circadian rhythms after advancing light schedules Proceedings of the National Academy of Sciences, 104 (23), 9834-9839 DOI: 10.1073/pnas.0703388104

[Creative Commons licensed picture of 30 St Mary Axe or the Gherkin – London by Patrick Mayon, see comments on this post at friendfeed]

February 25, 2010

Apache Maven: A Misbehavin’ Build Tool?

Filed under: ChEBI,programming,technology — Duncan Hull @ 11:00 am

One of the many tools we use in our team to manage the development of the ChEBI software is an automated build tool called Apache Maven. Opinions are often divided on whether Maven is a good or a bad thing. Most of them are very subjective, argumentative and often very extended. See why does Maven have such a bad reputation? and 25 things* I hate about Maven for examples.

All this is fairly predictable, and I could add a few tales of Maven woe to the pile myself. But wondering if Maven is any good reminded me of something Bjarne Stroustrup [1,2,3] (one of the people behind the C++ programming language) once said in an article on the problem with programming:

“There are just two kinds of [programming] languages: the ones everybody complains about and the ones nobody uses.”

Actually, when you think about it, this applies to build systems too: there are two kinds. It also applies to just about any technology you care to name; you can crudely classify them all into two categories:

  1. Those technologies everybody complains about…
  2. … and the rest, that nobody uses.

So is Maven any good? Worth using? Worth the pain? Depends on who you ask. What we can say for sure is that, like many technologies, everybody complains about it.

References

  1. Bjarne Stroustrup (2010). Viewpoint: What should we teach new software developers? Why? Communications of the ACM, 53 (1) DOI: 10.1145/1629175.1629192
  2. Bjarne Stroustrup (2007). Evolving a language in and for the real world: C++ 1991-2006 Proceedings of the third ACM SIGPLAN conference on History of programming languages DOI: 10.1145/1238844.1238848
  3. Bjarne Stroustrup (1993). A history of C++: 1979–1991 The second ACM SIGPLAN conference on History of programming languages DOI: 10.1145/154766.155375

* Only 25? That seems like quite a short list to me.

[CC-licensed Chocolate Tools image by JanneM, some commentary on this post over at friendfeed.]

February 12, 2010

The 3rd OBO Foundry Workshop 2010, Cambridge, UK

The Open Biomedical Ontologies (OBO) [1] are a set of reference ontologies for describing all kinds of biomedical data, shared in a centralised repository called The OBO Foundry. Every year, users and developers of these ontologies gather from around the globe for a workshop at the EBI near Cambridge, UK. Following on from the first workshop two years ago and the second workshop last year, it’s already time for the third workshop on February 15th-16th. All the details and agenda are here if you’re interested. This workshop is possible thanks to sponsorship from the BBSRC funds for Workshop on Data Standards and the EU ELIXIR ‘Data Integration & Interoperability’ Package 7.

[Update: outcomes from the workshop are available here, along with a summary of discussion from Monday and a summary of discussion from Tuesday.]

References

  1. Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg, L., Eilbeck, K., Ireland, A., Mungall, C., Leontis, N., Rocca-Serra, P., Ruttenberg, A., Sansone, S., Scheuermann, R., Shah, N., Whetzel, P., & Lewis, S. (2007). The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration Nature Biotechnology, 25 (11), 1251-1255 DOI: 10.1038/nbt1346

[Ultrawide panoramic picture of the Wellcome Trust Genome Campus by Tim Nugent, as featured on the cover of the EMBL-EBI Annual Scientific Report 2009. Making those pictures looks like a lot of fun.]

February 5, 2010

Classic paper: Montagues and Capulets in Science

In preparation for a joint seminar I’ll be doing with Midori Harris here at the EBI, here’s a classic paper [1,2] on the social problems of building biomedical ontologies. This paper is worth reading (or re-reading) because it makes lots of relevant points about the use and abuse of research and how people misunderstand each other [3]. It’s funny (and available Open Access too). Plus, how many papers do you read with an abstract written in the style of Big Bard Bill Shakespeare?

ABSTRACT: Two households, both alike in dignity, In fair Genomics, where we lay our scene, (One, comforted by its logic’s rigour, Claims ontology for the realm of pure, The other, with blessed scientist’s vigour, Acts hastily on models that endure), From ancient grudge break to new mutiny, When ‘being’ drives a fly-man to blaspheme. From forth the fatal loins of these two foes, Researchers to unlock the book of life; Whole misadventured piteous overthrows, Can with their work bury their clans’ strife. The fruitful passage of their GO-mark’d love, And the continuance of their studies sage, Which, united, yield ontologies undreamed-of, Is now the hour’s traffic of our stage; The which if you with patient ears attend, What here shall miss, our toil shall strive to mend.

So if you read the paper, you have to ask yourself, are you a Montague or a Capulet?

References

  1. Carole Goble and Chris Wroe (2004). The Montagues and the Capulets Comparative and Functional Genomics, 5 (8), 623-632 DOI: 10.1002/cfg.442
  2. Carole Goble (2004) The Capulets and Montagues: A plague on both your houses?, SOFG: Standards and Ontologies for Functional Genomics
  3. William Shakespeare (1596) Romeo and Juliet

[Romeo and Juliet picture via Happy Hippo Snacks]

January 21, 2010

Blogging a Book about Bio-Ontologies

If you wanted to write a guide to Biomedical and Biological Ontologies [1], especially the what, why, when, how, where and who, there are at least three choices for publishing your work:

  1. Journal publishing in your favourite scientific journal.
  2. Book publishing with your favourite academic or technical publisher.
  3. Self publishing on a web blog with your favourite blogging software.

Each of these has its own unique problems:

  • The trouble with journals is that they typically don’t publish “how to” guides, although you might be able to publish some kind of review.
  • The trouble with books, and academic books in particular, is that people (and machines) often don’t read them. Also, academic books can be prohibitively expensive to buy and this can make the data inside them less visible and accessible to the widest audience. Unfortunately all that lovely knowledge gets locked up behind publishers’ paywalls. To add insult to injury, most academic books take a very long time to publish, often several years. By the time of printing, the content of many academic books is often very dated.
  • The trouble with blogs is that they aren’t peer-reviewed in the traditional way and they tend to be written by a single person from a not very neutral point of view. Or as Dave once put it, “vanity publishing for arrogant people with an inflated ego”. Ouch.

So the people behind the Ontogenesis network (Robert Stevens and Phillip Lord with funding from the EPSRC grant ref: EP/E021352/1) had an idea. Why not blog a book about Ontology? As a publishing experiment, it might just work by combining the merits of books and blogs in order to overcome their shortcomings. This will involve getting a small group of about twenty people (mostly bio-ontologists) together, and writing about what an ontology is, why you would want a biomedical ontology, how to build one and so on. We will be doing some of the peer-review online too.

As part of an ongoing experiment, we are posting all this information on a blog at http://ontogenesis.knowledgeblog.org. If you’d like to follow, subscribe to the feed and read the manifesto.

References

  1. Yu, A. (2006). Methods in biomedical ontology Journal of Biomedical Informatics, 39 (3), 252-266 DOI: 10.1016/j.jbi.2005.11.006

[Ultrawide panoramic picture of Waterloo station by Tim Nugent]

January 15, 2010

Bio2RDF: Large Scale, Distributed Biological Knowledge Discovery

Filed under: ChEBI — Duncan Hull @ 2:11 pm

Michel Dumontier was visiting the EBI this week. Here are the details of his seminar, Bio2RDF and Beyond! Large Scale, Distributed Biological Knowledge Discovery (slides embedded below), for anyone interested who missed it:

Abstract: The Bio2RDF.org [1] project aims to transform silos of bioinformatics data into a distributed platform for biological knowledge discovery. Initial work focused on building a public database of open-linked data with web-resolvable identifiers that provides information about named entities. This involved a syntactic normalization to convert open data represented in a variety of formats (flatfile, tab, xml, web services) to RDF-based linked data with normalized names (HTTP URIs) and basic typing from source databases. Bio2RDF entities also make reference to other open linked data networks (e.g. dbPedia) thus facilitating traversal across information spaces. However, a significant problem arises when attempting to undertake more sophisticated knowledge discovery approaches such as question answering or symbolic data mining. This is because knowledge is represented in a fundamentally different manner, requiring one to know the underlying data model and reconcile the artefactual differences when they arise. In this talk, we describe our data integration strategy that makes use of both syntactic and semantic normalization to consistently marshal knowledge to a common data model while leveraging explicit logic-based mappings with community ontologies to further enhance the biological knowledgescope. Coupled with the web-service based Semantic Automated Discovery and Integration (SADI) framework, Bio2RDF is well placed to serve up biological data for prediction and analysis.

Some quick notes: Bio2RDF is currently indexing around 5 billion triples, and is built with the open source Virtuoso database. There are some scalability issues in making the system cope with up to a total of 15+ billion triples currently required. There is nothing in Bio2RDF yet that deals with the redundancy problem, e.g. “buggotea” and its friends.
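
To make the “syntactic normalization” step from the abstract a bit more concrete, here is a minimal Python sketch that turns one tab-delimited record into N-Triples with an HTTP URI as the normalized name. It is an illustration only, not Bio2RDF’s actual code: the http://bio2rdf.org/namespace:identifier pattern is my assumption about how the names look, the “record” class URI is hypothetical, and the function names are invented for this example.

```python
# Assumed URI pattern: http://bio2rdf.org/<namespace>:<identifier>
BIO2RDF_BASE = "http://bio2rdf.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def bio2rdf_uri(namespace, identifier):
    """Build a normalized HTTP URI for a record (the pattern is an assumption)."""
    return f"{BIO2RDF_BASE}{namespace.lower()}:{identifier}"


def row_to_ntriples(row):
    """Convert one 'namespace<TAB>identifier<TAB>label' line into N-Triples lines."""
    namespace, identifier, label = row.rstrip("\n").split("\t")
    subject = bio2rdf_uri(namespace, identifier)
    # Hypothetical class URI standing in for "basic typing from source databases".
    record_type = bio2rdf_uri(namespace, "record")
    # Escape backslashes and double quotes so the literal is valid N-Triples.
    literal = label.replace("\\", "\\\\").replace('"', '\\"')
    return [
        f"<{subject}> <{RDF_TYPE}> <{record_type}> .",
        f'<{subject}> <{RDFS_LABEL}> "{literal}" .',
    ]


for triple in row_to_ntriples("CHEBI\t58987\tsildenafil citrate"):
    print(triple)
```

The semantic normalization described in the talk, where records are mapped onto a common data model and community ontologies, is a much harder problem and is not attempted here.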

References

  1. Belleau, F., Nolin, M., Tourigny, N., Rigault, P., & Morissette, J. (2008). Bio2RDF: Towards a mashup to build bioinformatics knowledge systems Journal of Biomedical Informatics, 41 (5), 706-716 DOI: 10.1016/j.jbi.2008.03.004