Some notes from a workshop on blog-based laboratory notebooks, “LaBlogs” (Laboratory Logs / Weblogs), held at the Cosener’s House in Abingdon, Oxfordshire, hosted by Cameron Neylon, a protein biochemist based at the Rutherford Appleton Laboratory (RAL) and the University of Southampton.
Introductions: Who attended
- Cameron Neylon
- Steve Wilson, a PhD student of Jeremy Frey, Southampton
- Dafydd Jones, Cardiff University, structural biochemistry, directed evolution
- Amy Baldwin, postdoc working for Dafydd Jones
- Louise Wilson, lab manager at RAL for Cameron
- Karin Nordgren, University of Southampton, chemistry
- Duncan Hull (Yours Truly), MIB, University of Manchester
- Paolo Missier, University of Manchester
- Peter Darch, PhD student at Oxford University Computing Laboratory
- Guillaume Boucher, postdoc at Southampton
- Lucy Power, PhD student at the Oxford Internet Institute (oii.ac.uk), formerly Ingenta, investigating the social impact of the internet on biologists
- Jenny (surname?), PhD student for Cameron, working on directed evolution
- Luke Clifton, RAL, biological surface science
One of the main aims of the workshop was to investigate the use of blogs as laboratory notebooks. The primary systems discussed and used were Chemtools, myExperiment and Taverna. Chemtools is an electronic lab notebook-cum-blog project led by Jeremy Frey. Much of the notebook system was built by Andrew Milstead, using custom PHP scripts.
Blog-based lab notebooks questionnaire
A questionnaire on blog-based lab notebooks framed most of the first day’s discussion. Some questions and sample answers are shown below.
Question: Can you describe in one or two sentences what is the purpose of a laboratory notebook?
- “to record ideas, protocols, experimental data, process and outcome”
- “a notebook for thought experiments”
- “to record experiments so they can be repeated”
- “to recover from disasters (e.g. lab burning down), for backup and archival”
Question: What level of detail is appropriate for a lab notebook?
Ideally, a lab notebook should record EVERYTHING, but in practice this is not possible because of the cost in time and effort. Some things, e.g. batch numbers and reagent manufacturers, are very difficult to record and unlikely to be of any value.
Industrial labs need to record information for patents: where there is intellectual property, the notebook might be used as primary evidence in resolving patent disputes, so it is signed off by the lab head. Academic lab notebooks generally record less.
Question: Do you regularly include the following in your lab notebook?
- time and date
- experiment number or name
- material batch numbers
- supervisor’s signature
- references to all data
- print outs of all data
- detailed protocols
- safety data
Answers to these questions were very varied.
Question: Where is the raw data stored?
- my computer
- various lab computers
- lab server
- supervisor’s computer
- file of printouts
- printouts in lab book
Question: Is your raw data backed up?
Possible answers: yes, no, important stuff is, don’t know, etc.
Question: How long should data be archived for?
The cost of archiving paper has been estimated at £600 per cubic metre per year.
The cost of creating and managing archival quality digital material has been estimated at £50 per year per megabyte. Proper archival is expensive because data is periodically and automatically copied forward on to new hardware. Most of the cost of archival is maintenance, not the basic storage.
In commercial settings, scientists are often advised to actively destroy lab notebooks because all the data needs to be checked by patent lawyers, which can cost £500 per hour. At Southampton, lab books are currently destroyed three years after they are handed in. This goes against BBSRC and MRC policy.
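To get a feel for what those two estimates mean over an archival period, here is a rough back-of-envelope sketch. The two rates are the figures quoted above; the function names, the example volume, the example file size and the 10 year period are my own assumptions for illustration, not numbers from the workshop.

```python
# Rates quoted in the notes above (GBP).
PAPER_COST_PER_M3_PER_YEAR = 600
DIGITAL_COST_PER_MB_PER_YEAR = 50


def paper_archive_cost(cubic_metres, years):
    """Estimated cost in GBP of archiving paper notebooks."""
    return PAPER_COST_PER_M3_PER_YEAR * cubic_metres * years


def digital_archive_cost(megabytes, years):
    """Estimated cost in GBP of archival quality digital storage."""
    return DIGITAL_COST_PER_MB_PER_YEAR * megabytes * years


if __name__ == "__main__":
    # Hypothetical lab: half a cubic metre of paper, 100 MB of data, 10 years.
    print("paper:  ", paper_archive_cost(0.5, 10))
    print("digital:", digital_archive_cost(100, 10))
```

Both costs scale linearly with time, so under these estimates the length of the archival policy matters as much as the amount of material.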
Question: Data sharing and archival: do you agree with the following?
Possible answers to each statement: yes | no | don't know
- I own the data I generate
- My institution owns the data I generate
- My funding body owns the data I generate
- I have moral rights over the data I generate
- It is my responsibility to ensure the archival of my research results
- The results of government funded research should be freely available
- I have the responsibility to share my results with other researchers
- I have the responsibility to share my raw data with other researchers
The Wellcome Trust now claims some IP on research performed.
Various UK funding bodies have policies on data sharing and archival. Some of these suggest standards for the length of time data should be archived:
- EPSRC: no policy, they don’t care
- BBSRC and MRC have the same policy: 10 years
- Wellcome Trust have a policy of lab notebooks being archived for 10 years, as ASCII, TIFF or PDF/A (an archival, non-proprietary form of PDF)
After the questionnaire there was a demonstration of http://chemtools.chem.soton.ac.uk/. Every object and every procedure in the lab has a blog post (a URL). That is a lot of posts. Tables are important but difficult to code. Some posts can be automated, for example by a script or workflow, but RAL haven’t automated all posts, and it is not immediately obvious how to do this.
Much of the data comes for free from COSHH. Every sample has a bar code, for example http://chemtools.chem.soton.ac.uk/uri/b10, and these can be printed.
Data entry problems cause headaches: PCR (Polymerase Chain Reaction) or PRC (typo)? There are templates for generating posts, e.g. the PCR template lists different types of thermal cycling, primers, the type of DNA (the template, i.e. what the PCR is copying), oligonucleotides etc. Users aren’t forced to use a controlled vocabulary but are encouraged to. A typical experiment might have around 30 templates, but these need constant rewriting.
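The "encouraged but not forced" approach to controlled vocabulary can be sketched in a few lines. This is not how Chemtools does it (that is custom PHP, and I have not seen the code); the vocabulary list and the function names are hypothetical. The idea is just that an unknown tag like "PRC" produces a suggestion rather than a rejection.

```python
import difflib

# Hypothetical controlled vocabulary, loosely based on the PCR template.
VOCABULARY = ["PCR", "primer", "template DNA", "oligonucleotide", "thermal cycling"]


def suggest_terms(tags, vocabulary=VOCABULARY):
    """Map each tag that is not in the vocabulary to its closest known term.

    Tags with no close match map to None. Nothing is rejected: the caller
    decides whether to accept a suggestion.
    """
    suggestions = {}
    for tag in tags:
        if tag in vocabulary:
            continue
        close = difflib.get_close_matches(tag, vocabulary, n=1, cutoff=0.6)
        suggestions[tag] = close[0] if close else None
    return suggestions


def new_post(title, tags):
    """Create a post as a plain dict, keeping the tags exactly as typed."""
    return {"title": title, "tags": list(tags), "suggestions": suggest_terms(tags)}


if __name__ == "__main__":
    post = new_post("Amplify insert", ["PRC", "primer"])
    print(post["suggestions"])
```

Keeping the original tags and storing suggestions alongside them means the user's entry is never silently changed, which fits the "encouraged" rather than "forced" policy.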
Data models don’t survive contact with reality, so simple data models like table and text (the universal data model for bioinformatics) are very commonly used. Paolo points out that the data layer is completely bypassed.
The project uses SIMILE Timeplot to plot temperature, light and humidity in the lab, and SIMILE Timeline to provide an alternative view of the lab notebook, which is also pumped into Yahoo! Pipes (pipes.yahoo.com).
There are Java applets to display agarose gels, spectra etc., and these can be annotated directly.
Blogging is provenance
A web log or blog is what Computer Scientists call provenance. Once a model of provenance is created (what chemtools calls “templates”), Taverna and/or myExperiment could be used to automatically populate that model with data.
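To make the "blogging is provenance" idea concrete, here is a minimal sketch, assuming each post records which other posts it was derived from. The post identifiers and the `derived_from` field are hypothetical and are not taken from Chemtools, Taverna or myExperiment. Following the links backwards from a result gives you its provenance.

```python
def trace_provenance(post_id, posts):
    """Return the ids of every post that post_id was derived from.

    Posts are visited breadth first, so direct inputs come before
    the inputs of those inputs.
    """
    seen = []
    queue = list(posts[post_id]["derived_from"])
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.append(current)
        queue.extend(posts[current]["derived_from"])
    return seen


# Hypothetical notebook: a gel image produced from a PCR run,
# which in turn used a primer and some template DNA.
POSTS = {
    "gel-image": {"derived_from": ["pcr-run"]},
    "pcr-run": {"derived_from": ["primer", "template-dna"]},
    "primer": {"derived_from": []},
    "template-dna": {"derived_from": []},
}

if __name__ == "__main__":
    print(trace_provenance("gel-image", POSTS))
```

A workflow system could in principle write the `derived_from` links itself each time it runs a step, which is the kind of automatic population described above.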
[Lab picture credit Stacina on Flickr]
This work is licensed under a
Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.