- Details of nearly a million BBC radio and TV programmes, dating back 75 years
- Over 500,000 subject categories, from DNA and H5N1 avian flu to Genomes and Genetics Research
- Over a million contributors and appearances, from Nobel Prize winner Paul Nurse and Alan Turing to Craig Venter, Francis Crick and Albert Einstein
Unfortunately, as this is an experimental prototype, the catalogue currently contains only metadata, so there are no audio or video streams yet. As mentioned earlier, the catalogue is based on RDF, which allows the database to be queried with SPARQL and will no doubt please Semantic Webhead Tim Berners-Lee. One of the brains behind this is Matt Biddulph.
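To give a rough feel for what querying RDF with SPARQL means, here is a minimal sketch in plain Python. It is not the BBC's code and does not use the catalogue's real vocabulary: the programme identifiers, the `ex:` predicates, the placeholder titles and the pairing of contributors with subjects are all made up for illustration. It stores a few subject-predicate-object triples and matches them against patterns containing `?variables`, which is roughly what a SPARQL basic graph pattern does.

```python
# Toy in-memory triple store. All identifiers and predicates here are
# hypothetical; they are NOT the BBC catalogue's actual schema or data.
TRIPLES = [
    ("prog:1", "ex:title", "Example programme A"),
    ("prog:1", "ex:subject", "DNA"),
    ("prog:1", "ex:contributor", "Francis Crick"),
    ("prog:2", "ex:title", "Example programme B"),
    ("prog:2", "ex:subject", "Computing"),
    ("prog:2", "ex:contributor", "Alan Turing"),
    ("prog:3", "ex:title", "Example programme C"),
    ("prog:3", "ex:subject", "DNA"),
    ("prog:3", "ex:contributor", "Craig Venter"),
]


def is_var(term):
    """Terms starting with '?' are variables, as in SPARQL."""
    return term.startswith("?")


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for pattern_term, value in zip(first, triple):
            if is_var(pattern_term):
                # Bind the variable, or check it agrees with an earlier binding.
                if candidate.setdefault(pattern_term, value) != value:
                    ok = False
                    break
            elif pattern_term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


# Roughly equivalent to the SPARQL query:
#   SELECT ?who WHERE { ?prog ex:subject "DNA" . ?prog ex:contributor ?who }
query = [
    ("?prog", "ex:subject", "DNA"),
    ("?prog", "ex:contributor", "?who"),
]
results = sorted(b["?who"] for b in match(query, TRIPLES))
print(results)
```

A real SPARQL engine adds prefixes, filters, optional patterns and much more, but the core idea of joining triple patterns on shared variables is the same.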
I wonder if a similar application could be built using the UniProt protein sequence and annotation data in RDF, or the data currently being produced by the W3C BioRDF subgroup? Compared to biological databases the BBC catalogue is relatively small, although there are no figures available on its size, and it has been extensively hand-curated by experts over the years. The ratio of metadata to data is probably different too: a typical biological database might have lots of data (e.g. raw protein sequence data) but metadata (interactions, structures, functions etc.) that is poor in quality and low in quantity.
However, this catalogue is an interesting prototype, which is addictively fun to play with and might spark a few imaginations in the bioinformatics community.
[update: seeAlso Alf Eaton’s visual TouchGraph of BBC TV/Radio Collaborators, which allows you to browse this data more graphically. Unfortunately, this fantastic BBC database is not always online. This post was originally published on nodalpoint, with comments.]