April 30, 2010

Daniel Cohen on The Social Life of Digital Libraries

Daniel Cohen is giving a talk in Cambridge today on The Social Life of Digital Libraries. The abstract is below:

The digitization of libraries had a clear initial goal: to permit anyone to read the contents of collections anywhere and anytime. But universal access is only the beginning of what may happen to libraries and researchers in the digital age. Because machines as well as humans have access to the same online collections, a complex web of interactions is emerging. Digital libraries are now engaging in online relationships with other libraries, with scholars, and with software, often without the knowledge of those who maintain the libraries, and in unexpected ways. These digital relationships open new avenues for discovery, analysis, and collaboration.

Daniel J. Cohen is an Associate Professor at George Mason University and has been involved in the development of the Zotero extension for the Firefox browser that enables users to manage bibliographic data while doing online research. Zotero [1] is one of many new tools [2] that are attempting to add a social dimension to scholarly information on the Web, so this should be an interesting talk.

If you’d like to come, the talk starts at 6pm in Clare College, Cambridge, and you need to RSVP by email via the talks.cam.ac.uk page.


  1. Cohen, D.J. (2008). Creating scholarly tools and resources for the digital ecosystem: Building connections in the Zotero project. First Monday 13 (8)
  2. Hull, D., Pettifer, S., & Kell, D. (2008). Defrosting the Digital Library: Bibliographic Tools for the Next Generation Web. PLoS Computational Biology, 4 (10). DOI: 10.1371/journal.pcbi.1000204

June 2, 2009

Michael Ley on Digital Bibliographies

Michael Ley is visiting Manchester this week and will be giving a seminar on Wednesday 3rd June. Here are some details for anyone who is interested in attending:

Date: 3rd Jun 2009

Title: DBLP: How the data get in

Speaker: Dr Michael Ley. University of Trier, Germany

Time & Location: 14:15, Lecture Theatre 1.4, Kilburn Building

Abstract: The DBLP (Digital Bibliography & Library Project) Computer Science Bibliography now includes more than 1.2 million bibliographic records. For Computer Science researchers the DBLP web site is now a popular tool to trace the work of colleagues and to retrieve bibliographic details when composing the lists of references for new papers. Ranking and profiling of persons, institutions, journals, or conferences is another use of DBLP. Many scientists are aware of this and want their publications to be listed as completely as possible.

The talk focuses on the data acquisition workflow for DBLP. Getting ‘clean’ basic bibliographic information for scientific publications remains a chaotic puzzle.

Large publishers are either not interested in cooperating with open services like DBLP, or their policy is very inconsistent. In most cases they are not able or not willing to deliver the basic data required for DBLP directly, but they encourage us to crawl their Web sites. This indirection has two main problems:

  1. The organisation and appearance of Web sites change from time to time, which forces a reimplementation of information extraction scripts. [1]
  2. In many cases manual steps are necessary to get ‘complete’ bibliographic information.

For many small information sources it is not worthwhile to develop information extraction scripts, so data acquisition is done manually. There is an amazing variety of small but interesting journals, conferences and workshops in Computer Science which are not under the umbrella of ACM, IEEE, Springer, Elsevier etc. How they get in is often decided very pragmatically.

The goal of the talk and my visit to Manchester is to start a discussion process: the EasyChair conference management system developed by Andrei Voronkov and DBLP are both parts of the scientific publication workflow. Should they be connected for mutual benefit?


  1. Stein, L. (2002). Creating a bioinformatics nation: screen scraping is torture. Nature, 417 (6885), 119-120. DOI: 10.1038/417119a
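To make the first problem in the abstract a bit more concrete, here is a minimal sketch of how an extraction script can silently stop working after a site redesign. It is only an illustration, not DBLP's actual code: the HTML snippets, the class names `title` and `article-title`, and the helper `extract_titles` are all hypothetical, and only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collect the text of every <span> with a given class (hypothetical markup)."""

    def __init__(self, css_class="title"):
        super().__init__()
        self.css_class = css_class
        self.in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # Start collecting text when we see <span class="...matching class...">
        if tag == "span" and dict(attrs).get("class") == self.css_class:
            self.in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.titles[-1] += data


def extract_titles(html, css_class="title"):
    """Return the stripped text of all matching spans in the page."""
    parser = TitleExtractor(css_class)
    parser.feed(html)
    parser.close()
    return [t.strip() for t in parser.titles]


# Hypothetical table-of-contents markup before and after a redesign.
OLD_PAGE = '<li><span class="title">Paper A</span> <span class="pages">1-10</span></li>'
NEW_PAGE = '<li><span class="article-title">Paper A</span> <span class="pages">1-10</span></li>'

print(extract_titles(OLD_PAGE))                    # the title is found
print(extract_titles(NEW_PAGE))                    # nothing is found after the redesign
print(extract_titles(NEW_PAGE, "article-title"))   # found again once the script is updated
```

Note that the script does not crash on the redesigned page; it just returns an empty list, which is part of why this kind of breakage tends to need a human to notice and fix it.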

May 19, 2009

Defrosting the John Rylands University Library

Filed under: seminars — Duncan Hull @ 4:14 pm

For anyone who missed the original bioinformatics seminar, I’ll be doing a repeat of the “Defrosting the Digital Library” talk, this time for the staff in the John Rylands University Library (JRUL). This is the main academic library in Manchester with (quote) “more than 4 million printed books and manuscripts, over 41,000 electronic journals and 500,000 electronic books, as well as several hundred databases, the John Rylands University Library is one of the best-resourced academic libraries in the country.” The journal subscription budget of the library is currently around £4 million per year, and that’s before they’ve even bought any books! Here is the abstract for the talk:

After centuries with little change, scientific libraries have recently experienced massive upheaval. From being almost entirely paper-based, most libraries are now almost completely digital. This information revolution has all happened in less than 20 years and has created many novel opportunities and threats for scientists, publishers and libraries.

Today, we are struggling with an embarrassing wealth of digital knowledge on the Web. Most scientists access this knowledge through some kind of digital library; however, these can be cold, impersonal, isolated, and inaccessible places. Many libraries are still clinging to obsolete models of identity, attribution, contribution, citation and publication.

Based on a review published in PLoS Computational Biology (pubmed.gov/18974831), this talk will discuss the current chilly state of digital libraries for biologists, chemists and informaticians, including PubMed and Google Scholar. We highlight problems and solutions to the coupling and decoupling of publication data and metadata, with a tool called citeulike.org. This software tool (and many other tools just like it) exploits the Web to make digital libraries “warmer”: more personal, sociable, integrated, and accessible places.

Finally, issues that will help or hinder the continued warming of libraries in the future, particularly the accurate identification of authors and their publications, are briefly introduced. These are discussed in the context of the BBSRC-funded REFINE project at the National Centre for Text Mining (NaCTeM.ac.uk), which is linking biochemical pathway data with evidence for pathways from the PubMed database.

Date: Thursday 21st May 2009, Time: 13.00, Location: John Rylands University (Main) Library Oxford Road, Parkinson Room (inside main entrance, first on right) University of Manchester (number 55 on google map of the Manchester campus). Please come along if you are interested…


  1. Hull, D., Pettifer, S., & Kell, D. (2008). Defrosting the Digital Library: Bibliographic Tools for the Next Generation Web. PLoS Computational Biology, 4 (10). DOI: 10.1371/journal.pcbi.1000204

[CC licensed picture, the John Rylands Library on Deansgate by dpicker: David Picker, http://www.flickr.com/photos/dpicker/3107856991/]
