O'Really?

January 18, 2013

How to export, delete and move your Mendeley account and library #mendelete


Delete. Creative Commons licensed picture by Vitor Sá – Virgu via Flickr.com

News that Reed Elsevier is in talks to buy Mendeley.com will have many scientists reaching for their “delete account” button. Mendeley has built an impressive user-base of scientists and other academics since they started, but the possibility of an Elsevier takeover has worried some of its users. Elsevier has a strained relationship with some groups in the scientific community [1,2], so it will be interesting to see how this plays out.

If you’ve built a personal library of scientific papers in Mendeley, you won’t want to just delete all the data. You’ll need to export your library first, then delete your account, and then import the library into a different tool.

Disclaimer: I’m not advocating that you delete your Mendeley account (aka #mendelete), just that if you do decide to, here’s how to do it, and some alternatives to consider. Update April 2013: it wasn’t just a rumour.

Exporting your Mendeley library

Open up Mendeley Desktop and, on the File menu, select Export. You have a choice of three export formats:

  1. BibTeX (*.bib)
  2. RIS – Research Information Systems (*.ris)
  3. EndNote XML (*.xml)

It is probably best to create a backup in all three formats, just in case, as this will give you more options for importing into whatever you replace Mendeley with. Another possibility is to use the Mendeley API to export your data, which will give you more control over how and what you export, or to trawl through the Mendeley forums for alternatives. [update: see also comments below from William Gunn on exporting via your local SQLite cache]
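Before deleting anything, it is worth checking that an export actually contains what you expect. The following is a minimal sketch in Python, not part of Mendeley or its API, that reads RIS text and counts the records in it. It assumes the standard RIS layout, in which each record starts with a `TY` line and ends with an `ER` line; the function name `parse_ris` and the file name `library.ris` are made up for illustration.

```python
import re

# A RIS line is a two-character tag, two spaces, a hyphen, then the value,
# for example "TY  - JOUR". The value may be empty, as on the closing "ER" line.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  -(?: (.*))?$")


def parse_ris(text):
    """Parse RIS text into a list of records (dicts mapping tag -> list of values)."""
    records = []
    current = None
    for raw in text.splitlines():
        match = RIS_LINE.match(raw.rstrip())
        if not match:
            continue  # skip blank lines and anything that is not a tagged line
        tag = match.group(1)
        value = (match.group(2) or "").strip()
        if tag == "TY":
            current = {"TY": [value]}  # a new record begins
        elif tag == "ER":
            if current is not None:
                records.append(current)  # the record is complete
            current = None
        elif current is not None:
            current.setdefault(tag, []).append(value)  # tags such as AU can repeat
    return records


# A small hand-written sample; for a real export you might instead use
# parse_ris(open("library.ris", encoding="utf-8").read())  # hypothetical file name
sample = """TY  - JOUR
AU  - Hull, D.
AU  - Pettifer, S.
AU  - Kell, D.
TI  - Defrosting the Digital Library: Bibliographic Tools for the Next Generation Web
PY  - 2008
DO  - 10.1371/journal.pcbi.1000204
ER  -
"""

library = parse_ris(sample)
print(len(library), "record(s);", len(library[0]["AU"]), "author(s) in the first")
```

If the number of records does not match the number of documents shown in Mendeley Desktop, that is a good reason to look more closely at the export before pressing delete.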

Deleting your Mendeley account #mendelete

Log in to Mendeley.com, click on the My Account button (top right), select Account details from the drop-down menu, scroll down to the bottom of the page and click on the link delete your account. You’ll see the message “We’re sorry you want to go, but if you must…”, which you can either cancel or confirm by selecting Delete my account and all my data. [update] To completely delete your account you’ll need to send an email to privacy at mendeley dot com. (Thanks P.Chris for pointing this out in the comments below)

Alternatives to Mendeley

Once you have exported your data, you’ll need an alternative to import it into. Fortunately, there are quite a few to choose from [3], some of which are shown in the list below. This is not a comprehensive list, so please add suggestions below in the comments if I missed any obvious ones. Wikipedia has an extensive article comparing the different reference management software, which is quite handy (if slightly bewildering). Otherwise you might consider trying the following software:

One last alternative: if you are fed up with trying to manage all those clunky PDF files, you could just switch to Google Scholar, which is getting better all the time. If you decide that Mendeley isn’t your cup of tea, now might be a good time to investigate some alternatives, and there are plenty of good candidates to choose from. But beware, you may run from the arms of one large publisher (Elsevier) into the arms of another (Springer or Macmillan, which own Papers and ReadCube respectively).

References

  1. Whitfield, J. (2012). Elsevier boycott gathers pace. Nature. DOI: 10.1038/nature.2012.10010
  2. Van Noorden, R. (2013). Mathematicians aim to take publishers out of publishing. Nature. DOI: 10.1038/nature.2013.12243
  3. Hull, D., Pettifer, S., & Kell, D. (2008). Defrosting the Digital Library: Bibliographic Tools for the Next Generation Web. PLoS Computational Biology, 4 (10). DOI: 10.1371/journal.pcbi.1000204
  4. Attwood, T., Kell, D., McDermott, P., Marsh, J., Pettifer, S., & Thorne, D. (2010). Utopia documents: linking scholarly literature with research data. Bioinformatics, 26 (18). DOI: 10.1093/bioinformatics/btq383

July 15, 2010

How many journal articles have been published (ever)?

Fifty Million and Fifty Billion by ZeroOne

According to some estimates, there are fifty million articles in existence as of 2010. Picture of a fifty million dollar note by ZeroOne on Flickr.

Earlier this year, the scientific journal PLoS ONE published their 10,000th article. Ten thousand articles is a lot of papers especially when you consider that PLoS ONE only started publishing four short years ago in 2006. But scientists have been publishing in journals for at least 350 years [1] so it might make you wonder, how many articles have been published in scientific and learned journals since time began?

PubMed Central, a full-text archive of journals freely available to all, currently holds over 1.7 million articles. But these articles are only a tiny fraction of the total literature, since a lot of the rest is locked up behind publishers’ paywalls and is inaccessible to many people. (more…)

June 2, 2009

Who Are You? Digital Identity in Science

The Who by The Who

The organisers of the Science Online London 2009 conference are asking people to propose their own session ideas (see some examples here), so here is a proposal:

Title: Who Are You? Digital Identity in Science

Many important decisions in Science are based on identifying scientists and their contributions. From selecting reviewers for grants and publications, to attributing published data and deciding who is funded, hired or promoted, digital identity is at the heart of Science on the Web.

Despite the importance of digital identity, identifying scientists online is an unsolved problem [1]. Consequently, a significant amount of scientific and scholarly work is not easily cited or credited, especially digital contributions: from blogs and wikis, to source code, databases and traditional peer-reviewed publications on the Web. This (proposed) session will look at current mechanisms for identifying scientists digitally including contributor-id (CrossRef), researcher-id (Thomson), Scopus Author ID (Elsevier), OpenID, Google Scholar [2], Single Sign On, PubMed, FOAF+SSL, LinkedIn, Shared Identifiers (URIs) and the rest. We will introduce and discuss each via a SWOT analysis (Strengths, Weaknesses, Opportunities and Threats). Is digital identity even possible and ethical? Besides the obvious benefits of persistent, reliable and unique identifiers, what are the privacy and security issues with personal digital identity?

If this is a successful proposal, I’ll need some help. Any offers? If you are interested in joining in the fun, more details are at scienceonlinelondon.org

References

  1. Bourne, P., & Fink, J. (2008). I Am Not a Scientist, I Am a Number PLoS Computational Biology, 4 (12) DOI: 10.1371/journal.pcbi.1000247
  2. Various Publications about unique author identifiers bookmarked in citeulike
  3. Yours Truly (2009) Google thinks I’m Maurice Wilkins
  4. The Who (1978) Who Are You? Who, who, who, who? (Thanks to Jan Aerts for the reference!)

Michael Ley on Digital Bibliographies


Michael Ley is visiting Manchester this week and will be giving a seminar on Wednesday 3rd June. Here are some details for anyone who is interested in attending:

Date: 3rd Jun 2009

Title: DBLP: How the data get in

Speaker: Dr Michael Ley. University of Trier, Germany

Time & Location: 14:15, Lecture Theatre 1.4, Kilburn Building

Abstract: The DBLP (Digital Bibliography & Library Project) Computer Science Bibliography now includes more than 1.2 million bibliographic records. For Computer Science researchers the DBLP web site is now a popular tool to trace the work of colleagues and to retrieve bibliographic details when composing the lists of references for new papers. Ranking and profiling of persons, institutions, journals, or conferences is another usage of DBLP. Many scientists are aware of this and want their publications to be listed as completely as possible.

The talk focuses on the data acquisition workflow for DBLP. To get ‘clean’ basic bibliographic information for scientific publications remains a chaotic puzzle.

Large publishers are either not interested in cooperating with open services like DBLP, or their policy is very inconsistent. In most cases they are not able or not willing to deliver the basic data required for DBLP in a direct way, but they encourage us to crawl their Web sites. This indirection has two main problems:

  1. The organisation and appearance of Web sites change from time to time, which forces a reimplementation of information extraction scripts. [1]
  2. In many cases manual steps are necessary to get ‘complete’ bibliographic information.

For many small information sources it is not worthwhile to develop information extraction scripts, so data acquisition is done manually. There is an amazing variety of small but interesting journals, conferences and workshops in Computer Science which are not under the umbrella of ACM, IEEE, Springer, Elsevier etc. How they get in is often decided very pragmatically.

The goal of the talk and my visit to Manchester is to start a discussion process: the EasyChair conference management system developed by Andrei Voronkov and DBLP are both parts of the scientific publication workflow. Should they be connected for mutual benefit?

References

  1. Lincoln Stein (2002). Creating a bioinformatics nation: screen scraping is torture Nature, 417 (6885), 119-120 DOI: 10.1038/417119a

March 16, 2009

March 12, 2009

Defrosting the Digital Seminar

The Lecture by James M Thorne

Casey Bergman suggested it, Jean-Marc Schwartz organised it, so now I’m going to do it: a seminar on our Defrosting the Digital Library paper as part of the Bioinformatics and Functional Genomics seminar series. Here is the abstract of the talk:

After centuries with little change, scientific libraries have recently experienced massive upheaval. From being almost entirely paper-based, most libraries are now almost completely digital. This information revolution has all happened in less than 20 years and has created many novel opportunities and threats for scientists, publishers and libraries.

Today, we are struggling with an embarrassing wealth of digital knowledge on the Web. Most scientists access this knowledge through some kind of digital library; however, these can be cold, impersonal, isolated, and inaccessible places. Many libraries are still clinging to obsolete models of identity, attribution, contribution, citation and publication.

Based on a review published in PLoS Computational Biology (http://pubmed.gov/18974831), this talk will discuss the current chilly state of digital libraries for biologists, chemists and informaticians, including PubMed and Google Scholar. We highlight problems and solutions to the coupling and decoupling of publication data and metadata, with a tool called http://www.citeulike.org. This software tool exploits the Web to make digital libraries “warmer”: more personal, sociable, integrated, and accessible places.

Finally, issues that will help or hinder the continued warming of libraries in the future, particularly the accurate identity of authors and their publications, are briefly introduced. These are discussed in the context of the BBSRC funded REFINE project, at the National Centre for Text Mining (NaCTeM.ac.uk), which is linking biochemical pathway data with evidence for pathways from the PubMed database.

Date: Monday 16th March 2009, Time: 12.00 midday, Location: Michael Smith Building, Main lecture theatre, Faculty of Life Sciences, University of Manchester (number 71 on the Google map of the Manchester campus). Please come along if you are interested…

[CC licensed picture above, "The Lecture" at Speakers Corner by James M Thorne]

February 20, 2009

Mistaken Identity: Google thinks I’m Maurice Wilkins

Who's afraid of Google?

In a curious case of mistaken identity, Google seems to think I’m Maurice Wilkins. Here is how. If you Google the words DNA and mania (google.com/search?q=dna+mania) one of the first results is a tongue-in-cheek article I wrote two years ago about our obsession with Deoxyribonucleic Acid. Now Google (or more precisely Googlebot) seems to think this article is written by one M Wilkins. That’s M Wilkins as in the physicist Maurice Wilkins, the third man of the double helix (after Watson and Crick) and Nobel prize winner back in ’62. How could such a silly (but amusing) mistake be made? Because the article is about what Wilkins once said, but not actually by Wilkins. Computers can’t tell the difference between these two things. Unsurprisingly, Google Scholar has been known for some time to contain many other mistaken author identities like this. Scholar even thinks there is an author called Professor Forgotten Password (a prolific author who has been widely cited in many fields)!

The other curiosity is that the original post on nodalpoint.org is also counted as a citation in Google Scholar. It’s a bit of a mystery how Scholar actually works, what it includes (and excludes) and how big it is, but you’ll find the article counted as a proper citation for a book about genes. Scientific spammers must be licking their lips at the opportunity to influence results and citation counts with humble blog posts, rather than more kosher articles in peer-reviewed scientific journals.

So what does all this curious interweb mischief tell us?

  1. Identifying people on the web is a tricky business, more complex than most people think
  2. Googlebot needs to have its algowithms tweaked by those Google Scholars at the Googleplex. Not really surprising, what else did you expect from Beta software? (P.S. Googlebot, when you read this, I’m not Maurice Wilkins, that’s not my name. I haven’t won a Nobel prize either.  I’m sort of flattered that you’ve mistaken me for such a distinguished scientist, so I’ll enjoy my alternative identity while it lasts.)
  3. Blogs are increasingly part of the scientific conversation and are counted in various bibliometrics. Will Google Scholar (and the rest) start indexing other blogs too? Where will this trend leave more conventional bibliometrics like the impact factor?
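To make the first point a little more concrete, here is a minimal sketch in Python of one reason why names alone make poor identifiers. It illustrates a related problem, not the actual mistake Googlebot made above: once names are shortened to an initial and a surname, as many citation styles do, different people collapse into the same string. The functions `name_key` and `find_collisions` are hypothetical, and "Mark Wilkins" and "Jane Doe" are made-up names used only for illustration.

```python
from collections import defaultdict


def name_key(full_name):
    """Reduce a name to 'first initial + surname', as many citation styles do."""
    parts = full_name.split()
    return f"{parts[0][0].upper()} {parts[-1].title()}"


def find_collisions(full_names):
    """Group full names by abbreviated key; keep only keys shared by several people."""
    groups = defaultdict(set)
    for name in full_names:
        groups[name_key(name)].add(name)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# "Mark Wilkins" and "Jane Doe" are invented names, for illustration only.
authors = ["Maurice Wilkins", "Mark Wilkins", "Jane Doe"]
print(find_collisions(authors))
```

Anything indexed under "M Wilkins" could belong to either person, which is why unique identifiers that do not depend on the name itself keep being proposed.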

(Note: These search results were correct at the time of writing, but may change over time, results preserved for posterity on flickr)

References

  1. Maurice Wilkins (2003) The Third Man of the Double Helix: The Autobiography of Maurice Wilkins isbn:0198606656
  2. Péter Jacsó (2008) Savvy searching – Google Scholar revisited. Online Information Review 32: 102-11 DOI:10.1108/14684520810866010 (see also Defrosting the Digital Library)
  3. Douglas Kell (2008) What’s in a name? Guest, ghost and indeed quite imaginary authorships BBSRC blogs
  4. Neil R. Smalheiser and Vetle I. Torvik Author Name Disambiguation (This is a preprint version of a chapter published in Volume 43 (2009) of the Annual Review of Information Science and Technology (ARIST) (B. Cronin, Ed.) which is available from the publisher Information Today, Inc (http://books.infotoday.com/asist/#arist).)
  5. Duncan Hull (2007) DNA mania. Nodalpoint.org
  6. Jules De Martino and Katie White (2008) That’s not my name (video)

October 31, 2008

Defrosting the Digital Library

Bibliographic Tools for the Next Generation Web

Sunset Ice Sculptures by Mark K.

We started writing this paper [1] over a year ago, so it’s great to see it finally published today. Here is the abstract:

“Many scientists now manage the bulk of their bibliographic information electronically, thereby organizing their publications and citation material from digital libraries. However, a library has been described as “thought in cold storage,” and unfortunately many digital libraries can be cold, impersonal, isolated, and inaccessible places. In this Review, we discuss the current chilly state of digital libraries for the computational biologist, including PubMed, IEEE Xplore, the ACM digital library, ISI Web of Knowledge, Scopus, Citeseer, arXiv, DBLP, and Google Scholar. We illustrate the current process of using these libraries with a typical workflow, and highlight problems with managing data and metadata using URIs. We then examine a range of new applications such as Zotero, Mendeley, Mekentosj Papers, MyNCBI, CiteULike, Connotea, and HubMed that exploit the Web to make these digital libraries more personal, sociable, integrated, and accessible places. We conclude with how these applications may begin to help achieve a digital defrost, and discuss some of the issues that will help or hinder this in terms of making libraries on the Web warmer places in the future, becoming resources that are considerably more useful to both humans and machines.”

Biotechnology and Biological Sciences Research Council

Thanks to Kevin Emamy, Richard Cameron, Martin Flack, and Ian Mulvany for answering questions on the CiteULike and Connotea mailing lists; and Greg Tyrelle for ongoing discussion about metadata and the semantic Web at nodalpoint.org. Also thanks to Timo Hannay and Tim O’Reilly for an invitation to scifoo, where some of the issues described in this publication were discussed. Last but not least, thanks to Douglas Kell and Steve Pettifer for helping me write it and the BBSRC for funding it (grant code BB/E004431/1 REFINE project). We hope it is a useful review, and that you enjoy reading it as much as we enjoyed writing it.

References

  1. Duncan Hull, Steve Pettifer and Douglas B. Kell (2008). Defrosting the digital library: Bibliographic tools for the next generation web. PLoS Computational Biology, 4(10):e1000204+. DOI:10.1371/journal.pcbi.1000204, pmid:18974831, pmcid:2568856, citeulike:3467077
  2. Also mentioned (in no particular order) by NCESS, Wowter, Twine, Stephen Abram, Rod Page, Digital Koans, Twitter, Bora Zivkovic, Digg, reddit, Library Intelligencer, OpenHelix, Delicious, friendfeed, Dr. Shock, GribbleLab, Nature Blogs, Ben Good, Rafael Sidi, Scholarship 2.0, Subio, up2date, SecondBrain, Hubmed, BusinessExchange, CiteGeist, Connotea and Google

[Sunrise Ice Sculptures picture from Mark K.]
