Casey Bergman | O'Really?

September 9, 2014

Punning with the Pub in PubMed: Are there any decent NCBI puns left? #PubMedPuns

Filed under: data mining,Googleology,paperware,publishing,Science,technology — Duncan Hull @ 10:31 am
Tags: Casey Bergman, defrosting, Elizabeth Gibney, gastropub, google scholar, Johanna McEntyre, Karsten Hokamp, Ken Wolfe, Leo Chalupa, Mark Gerstein, Phil Bourne, portmanteau, pubbit, Pubble, PubBrawl, pubby, PubCast, pubchase, pubclean, PubCrawl, pubcrawler, pubfetch, PubFig, PubFight, pubgames, publican, PubLick, PubLons, publunch, Publy, PubManteau, PubMatch, pubmed, PubMedication, PubMine, pubnet, pubpeer, PubQuiz, PubSCIENCE, pubsearch, PubSnacks, PubSnax, PubSoft, PubSort, Pubsy, Richard van Noorden, RSS, text mining, twitter, twitterbot

PubMedication: do you get your best ideas in the Pub? CC-BY-ND image via trombone65 on Flickr.

Many people claim they get all their best ideas in the pub, but for lots of scientists their best ideas probably come from PubMed.gov – the NCBI’s monster database of biomedical literature. Consequently, the database has spawned a whole slew of tools that riff on the PubMed name with puns and portmanteaus (aka “PubManteaus”), and pub-based wordplay is especially common. [1,2]

All of this might make you wonder: are there any decent PubMed puns left? Here’s an incomplete collection:

PubCrawler pubcrawler.ie “goes to the library while you go to the pub…” [3,4]
PubChase pubchase.com is a “life sciences and medical literature recommendations engine. Search smarter, organize, and discover the articles most important to you.” [5]
PubCast scivee.tv/pubcasts allow users to “enliven articles and help drive more views” (to PubMed) [6]
PubFig nothing to do with PubMed, but research done on face and image recognition that happens to be indexed by PubMed. [7]
PubGet pubget.com is a “comprehensive source for science PDFs, including everything you’d find in Medline.” [8]
PubLons publons.com OK, not much to do with PubMed directly, but PubLons helps you “record, showcase, and verify all your peer review activity.”
PubMine “supports intelligent knowledge discovery” [9]
PubNet pubnet.gersteinlab.org is a “web-based tool that extracts several types of relationships returned by PubMed queries and maps them into networks” aka a publication network graph utility. [10]
GastroPub repackages and re-sells ordinary PubMed content disguised as high-end luxury data at a higher premium, similar to a Gastropub.
PubQuiz is either the new name for NCBI database search www.ncbi.nlm.nih.gov/gquery or a quiz where you’re only allowed to use PubMed to answer questions.
PubSearch & PubFetch allow users to “store literature, keyword, and gene information in a relational database, index the literature with keywords and gene names, and provide a Web user interface for annotating the genes from experimental data found in the associated literature” [11]
PubScience is either “peer-reviewed drinking” courtesy of pubsci.co.uk or an ambitious publishing project tragically axed by the U.S. Department of Energy (DoE). [12,13]
PubSub is anything that makes use of the publish–subscribe pattern, such as NCBI feeds. [14]
PubLick as far as I can see, hasn’t been used yet, unless you count this @publick on twitter. If anyone was launching a startup, working in the area of “licking” the tastiest data out of PubMed, that could be a great name for their data-mining business. Alternatively, it could be a catchy new nickname for PubMedCentral (PMC) or Europe PubMedCentral (EuropePMC) [15] – names which don’t exactly trip off the tongue. Since PMC is a free digital archive of publicly accessible full-text scholarly articles, PubLick seems like an appropriate moniker.
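For anyone who hasn’t met the publish-subscribe pattern mentioned under PubSub, here is a minimal in-memory sketch in Python. Everything in it is hypothetical and purely illustrative: the `PubSub` class, the topic string and the message are made up for this example, and none of it reflects how NCBI feeds are actually implemented.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class PubSub:
    """Tiny in-memory publish-subscribe broker (hypothetical, illustration only)."""

    def __init__(self) -> None:
        # Maps each topic name to the callbacks subscribed to it.
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        """Register a callback to be invoked for every message on a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> int:
        """Deliver a message to all subscribers of a topic; return how many got it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = PubSub()
inbox: List[str] = []

# A reader subscribes to a (made-up) saved-search topic...
broker.subscribe("pubmed:text mining", inbox.append)

# ...and the publisher pushes a new item without knowing who is listening.
delivered = broker.publish("pubmed:text mining", "PMID 21245076")
```

The point of the pattern is the decoupling: the publisher only knows about topics, not readers, which is roughly what an RSS feed of a saved search gives you while you are down the pub.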

PubLick Cat got all the PubMed cream. CC-BY image via dizznbonn on flickr.

There are probably lots more PubMed puns and portmanteaus out there just waiting to be used. Pubby, Pubsy, PubLican, Pubble, Pubbit, Publy, PubSoft, PubSort, PubBrawl, PubMatch, PubGames, PubGuide, PubWisdom, PubTalk, PubChat, PubShare, PubGrub, PubSnacks and PubLunch could all work. If you know of any other decent (or dodgy) PubMed puns, leave them in the comments below and go and build a scientific twitterbot or cool tool using the same name, if you haven’t already.

References

1. Lu Z. (2011). PubMed and beyond: a survey of web tools for searching biomedical literature. Database: The Journal of Biological Databases and Curation, http://pubmed.gov/21245076
2. Hull D., Pettifer S.R. & Kell D.B. (2008). Defrosting the digital library: bibliographic tools for the next generation web. PLOS Computational Biology, http://pubmed.gov/18974831
3. Hokamp K. & Wolfe K.H. (2004). PubCrawler: keeping up comfortably with PubMed and GenBank. Nucleic Acids Research, http://pubmed.gov/15215341
4. Hokamp K. & Wolfe K. (1999). What’s new in the library? What’s new in GenBank? let PubCrawler tell you. Trends in Genetics, http://pubmed.gov/10529811
5. Gibney E. (2014). How to tame the flood of literature. Nature, 513 (7516) http://pubmed.gov/25186906
6. Bourne P. & Chalupa L. (2008). A new approach to scientific dissemination. Materials Today, 11 (6) 48-48. DOI:10.1016/s1369-7021(08)70131-7
7. Kumar N., Berg A., Belhumeur P.N. & Nayar S. (2011). Describable Visual Attributes for Face Verification and Image Search. IEEE Transactions on Pattern Analysis and Machine Intelligence, http://pubmed.gov/21383395
8. Featherstone R. & Hersey D. (2010). The quest for full text: an in-depth examination of Pubget for medical searchers. Medical Reference Services Quarterly, 29 (4) 307-319. http://pubmed.gov/21058175
9. Kim T.K., Wan-Sup Cho, Gun Hwan Ko, Sanghyuk Lee & Bo Kyeng Hou (2011). PubMine: An Ontology-Based Text Mining System for Deducing Relationships among Biological Entities. Interdisciplinary Bio Central, 3 (2) 1-6. DOI:10.4051/ibc.2011.3.2.0007
10. Douglas S.M., Montelione G.T. & Gerstein M. (2005). PubNet: a flexible system for visualizing literature derived networks. Genome Biology, http://pubmed.gov/16168087
11. Yoo D., Xu I., Berardini T.Z., Rhee S.Y., Narayanasamy V. & Twigger S. (2006). PubSearch and PubFetch: a simple management system for semiautomated retrieval and annotation of biological information from the literature. Current Protocols in Bioinformatics, http://pubmed.gov/18428773
12. Seife C. (2002). Electronic publishing. DOE cites competition in killing PubSCIENCE. Science (New York, N.Y.), 297 (5585) 1257-1259. http://pubmed.gov/12193762
13. Jensen M. (2003). Another loss in the privatisation war: PubScience. Lancet, 361 (9354) 274. http://pubmed.gov/12559859
14. Dubuque E.M. (2011). Automating academic literature searches with RSS Feeds and Google Reader(™). Behavior Analysis in Practice, 4 (1) http://pubmed.gov/22532905
15. McEntyre J.R., Ananiadou S., Andrews S., Black W.J., Boulderstone R., Buttery P., Chaplin D., Chevuru S., Cobley N., Coleman L.A. et al. (2010). UKPMC: a full text article resource for the life sciences. Nucleic Acids Research, http://pubmed.gov/21062818

http://twitter.com/Richvn/status/509370496375619584

@dullhunk @McDawg @PubChase @Publick @pubget @Richvn @LizzieGibney @NatureNews nobody written PubCrawl yet? Shame

— Bob O'Hara (@BobOHara) September 9, 2014

http://twitter.com/i_am_kilpatrick/status/509275423415738368

February 15, 2012

The Open Access Irony Awards: Naming and shaming them

Filed under: data mining,publishing,Science,technology — Duncan Hull @ 11:23 am
Tags: BioMed Central, Bruce Alberts, cameron neylon, Casey Bergman, citeulike, DOAJ, Eefke Smit, Elias Zerhouni, Elsevier, Evilsevier, flickr, foldit, hypocrisy, irony, Jihyun Kim, Jocelyn Kaiser, Joe Dunckley, Jonathan Eisen, Josh Sommer, Keith Epstein, Macmillan, Mark Walport, Matthew Cockerill, Mendeley, NIH, Open Access, PLoS, Sage Bionetworks, Springer, Stephen Curry, Stephen Friend, Wiley-Blackwell

Open Access (OA) publishing aims to make the results of scientific research available to the widest possible audience. Scientific papers that are published in Open Access journals are freely available for crucial data mining and for anyone or anything to read, wherever they may be.

In the last ten years, the Open Access movement has made huge progress in allowing:

“any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers.”

But there is still a long way to go, as much of the world’s scientific knowledge remains locked up behind publishers’ paywalls, unavailable for re-use by text-mining software and inaccessible to the public, who often funded the research through taxation.

Openly ironic?

Ironically, some of the papers that are inaccessible discuss or even champion the very Open Access movement itself. Sometimes the lack of access is deliberate, other times accidental – but the consequences are serious. Whether deliberate or accidental, restricted access to public scientific knowledge is slowing scientific progress [1]. Sometimes the best way to make a serious point is to have a laugh and joke about it. This is what the Open Access Irony Awards do: by gathering all the offenders in one place, we can laugh and make a serious point at the same time by naming and shaming the papers in question.

To get the ball rolling, here are some examples:

The Lancet, owned by Evilsevier (sorry, I mean Elsevier), recently published a paper on “the case for open data” [2] (please login to access article). Login?! Not very open…
Serial offender and über-journal Science has an article by Elias Zerhouni on the NIH public access policy [3] (Subscribe/Join AAAS to View Full Text), another on “making data maximally available” [4] (Subscribe/Join AAAS to View Full Text) and another on a high profile advocate of open science [5] (Buy Access to This Article to View Full Text). Irony of ironies.
From Nature Publishing Group comes a fascinating paper about harnessing the wisdom of the crowds to predict protein structures [6]. Not only have members of the tax-paying public funded this work, they actually did some of the work too! But unfortunately they have to pay to see the paper describing their results. Ironic? Also, another paper published in Nature Medicine proclaims the “delay in sharing research data is costing lives” [1] (instant access only $32!)
From the British Medical Journal (BMJ) comes the worrying news of dodgy American laws that will lock up valuable scientific data behind paywalls [7] (please subscribe or pay below). Ironic? *
The “green” road to Open Access publishing involves authors uploading their manuscript to self-archive the data in some kind of public repository. But there are many social, political and technical barriers to this, and they have been well documented [8]. You could find out about them in this paper [8], but it appears that the author hasn’t self-archived the paper or taken the “gold” road and published in an Open Access journal. Ironic?
Last, but not least, it would be interesting to know what commercial publishers make of all this text-mining magic in Science [9], but we would have to pay $24 to find out. Ironic?

These are just a small selection from amongst many. If you would like to nominate a paper for an Open Access Irony Award, simply post it to the group on Citeulike or group on Mendeley. Please feel free to start your own group elsewhere if you’re not on Citeulike or Mendeley. The name of this award probably originated from an idea from Jonathan Eisen, picked up by Joe Dunckley and Matthew Cockerill at BioMed Central (see tweet below). So thanks to them for the inspiration.

"The delay in sharing research data is costing lives" @NatureMedicine must win #oa irony award (via @ste http://twitpic.com/297m0v

— Matthew Cockerill (@opentechmatt) July 27, 2010

For added ironic amusement, take a screenshot of the offending article and post it to the Flickr group. Sometimes the shame is too much, and articles are retrospectively made open access so a screenshot will preserve the irony.

Join us in poking fun at the crazy business of academic publishing, while making a serious point about the lack of Open Access to scientific data.

References

1. Sommer, Josh (2010). The delay in sharing research data is costing lives. Nature Medicine, 16 (7), 744-744. DOI: 10.1038/nm0710-744
2. Boulton, G., Rawlins, M., Vallance, P., & Walport, M. (2011). Science as a public enterprise: the case for open data. The Lancet, 377 (9778), 1633-1635. DOI: 10.1016/S0140-6736(11)60647-8
3. Zerhouni, Elias (2004). Information Access: NIH Public Access Policy. Science, 306 (5703), 1895-1895. DOI: 10.1126/science.1106929
4. Hanson, B., Sugden, A., & Alberts, B. (2011). Making Data Maximally Available. Science, 331 (6018), 649-649. DOI: 10.1126/science.1203354
5. Kaiser, Jocelyn (2012). Profile of Stephen Friend at Sage Bionetworks: The Visionary. Science, 335 (6069), 651-653. DOI: 10.1126/science.335.6069.651
6. Cooper, S., Khatib, F., Treuille, A., Barbero, J., Lee, J., Beenen, M., Leaver-Fay, A., Baker, D., Popović, Z., & Foldit players (2010). Predicting protein structures with a multiplayer online game. Nature, 466 (7307), 756-760. DOI: 10.1038/nature09304
7. Epstein, Keith (2012). Scientists are urged to oppose new US legislation that will put studies behind a pay wall. BMJ, 344 (jan17 3). DOI: 10.1136/bmj.e452
8. Kim, Jihyun (2010). Faculty self-archiving: Motivations and barriers. Journal of the American Society for Information Science and Technology. DOI: 10.1002/asi.21336
9. Smit, Eefke, & Van Der Graaf, M. (2012). Journal article mining: the scholarly publishers’ perspective. Learned Publishing, 25 (1), 35-46. DOI: 10.1087/20120106

[CC licensed picture “ask me about open access” by mollyali.]

* Please note, some research articles in BMJ are available by Open Access, but news articles like [7] are not. Thanks to Trish Groves at BMJ for bringing this to my attention after this blog post was published. Also, some “articles” here are in a grey area for open access, particularly “journalistic” stuff like news, editorials and correspondence, as pointed out by Becky Furlong. See tweets below…

MT “@dullhunk Open Access Irony Awards: http://t.co/VlBS4eZ4 @bmcmatt @phylogenomics @PLoS @steinsky #openaccess” <- #BMJ research is OA!

— Trish Groves (@trished) February 19, 2012

@dullhunk I entirely agree with concept of #OA irony but many of the examples were journalistic – perhaps a bit unfair?

— Becky Furlong (@becky_furlong) February 20, 2012

@dullhunk @becky_furlong @trished just don't see why any journal would *choose* to put an editorial about increased access behind a paywall

— Matthew Cockerill (@opentechmatt) February 21, 2012

@dullhunk @becky_furlong @trished it's not that it's inconsistent or morally wrong or anything… It's just, er, ironic…

— Matthew Cockerill (@opentechmatt) February 21, 2012

March 16, 2009

Defrosting the Digital Slideshow

Filed under: biotech,communication,informatics — Duncan Hull @ 3:14 pm
Tags: 2Collab, BBC Monitoring, bibtex, biochemistry, bioinformatics, Casey Bergman, ChEBI, cheminformatics, citeulike, connotea, CSW Informatics Ltd, database, endnote, Ford, google scholar, identity, Institutional Repository, John Chelsom, library, Mavis Cournane, Mekentosj Papers, Mendeley, metadata, Neil Smalheiser, openid, Papyro, Peter Murray-Rust, pubmed, refworks, scopus, text mining, Vetle Torvik

Slides from the seminar today, for those that asked for them. Thanks to everyone who came, we had a good turn out, much better than expected.

Those Library and Institutional Repository people have asked for an encore too…

March 12, 2009

Defrosting the Digital Seminar

Filed under: bio,biotech — Duncan Hull @ 8:37 am
Tags: bbsrc, bioinformatics, Casey Bergman, citeulike, google scholar, Jean-Marc Schwartz, Lecture, life sciences, nactem, pubmed, REFINE, seminar, text mining, University of Manchester

Casey Bergman suggested it, Jean-Marc Schwartz organised it, so now I’m going to do it: a seminar on our Defrosting the Digital Library paper as part of the Bioinformatics and Functional Genomics seminar series. Here is the abstract of the talk:

After centuries with little change, scientific libraries have recently experienced massive upheaval. From being almost entirely paper-based, most libraries are now almost completely digital. This information revolution has all happened in less than 20 years and has created many novel opportunities and threats for scientists, publishers and libraries.

Today, we are struggling with an embarrassing wealth of digital knowledge on the Web. Most scientists access this knowledge through some kind of digital library; however, these can be cold, impersonal, isolated, and inaccessible places. Many libraries are still clinging to obsolete models of identity, attribution, contribution, citation and publication.

Based on a review published in PLoS Computational Biology (http://pubmed.gov/18974831), this talk will discuss the current chilly state of digital libraries for biologists, chemists and informaticians, including PubMed and Google Scholar. We highlight problems and solutions to the coupling and decoupling of publication data and metadata, with a tool called http://www.citeulike.org. This software tool exploits the Web to make digital libraries “warmer”: more personal, sociable, integrated, and accessible places.

Finally, issues that will help or hinder the continued warming of libraries in the future, particularly the accurate identity of authors and their publications, are briefly introduced. These are discussed in the context of the BBSRC funded REFINE project, at the National Centre for Text Mining (NaCTeM.ac.uk), which is linking biochemical pathway data with evidence for pathways from the PubMed database.

Date: Monday 16th March 2009, Time: 12.00 midday, Location: Michael Smith Building, Main lecture theatre, Faculty of Life Sciences, University of Manchester (number 71 on google map of the Manchester campus). Please come along if you are interested…

[CC licensed picture above, “The Lecture” at Speakers Corner by James M Thorne]
