interweb | O'Really?

February 20, 2009

Mistaken Identity: Google thinks I’m Maurice Wilkins

Filed under: funny,google,informatics — Duncan Hull @ 8:35 am
Tags: algowithm, Anurag Acharya, bibliometrics, DNA mania, Double Helix, forgotten password, google scholar, googlebot, identity, impact factor, interweb, Jules De Martino, Katie White, maurice wilkins, Neil Smalheiser, nobel, nodalpoint, Péter Jacsó, The Ting Tings, Vetle Torvik

In a curious case of mistaken identity, Google seems to think I’m Maurice Wilkins. Here is how. If you Google the words DNA and mania (google.com/search?q=dna+mania), one of the first results is a tongue-in-cheek article I wrote two years ago about our obsession with Deoxyribonucleic Acid. Now Google (or more precisely Googlebot) seems to think this article was written by one M Wilkins. That’s M Wilkins as in the physicist Maurice Wilkins, the third man of the double helix (after Watson and Crick) and Nobel prize winner back in ’62. How could such a silly (but amusing) mistake be made? Because the article is about what Wilkins once said, but not actually by Wilkins, and computers can’t reliably tell the difference between these two things. It has been known for some time that Google Scholar has many other mistaken identities for authors like this. Scholar even thinks there is an author called Professor Forgotten Password (a prolific author who has been widely cited in many fields)!

The other curiosity is this: the original post on nodalpoint.org is also counted as a citation in Google Scholar. It’s a bit of a mystery how Scholar actually works, what it includes (and excludes) and how big it is, but you’ll find the article counted as a proper citation for a book about genes. Scientific spammers must be licking their lips at the opportunity to influence results and citation counts with humble blog posts, rather than more kosher articles in peer-reviewed scientific journals.

So what does all this curious interweb mischief tell us?

Identifying people on the web is a tricky business, more complex than most people think.
Googlebot needs to have its algowithms tweaked by those Google Scholars at the Googleplex. Not really surprising: what else did you expect from Beta software? (P.S. Googlebot, when you read this, I’m not Maurice Wilkins, that’s not my name. I haven’t won a Nobel prize either. I’m sort of flattered that you’ve mistaken me for such a distinguished scientist, so I’ll enjoy my alternative identity while it lasts.)
Blogs are increasingly part of the scientific conversation and are counted in various bibliometrics. Will Google Scholar (and the rest) start indexing other blogs too? Where will this trend leave more conventional bibliometrics like the impact factor?

(Note: These search results were correct at the time of writing, but may change over time. The results are preserved for posterity on flickr.)

References

Maurice Wilkins (2003) The Third Man of the Double Helix: The Autobiography of Maurice Wilkins isbn:0198606656
Péter Jacsó (2008) Savvy searching – Google Scholar revisited. Online Information Review 32: 102-11 DOI:10.1108/14684520810866010 (see also Defrosting the Digital Library)
Douglas Kell (2008) What’s in a name? Guest, ghost and indeed quite imaginary authorships BBSRC blogs
Neil R. Smalheiser and Vetle I. Torvik (2009) Author Name Disambiguation. Preprint version of a chapter published in Volume 43 (2009) of the Annual Review of Information Science and Technology (ARIST) (B. Cronin, Ed.), which is available from the publisher Information Today, Inc (http://books.infotoday.com/asist/#arist)
Duncan Hull (2007) DNA mania. Nodalpoint.org
Jules De Martino and Katie White (2008) That’s not my name (video)


February 11, 2009

Janet Street-Porter on the Internet Revolution

Filed under: publishing,Science — Duncan Hull @ 8:40 am
Tags: BBC2, Copyright, interweb, iPlayer, Janet Street-Porter, new media, old media, Open Access, peer review, Pergamon Press, revolution, Robert Maxwell, Rupert Murdoch, vanity journals

I’m not much of a fan of Janet Street-Porter, nor am I a regular viewer of the BBC’s Money Programme, but right now they are screening an interesting series of three half-hour programmes on the impact of the internet on newspapers, books and television. It’s a familiar tale of the power-and-money struggle between old media and new media that, if the first programme is anything to go by, is worth watching. Here is the blurb from the first episode in the series, billed as Media Revolution: Stop Press?

Former national newspaper editor Janet Street-Porter investigates how papers are coping with falling circulation, advertising revenues and the growth of the internet, and asks if newspapers can survive in their current form. In her quest to discover what the future holds for her beloved newspapers, Janet visits newsrooms, printing plants and even spends a morning as a papergirl. With contributions from national editors, advertising gurus and a rare interview with media mogul Rupert Murdoch, Janet examines if papers can survive as new multimedia information giants.

There are some interesting parallels between the changes described in this programme, and scientific media, especially the scientific journal publishing racket.

Scientific Media Revolution?

The story of the current revolution in scientific and technical publishing is perhaps just as interesting as (and more important than) the one being told on the Money Programme. Just think of it: why scientists publish, the emergence of peer review, how Robert Maxwell made his fortune from the Pergamon Press, the impact factor game, the birth of the Web (in a scientific laboratory), the growth of Google, the copyright wars, open-access publishing, social software, the rise and fall of publishing empires (and technology companies), the vanity journals, scientific blogs and wikis, software showdowns, and how all this change affects producers and consumers of science and technology, both now and in the future. A juicy subject, worthy of broadcasting on any medium (old or new). You would need a lot more than three half-hour programmes to cover this particular ongoing epic, so who is going to tell that story?

Anyway, the series is worth a look (if you haven’t already seen it), at least according to me (others disagree; see also no paper is the future). Each episode is also available on iPlayer for up to a week after its first broadcast (Thursday 5th, 12th and 19th February 2009), in the UK only, unless you go through some kind of proxy.