Via the Official Google Research Blog at the University of Google, Alon Halevy, Peter Norvig and Fernando Pereira have published an interesting expert opinion piece in the March/April 2009 edition of IEEE Intelligent Systems: computer.org/intelligent. The paper talks about embracing complexity and making use of “the unreasonable effectiveness of data” [1], drawing analogies with the “unreasonable effectiveness of mathematics” [2]. There is plenty to agree and disagree with in this provocative article, which makes it an entertaining read. So what can we learn from those expert Googlers in the Googleplex?
April 17, 2009
The Unreasonable Effectiveness of Google
April 9, 2009
April 6, 2009
Should We Boycott Amazon (again)?
My first proper full-time job was in the big bad world of scientific publishing, working for a family-run company based in Oxford called Blackwell Science Limited (blacksci.co.uk), which is now part of wiley.com. Consequently, I’ve a few friends and former colleagues who still work in various parts of the publishing industry. Last week one of these friends, who works for a small independent book publisher, sent me an interesting email about Amazon, which I’ve reproduced below (with permission):
This is very unlike me but I am sending a general email out because I am so outraged by something I feel I must share with you. In case you didn’t already know, I work for a small publisher. Times are hard – we all know that. Amazon.co.uk form a large part of our business. Recently they have changed their terms with all of their publishers. For us, and many other small and independent publishers, these new terms are completely unacceptable. We have no say about it and the way they went about it was frankly nasty (they basically sent an email out giving us a week to decide whether to give them more discount or more credit). For bigger publishers it may have a negligible effect but for smaller publishers, where cashflow can mean everything, the effect will be severe! And they have us over a barrel.
Amazon.co.uk so dominate the online market in books that they are almost a monopoly. The discounts we’ve been supplying Amazon for the last few years are outrageous – but what they have done recently is the last straw, and many small publishers could go out of business (luckily I think we’ll survive!). I am so outraged at how they are treating their suppliers that I am now boycotting Amazon for my own personal books and CDs. I have been using them for years and years. The only way to put a bit of healthy competition back into the system is by having more online book retailers become as successful as Amazon. Today we used The Book Depository bookdepository.co.uk for the first time. The books we wanted were all there, in stock and cheaper than Amazon and it was very easy to use. So we’re trying to help spread the word!
Another online retailer is waterstones.com, which separated from Amazon a few years ago due to their unworkable terms. I haven’t used them myself but I hear they are pretty good, and play.com can fulfil your DVD and CD requirements (and all delivery is free I think).
They may not always be as cheap as Amazon but now you know how Amazon get their low prices you may not be as happy to use them – if small, interesting, independent publishers go out of business it’ll just be the biggies left (which will mean much less choice).
So, is the behaviour of Amazon.co.uk just the all too familiar face of capitalism? Or should we boycott Amazon for being a big bully only interested in monopolising the marketplace and getting rid of some healthy competition?
References
- Catherine Neilan (2009) Amazon refused to budge on new terms, Bookseller.com 2009-03-30
- Liz Thomson (2009) Advantage Amazon? Publishers react to proposed new terms Bookbrunch.co.uk 2009-03-26
- Richard Stallman (2001) (Formerly) Boycott Amazon! – GNU Project – Free Software Foundation (FSF) gnu.org
April 2, 2009
March 16, 2009
March 12, 2009
Defrosting the Digital Seminar
Casey Bergman suggested it, Jean-Marc Schwartz organised it, so now I’m going to do it: a seminar on our Defrosting the Digital Library paper as part of the Bioinformatics and Functional Genomics seminar series. Here is the abstract of the talk:
After centuries with little change, scientific libraries have recently experienced massive upheaval. From being almost entirely paper-based, most libraries are now almost completely digital. This information revolution has all happened in less than 20 years and has created many novel opportunities and threats for scientists, publishers and libraries.
Today, we are struggling with an embarrassing wealth of digital knowledge on the Web. Most scientists access this knowledge through some kind of digital library; however, these can be cold, impersonal, isolated and inaccessible places. Many libraries are still clinging to obsolete models of identity, attribution, contribution, citation and publication.
Based on a review published in PLoS Computational Biology (http://pubmed.gov/18974831), this talk will discuss the current chilly state of digital libraries for biologists, chemists and informaticians, including PubMed and Google Scholar. We highlight problems and solutions to the coupling and decoupling of publication data and metadata, with a tool called CiteULike (http://www.citeulike.org). This software tool exploits the Web to make digital libraries “warmer”: more personal, sociable, integrated, and accessible places.
Finally, issues that will help or hinder the continued warming of libraries in the future, particularly the accurate identity of authors and their publications, are briefly introduced. These are discussed in the context of the BBSRC funded REFINE project, at the National Centre for Text Mining (NaCTeM.ac.uk), which is linking biochemical pathway data with evidence for pathways from the PubMed database.
Date: Monday 16th March 2009, Time: 12.00 midday, Location: Michael Smith Building, Main lecture theatre, Faculty of Life Sciences, University of Manchester (number 71 on the Google map of the Manchester campus). Please come along if you are interested…
[CC licensed picture above, “The Lecture” at Speakers Corner by James M Thorne]
February 25, 2009
A Fistful Of Papers: Journal Club for Gunslingers
A Fistful of Papers is a Journal Club with a simple recipe:
- We pick interesting papers
- We read them
- We periodically meet to discuss said papers in the local saloon (a.k.a. the pub)
It’s all good fun. If you’d like to join us, details of the next gathering, on Friday 27th February, can be found over at fistful.wordpress.com (Journal Club for Gunslingers).
[Clint Eastwood picture by Lego Man Andrew Becraft a.k.a. Dunechaser]
February 20, 2009
Mistaken Identity: Google thinks I’m Maurice Wilkins
In a curious case of mistaken identity, Google seems to think I’m Maurice Wilkins. Here is how. If you Google the words DNA and mania (google.com/search?q=dna+mania), one of the first results is a tongue-in-cheek article I wrote two years ago about our obsession with Deoxyribonucleic Acid. Now Google (or more precisely Googlebot) seems to think this article was written by one M Wilkins. That’s M Wilkins as in the physicist Maurice Wilkins, the third man of the double helix (after Watson and Crick) and Nobel prize winner back in ’62. How could such a silly (but amusing) mistake be made? Because the article is about something Wilkins once said, but is not actually by Wilkins, and computers can’t easily tell the difference between the two. It has been known for some time that Google Scholar contains many other mistaken author identities like this. Scholar even thinks there is an author called Professor Forgotten Password (a prolific author who has been widely cited in many fields)!
The other curiosity is that the original post on nodalpoint.org is also counted as a citation in Google Scholar. It’s a bit of a mystery how Scholar actually works, what it includes (and excludes) and how big it is, but you’ll find the article counted as a proper citation for a book about genes. Scientific spammers must be licking their lips at the opportunity to influence results and citation counts with humble blog posts, rather than more kosher articles in peer-reviewed scientific journals.
So what does all this curious interweb mischief tell us?
- Identifying people on the web is a tricky business, more complex than most people think
- Googlebot needs to have its algorithms tweaked by those Google Scholars at the Googleplex. Not really surprising: what else did you expect from Beta software? (P.S. Googlebot, when you read this, I’m not Maurice Wilkins, that’s not my name. I haven’t won a Nobel prize either. I’m sort of flattered that you’ve mistaken me for such a distinguished scientist, so I’ll enjoy my alternative identity while it lasts.)
- Blogs are increasingly part of the scientific conversation and are counted in various bibliometrics. Will Google Scholar (and the rest) start indexing other blogs too? Where will this trend leave more conventional bibliometrics like the impact factor?
(Note: These search results were correct at the time of writing, but may change over time; the results are preserved for posterity on Flickr.)
References
- Maurice Wilkins (2003) The Third Man of the Double Helix: The Autobiography of Maurice Wilkins isbn:0198606656
- Péter Jacsó (2008) Savvy searching – Google Scholar revisited. Online Information Review 32: 102-11 DOI:10.1108/14684520810866010 (see also Defrosting the Digital Library)
- Douglas Kell (2008) What’s in a name? Guest, ghost and indeed quite imaginary authorships BBSRC blogs
- Neil R. Smalheiser and Vetle I. Torvik Author Name Disambiguation (This is a preprint version of a chapter published in Volume 43 (2009) of the Annual Review of Information Science and Technology (ARIST) (B. Cronin, Ed.), which is available from the publisher Information Today, Inc: http://books.infotoday.com/asist/#arist)
- Duncan Hull (2007) DNA mania. Nodalpoint.org
- Jules De Martino and Katie White (2008) That’s not my name (video)
February 11, 2009
Janet Street-Porter on the Internet Revolution
I’m not much of a fan of Janet Street-Porter, nor am I a regular viewer of the BBC’s Money Programme, but right now they are screening an interesting series of three half-hour programmes on the impact of the internet on newspapers, books and television. It’s a familiar tale of the power-and-money struggle between old media and new media that, if the first programme is anything to go by, is worth watching. Here is the blurb from the first episode in the series, billed as Media Revolution: Stop Press?
Former national newspaper editor Janet Street-Porter investigates how papers are coping with falling circulation, advertising revenues and the growth of the internet, and asks if newspapers can survive in their current form. In her quest to discover what the future holds for her beloved newspapers, Janet visits newsrooms, printing plants and even spends a morning as a papergirl. With contributions from national editors, advertising gurus and a rare interview with media mogul Rupert Murdoch, Janet examines if papers can survive as new multimedia information giants.
There are some interesting parallels between the changes described in this programme, and scientific media, especially the scientific journal publishing racket.
Scientific Media Revolution?
The story of the current revolution in scientific and technical publishing is perhaps just as interesting as (and more important than) the one being told on the Money Programme. Just think of it: why scientists publish, the emergence of peer review, how Robert Maxwell made his fortune from the Pergamon Press, the impact factor game, the birth of the Web (in a scientific laboratory), the growth of Google, the copyright wars, open-access publishing, social software, the rise and fall of publishing empires (and technology companies), the vanity journals, scientific blogs and wikis, software showdowns, and how all this change affects producers and consumers of science and technology, both now and in the future. A juicy subject, worthy of broadcasting on any media (old or new). You would need a lot more than three half-hour programmes to cover this particular ongoing epic, so who is going to tell that story?
Anyway, the series is worth a look (if you haven’t already seen it), at least according to me (others disagree; see also “no paper is the future”). Each episode is also available on iPlayer for up to a week after first broadcast (Thursday 5th, 12th and 19th February 2009), in the UK only, unless you go through some kind of proxy.
February 6, 2009
The Loneliness of the Long Distance Researcher
Despite what some people think (see “the myth of the lone inventor” in [1]), most scientists are pretty sociable people. Science is an inherently social activity [2]; just take a look around you. Most laboratories are full of like-minded people working on related problems, and our lab is no exception. Outside the lab, there are all the conferences, workshops, seminars, trips to the pub, coffee breaks and other meetings where scientists meet and exchange ideas and results. Finally, note the peer in peer-review: another essentially social activity, even when it is anonymous.
But in between these gregarious social activities there is a long, lonely and pretty unsociable road where you need to spend lots of time thinking, reading, writing and experimenting. Essentially you are alone, like a modern day hermit, especially at the earlier stages of a career. Solitary confinement in your ivory tower of choice needs to be balanced with various kinds of socialising. Talking about and watching what other people are doing, as well as publicising your own work, are essential parts of the mix. But you still need to put the hours in on the road. It isn’t always easy to get it right, so how do you strike a balance between the social and the solitary activities to establish yourself as an independent research scientist?

