Sunday, October 20, 2013

Professor for one year (week 23): Repositories

In today's scientific community, it's all about publications, journals, and impact factors.  So you not only publish your research in a highly ranked journal or at a highly ranked conference, but you also keep track of your publications, make lists, allow others to access your publications, and hope to get cited.

For me, the best place to store and maintain publications is CiteULike:

I have full control over what is listed here, I can store full papers, and I can extract the publication metadata as BibTeX data.  This also means that the available categories are consistent with the usual BibTeX categories, i.e., article in proceedings, book, edited book, book chapter, thesis, technical report, etc.

CiteULike is also the place where I store the bibliographic information of all papers and books I read (or want to read), as well as the papers themselves.  I put quite some effort into this: if the paper is available online, I upload it and add the link; if it's from a paper-only publication, I scan it and then store it there.  So for almost all of the seven hundred and something papers in my library, I can access the paper itself.  I add abstracts and keywords.  I can export the metadata as BibTeX and thus use it for references when writing an article myself.  It's consistent and up-to-date.  There is a very convenient feature: the Post-to-CiteULike button you can add to your browser.  In most cases, I can add a reference by using this button and then maybe correcting some information.

The only drawback with CiteULike I have encountered over the years:  I cannot download all the stored papers at once.  It would be very convenient to have off-line access to all data when writing an article.  However, abstracts and keywords are included in the BibTeX export, which is already quite nice.
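
Because abstracts and keywords travel with the BibTeX export, an exported file can at least be searched off-line.  The following is only a minimal sketch in Python, not a full BibTeX parser: it assumes one field per line with brace-delimited values, and the sample entries, the field layout, and the names parse_entries and search are all invented for illustration.

```python
import re

# Hypothetical sample in the general shape of a BibTeX export.
# The entries and all field values are invented for illustration.
SAMPLE = """
@article{doe2010,
  author = {Doe, Jane},
  title = {An Example Title},
  year = {2010},
  keywords = {writing, nlp},
  abstract = {A short abstract about writing research.}
}
@inproceedings{roe2012,
  author = {Roe, Richard},
  title = {Another Example},
  year = {2012},
  keywords = {e-learning},
  abstract = {Something about e-learning.}
}
"""

# Assumption: every entry ends with a closing brace on a line of its own.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.S)
# Assumption: every field sits on one line as "name = {value}".
FIELD_RE = re.compile(r"(\w+)\s*=\s*\{(.*?)\}\s*,?\s*$", re.M)


def parse_entries(text):
    """Return one dict per entry, with the citation key and its fields."""
    entries = []
    for kind, key, body in ENTRY_RE.findall(text):
        fields = {name.lower(): value for name, value in FIELD_RE.findall(body)}
        entries.append({"entry_type": kind.lower(), "key": key, "fields": fields})
    return entries


def search(entries, term):
    """Return citation keys whose abstract or keywords contain the term."""
    term = term.lower()
    return [
        entry["key"]
        for entry in entries
        if term in entry["fields"].get("abstract", "").lower()
        or term in entry["fields"].get("keywords", "").lower()
    ]


if __name__ == "__main__":
    library = parse_entries(SAMPLE)
    print(search(library, "e-learning"))
```

A real export may use multi-line values, quoted strings, or nested braces, in which case a dedicated BibTeX library would be a safer choice than these regular expressions.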

I also have a ResearchGate account I try to maintain:

There are some odd things with ResearchGate: Look at the very first entry; it refers to the proceedings of SFCM 2013.  I'm one of the editors of this book.  Yes, it's an edited book and/or "conference proceedings (whole)."  However, ResearchGate lacks both of these categories: if you choose "book", you have to input "authors."  Only after I had exchanged quite a few e-mails with them were they finally able to list the publication at all.  But it now has both authors and editors, and they are identical.  Very annoying.

Next, ResearchGate tries to find your publications on the Web.  When creating an account, this is somewhat convenient: you don't have to input all your publications by hand, which matters because uploading a valid BibTeX file as exported from CiteULike doesn't work properly.  However, as publications are listed in various places, sometimes with different (wrong) metadata, it also finds incorrect data.  And it keeps proposing to add this data again and again; it's impossible to stop this.  Oh, and the number of citations is wrong.

However, it's a nice place to be informed about recent publications of colleagues.  I also had an eye-opener some time ago: When you list the people you follow, their names, ResearchGate scores, and impact factors are listed.  I follow some linguists, some computer scientists, some psychologists, and some computational linguists.  And it's very easy to recognize the computational linguists: they have very low impact factors or none at all.  A very nice empirical confirmation that journals (and, moreover, indexed journals) don't play a big role in computational linguistics.

I also have a Google Scholar account I try to maintain:

Google Scholar doesn't come up with suggestions to include incorrect data and I can add information myself.  The citation number seems to be somewhat realistic, although you never know about the missing citations (except your self-citations).

I also have an account with Microsoft Academic Search I try to maintain:

Here, incorrect information is included: I'm not into Chemistry, although I did my secondary school written exam in Chemistry, but how should Microsoft know about this?  You can try to edit and add data, but it takes a long time until it finally appears.  The number of citations is incorrect, and so is the list of conferences.

I put most of my publications in ZORA, the Zurich Open Repository and Archive, hosted by the University of Zurich:

I don't maintain this list anymore, so publications there stop in 2011.

I list my publications on my website:

Here the order is by type of publication (journal articles, edited books, book chapters, conference/workshop papers, other), as is usual in the humanities.  If possible, I give the link to the full paper, i.e., to a repository (most of the time it's ZORA) or to the original article if published as open access.

And of course I need a list of publications for job and grant applications.  Here I order them by year and mark the topic (linguistics, NLP, e-learning, writing research).  Additionally, I list the publications "in press" with the date of submission of the final version.

And yes, publication number 32, final version submitted in October 2011, is still "in press".

Oh, and some of my publications are listed in the catalogue of the German National Library and in the ACM Digital Library.

I seem to have an account with Academia, which I don't maintain:

The only reason for this profile: I wanted to download a paper from there following a provided link.  But I had to click through a dozen pop-ups to create my own account before being able to do so.  I never access this site, and so it includes invalid information.  Most importantly, no information on co-authors is given.  Actually, it seems to be impossible to list co-authors at all.

I still have an account at ResearcherID:

I don't maintain this account.  The interface is ugly and maintaining the list is annoying -- you have to use an online version of EndNote, which is slow and not very user friendly at the same time.  I think I also lost the password, so I cannot delete this account.

I also had an account with Mendeley, but I managed to delete this one some months ago.  It was as user friendly as ResearcherID, and I never used it.

That's a lot of places.  But wait, I also added some publications to the Konstanzer Online-Publikations-System (KOPS) and some to the Uni Basel Research Database.  These are the publications that were published while I was employed at these places.  And yes, some publications appear in both places, some because I was employed at both places at the same time, some because the co-authors belong to different institutions:

I also used to list my publications on the websites I had at the University of Zurich and at the University of Basel.  I don't maintain a website at the University of Konstanz; there is only a link to my private-professional website.  And when I start a new job at another institution, I will probably link to my CiteULike list of publications only.

But maybe the perfect repository is still to come?

1 comment:

mxp said...

I very much enjoyed this post, spot on. In fact, it prompted me to write a blog post of my own to add a few points.