
The “real” impact factor of Nucleic Acids Research is ~5.6

Posted 27 Apr 2010 — by maxhaussler
Category scientometrics, text mining

Nucleic Acids Research is a respected journal that publishes articles on topics such as restriction enzymes and DNA analysis. Twice a year it runs a “special issue” with updates on databases and bioinformatics tools available on the internet. These short “method papers” usually just summarize what has been added to the database or tool and rarely report research results directly. But they do attract a lot of citations: people who use a given website or software tool are expected to cite the corresponding method article in Nucleic Acids Research.

Is this practice increasing the journal's impact factor? Certainly, but by how much? It turns out that answering this question takes only 45 minutes of web searching and a one-line program (or an Excel formula).

The 2008 impact factor is based on articles published in 2006 and 2007. So I downloaded the citation data from Scopus and, dividing the number of citations (14957) by the number of articles (2260), arrived at an impact factor of 6.62. (The official impact factor is 6.87; the difference is probably because my list copied from Scopus includes some articles that ISI Thomson considers “not-citable”, or because Scopus has less data than Thomson.) The 444 articles in special issues attracted 4750 citations. If NAR did not have special issues, its impact factor would therefore be (14957 - 4750) / (2260 - 444), or around 5.6 (a bit higher, perhaps 5.8, due to the non-citable issue).
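The arithmetic above can be rechecked in a few lines of shell. This is only a sketch: the function name impact_factor is made up for illustration, and the four numbers are the ones quoted in the paragraph above.

```shell
# impact_factor CITES ARTICLES: print citations per article, rounded to one decimal.
# (Hypothetical helper name; awk does the floating-point division.)
impact_factor() {
  awk -v c="$1" -v a="$2" 'BEGIN { printf "%.1f\n", c / a }'
}

# All 2006-2007 articles in the Scopus export.
impact_factor 14957 2260

# Same, after subtracting the 444 special-issue articles and their 4750 citations.
impact_factor $((14957 - 4750)) $((2260 - 444))
```

The first call prints 6.6 and the second 5.6.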

The Scopus data therefore let anyone quantify how much methods papers, reviews or original research contribute to a journal's impact factor.

How to redo this: Go to Scopus, search for “srctitle(nucleic acids research)”, select “Nucleic Acids Research” and Year: 2006, click limit, click “Citation tracker” and export as a text file. Repeat the same thing for 2007. Convert the text files to tab-separated files with Excel and run the following one-liner on them:

cat 2006.txt 2007.txt | grep -v rratum | gawk 'BEGIN {FS="\t";}
# grep drops Erratum/erratum rows; rows starting with a year are counted as articles
# column 11 is summed as citations; an empty column 7 is counted as a web/db special-issue article
/^200[0-9]/ {articles+=1; cites+=$11; if ($7=="") { webArticles+=1; webCites+=$11;}}
END {print "articles:"articles; print "citations:"cites; print "impact factor:" (cites/articles);
print "web/db articles:"webArticles; print "web/db article citations:"webCites;
print "impact factor without web/db issue:" ((cites - webCites)/(articles - webArticles));}'
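Before trusting the numbers from a real export, the same counting logic can be tried on a few invented rows. The sketch below assumes only the layout the one-liner itself relies on (year at the start of the line, column 7 empty for special-issue articles, citation count in column 11). The rows, the filler values and the function names sample_rows and summarize are all made up for illustration, and plain awk is used instead of gawk.

```shell
# Three invented tab-separated rows with 11 columns each.
# The second row has an empty column 7, so it counts as a special-issue article.
sample_rows() {
  printf '2006\ta\tb\tc\td\te\t12\tg\th\ti\t10\n'
  printf '2006\ta\tb\tc\td\te\t\tg\th\ti\t30\n'
  printf '2007\ta\tb\tc\td\te\t5\tg\th\ti\t20\n'
}

# Same counting as the one-liner; prints the impact factor with and without special issues.
summarize() {
  awk 'BEGIN {FS="\t"}
  /^200[0-9]/ {articles++; cites+=$11; if ($7=="") {web++; webCites+=$11}}
  END {printf "%.1f %.1f\n", cites/articles, (cites-webCites)/(articles-web)}'
}

sample_rows | summarize
```

With these toy rows the output is "20.0 15.0": 60 citations over 3 articles, then 30 citations over the 2 remaining articles once the special-issue row is removed.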

Note: a previous version of this post estimated the impact factor to be 4.5. This was wrong: I had forgotten to subtract the special-issue articles from the total number of articles. I am very sorry for this bug.