Thursday, February 7, 2013

Impact factor vs. H-index

I've been thinking about journal rankings and impact factors lately, partly because I noticed that impact factors and H-index sometimes give quite different rankings for journals. To pick one example, Neuropsychologia and Cortex are two very good cognitive neuroscience journals that publish similar sorts of articles, but the impact factor for Cortex is substantially higher than for Neuropsychologia (6.08 vs. 3.636), whereas the H-index is substantially higher for Neuropsychologia than for Cortex (67 vs. 41). In case you are unfamiliar with these measures, impact factor is basically the mean number of citations to articles that were published in the past 2 years; H-index is the largest number h such that h articles have at least h citations each. So the articles published in Cortex have been cited an average of about 6 times and there are 41 articles that have been cited at least 41 times; the articles in Neuropsychologia have been cited an average of about 3.6 times and there are 67 articles that have been cited at least 67 times. Since both measures are based on citation rates, why do they give different rankings? 
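
To make the two definitions concrete, here is a minimal Python sketch. The function names and the toy citation counts are hypothetical, chosen only for illustration, and `mean_citations` is just the simplified "mean citations per article" reading of impact factor given above, not the official calculation.

```python
def h_index(citations):
    """Largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank      # the top `rank` articles all have >= rank citations
        else:
            break         # counts only get smaller from here on
    return h


def mean_citations(citations):
    """Simplified stand-in for impact factor: mean citations per article."""
    return sum(citations) / len(citations) if citations else 0.0


# Hypothetical citation counts for seven articles (not real journal data).
toy_journal = [10, 8, 5, 4, 3, 0, 0]

print(h_index(toy_journal))                   # 4
print(round(mean_citations(toy_journal), 2))  # 4.29
```

Four articles have at least 4 citations each, but there are not five with at least 5, so the h-index is 4 even though the mean is a little higher.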



One simple factor is timing. The standard impact factor is computed over the past 2 years, but the default H-index (at least on Google Scholar) is over the past 5 years. The 5-year impact factors for Neuropsychologia and Cortex are almost the same (4.504 and 4.765, respectively), so maybe it's just that citations to articles in Cortex tend to happen within the first 2 years after the articles come out (more on this in a minute). Two years might be a reasonable time window for some fields, but in psychology and cognitive neuroscience, where the review process often takes 6 months or more and an article can be "in press" for another 12 months, 2 years is way too short to measure an article's impact.

Plus, timing can't be the whole story because h-index rates Neuropsychologia substantially above Cortex, not just equal to it. A second factor might be the distribution of citation rates. The figure below shows two hypothetical distributions of citation rates that have the same mean (~3, indicated by the vertical black line) and different amounts of skew. Because they have the same mean, they have the same impact factor, but the high-skew distribution produces a much higher h-index (10 vs. 5).

So one possibility is that the citation rates are more skewed for Neuropsychologia than for Cortex. If Neuropsychologia has a larger right tail (highly-cited articles) then the higher h-index is correctly capturing its greater impact on the field. But what about the left side of the distribution -- what does it mean if the rarely-cited articles in Cortex tend to have, say, 2 citations instead of 1?
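
The skew argument can be checked with a small Python sketch. The two lists below are invented for illustration (they are not the data behind the figure and not real journal data); they were chosen so that both have a mean of exactly 3 citations per article while differing in how many highly-cited articles they contain.

```python
def h_index(citations):
    """Largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Two hypothetical journals, 100 articles and 300 citations each.
# Low skew: almost every article sits close to the mean.
low_skew = [5] * 5 + [3] * 90 + [1] * 5
# High skew: a handful of heavily cited articles, the rest rarely cited.
high_skew = [15] * 10 + [2] * 60 + [1] * 30

for name, journal in [("low skew", low_skew), ("high skew", high_skew)]:
    mean = sum(journal) / len(journal)
    print(name, "mean =", mean, "h-index =", h_index(journal))
# low skew mean = 3.0 h-index = 5
# high skew mean = 3.0 h-index = 10
```

Under these assumptions the mean cannot tell the two journals apart, while the h-index doubles because it only responds to the right tail.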

It so happens that in 2012 Cortex published an article examining whether country of origin affects length of the review process (Valkonen & De Lucia, 2012) and this article seems to have cited a very large proportion of the articles published in Cortex between 2010 and 2011. Indeed, a similar article examining geographical distribution of Cortex publications was published in 2010 (Foley & Della Sala, 2010) and cited just about every article published in Cortex between 2008 and 2009. If I understand the impact factor calculation correctly, each of these two publications increased Cortex's impact factor by an entire point, because one extra citation to every article in the 2-year window adds one to the mean. Given its current impact factor of around 6, that one point means that the impact factor would otherwise have been around 5, so one article raised the journal's impact factor by about 20%. It also (probably) increased the h-index, but by a comparatively small amount because only the articles that were already near the h-threshold (41) would have been affected -- articles that already had many more or many fewer than 41 citations would not be affected by one more citation.
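
The arithmetic behind this can be sketched in Python. The citation counts are hypothetical (ten made-up articles with a mean of 5, loosely echoing the "around 5 to around 6" example), and for simplicity the same set of articles is used for both measures, which is an assumption rather than how the real 2-year and 5-year windows work.

```python
def h_index(citations):
    """Largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def mean_citations(citations):
    """Simplified stand-in for impact factor: mean citations per article."""
    return sum(citations) / len(citations)


# Hypothetical citation counts before the survey-style article appears.
before = [14, 11, 9, 6, 5, 2, 1, 1, 1, 0]
# One new article cites every article in the window exactly once.
after = [count + 1 for count in before]

print(mean_citations(before), mean_citations(after))  # 5.0 6.0
print(h_index(before), h_index(after))                # 5 5
```

In this toy case the mean rises by a full point (20%) while the h-index does not move, because no article was sitting just below the threshold. With different counts the h-index could rise, but by at most one per article that cites everything.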

I am interested in the meta issues of scientific publication (obviously, since I am writing this post about impact factors), so I found both of those articles about country of origin effect quite interesting. However, even if these articles have no impact on the field whatsoever, they seem to have given the journal's impact factor a huge boost. 

This is not to pick on Cortex, which I think is an excellent journal that consistently publishes high-quality articles (full disclosure: I have an article in press there). My point is that, notwithstanding its name, the impact factor is not a very good measure of impact. It is hard to quantify scientific impact and no single number - no matter how cleverly computed - should be considered the whole story, but if I had to pick a number, I'd take h-index over impact factor.

7 comments:

  1. Very interesting summary!
    I have just cited your post on our website, home of the International College of Affective Neuroscience (ICANS)
    Regards,
    Beatriz

  2. Quick question: is the H index also dependent on the total number of published articles in a given time period? E.g. if a journal published 100 articles over 5 years, its H index can't ever be higher than 100, whereas a journal that published 1000 articles during the same time has a much higher ceiling.

    1. Yes, the number of articles does impose an upper limit on h-index but not on impact factor. However, for that limit to be reached, all 100 articles would have to get at least 100 citations. Given that the modal number of citations for a scientific article is 0 and only a small proportion even get 10, it seems unlikely that this limit would ever come into play.

  3. Thanks for clearing this up and that too in an easily understandable way, it had been bugging me for a long time.
