The Mirage of Prestige

About the Prestige Factor Database

 

Prestige Factor (the company, together with its products and services) folded after I submitted my manuscript. In light of this, Information Today, Inc. decided to provide a hot link from Paula Hane's commentary on the case to this review.

Prestige is a highly positive word, and so are its synonyms. Including it in a product or service name, as in the Prestige Factor database (published by the eponymous Canadian company), may have seemed a particularly smart decision, as prestige is also quite an international word - as we shall see, in more ways than one.

 

As a bonus, the name also implies rubbing shoulders with, or at least being in the same league as, the Impact Factor of the Journal Citation Reports (JCR) database of the Institute for Scientific Information (ISI). The impact factor has for decades been the most widely used yardstick to express the impact, status, standing, renown, importance, prominence, and influence, i.e. the prestige, of the more than 7,100 science and social science journals which ISI has selected for measuring their impact factor.
Although the impact factor is not a perfect measure, it is far more reliable and reproducible than any of the other methods (usage statistics, subjective evaluations by scientists and/or practitioners), and it covers far more journals in a relatively systematic manner, producing invaluable historical data series about the citation patterns of journals.

 

 

About the Impact Factor

 

In my salad days of reviewing CD-ROM databases, starting in the mid-1980s, the CD-ROM version of JCR was one of the three most important databases I used (along with the first edition of the Grolier Encyclopedia, which pioneered the technology, and the Oxford English Dictionary, which simply took my breath away despite the clumsiness of the software in the early edition). All of these offered features that users of the print/microfiche versions could not even dream of. I reviewed JCR both in my CD-ROM Currents column and in the Peter's Picks and Pans column.

Before exploring how a small company could undertake the daunting task of measuring the prestige of several thousand journals, let's have a quick review of the main traits of the JCR databases.

Having spent hundreds of hours with the JCR database since the release of its CD-ROM version in 1994, and then its Web version a couple of years ago, I am both enthusiastic and critical about JCR. It has an unparalleled collection of citations received and given by the more than 7,100 journals selected from the larger stable of journals processed for decades for the Science Citation, Social Science Citation, and Arts & Humanities Citation Indexes.

To get a sense of the volume of citations, consider that the most current JCR records 16,421,329 citations received by 5,684 science journals and 1,121,127 citations received by 1,697 social science journals in 2000 alone. Add to this the fact that JCR processes three years' worth of citations for every edition, although only two years of data are used for calculating the impact factor itself – for good reasons.

 

The impact factor is calculated by dividing the number of citations a journal received in a given year to items it published in the previous two years by the number of certain items (see below) it published in those same two years.

The current year’s citations and published items are used to calculate the immediacy index. The higher the immediacy index value, the “hotter” the journal, as it starts being cited as soon as it is off the press. Obviously, a journal published twice a year, say in June and December, cannot compete with the immediacy index of a weekly publication.
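The two ratios just described can be sketched in a few lines of Python. The journal counts below are invented for illustration; they are not figures from JCR.

```python
def impact_factor(cites_to_prev_two_years, items_prev_two_years):
    """Citations received this year to items published in the previous
    two years, divided by the countable items published in those years."""
    return cites_to_prev_two_years / items_prev_two_years

def immediacy_index(cites_to_current_items, items_current_year):
    """Citations received this year to items published this year,
    divided by the items published this year."""
    return cites_to_current_items / items_current_year

# A hypothetical journal: 300 citations in 2000 to its 120 items of
# 1998-1999, and 25 citations in 2000 to its 40 items of 2000.
print(impact_factor(300, 120))    # 2.5
print(immediacy_index(25, 40))    # 0.625
```

The made-up numbers also show why a semiannual journal is handicapped on the immediacy index: items published in December have almost no time to collect same-year citations.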


 

The impact factor algorithm has a deficiency: the denominator includes only original research articles, review articles which survey the literature of a specific topic (such as the human-computer interface), and notes (shorter research articles).

In determining the numerator, no distinction is made by the type of item cited: every citation received by a journal is counted. Therefore, journals that publish many book reviews, letters to the editor, editorial materials, and other document types not counted in the denominator may have a higher impact factor than they deserve, as these items are cited in corrections, responses, rebuttals, and retractions – all of which increase the numerator. See my article about this problem in Cortex.
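The distortion can be illustrated with a toy calculation (all counts below are invented): counting every citation in the numerator while restricting the denominator to citable items inflates the quotient compared with a type-consistent count.

```python
# Invented counts for one hypothetical journal over a two-year window.
# Every citation goes into the numerator, but only "citable" items
# (articles, reviews, notes) go into the denominator.
citations_received = {
    "articles": 180,       # citations to research articles
    "book_reviews": 60,    # citations to book reviews
    "letters": 10,         # citations to letters to the editor
}
citable_items = 100        # articles, reviews, and notes published

# The published impact factor counts all 250 citations:
inflated_if = sum(citations_received.values()) / citable_items

# A type-consistent count would use only citations to citable items:
consistent_if = citations_received["articles"] / citable_items

print(inflated_if, consistent_if)   # 2.5 1.8
```

In this sketch, citations to book reviews and letters raise the quotient from 1.8 to 2.5 even though the items they cite never enter the denominator.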

A Deficiency in the Algorithm for Calculating the Impact Factor of Scholarly Journals: The Journal Impact Factor. Cortex, 37(4), 2001, pp. 590-594.

 

This problem is aggravated by the fact that the assignment of document types is sometimes erroneous and often inconsistent. This may lead to such absurd situations as the one I reported about Contemporary Psychology, a journal which publishes only book reviews and a few editorial items. By impact factor, it was ranked #1 among nearly 500 psychology and psychiatry journals, and #2 among all the 1,600+ social sciences journals, in the 1999 JCR edition.

In spite of its deficiencies, the JCR database – when properly used – can provide unprecedented insight into the scholarly impact of journals. It takes quite some self-confidence (and something else) to launch a service to compete with JCR, as Prestige Factor did.

The Number Game. Online Information Review, "Savvy Searching" column, 24(2), 2000, pp. 180-183.


 

 

Essential misconceptions and disinformation in the Prestige Factor database

 

The FAQ file of the Prestige Factor database is full of misconceptions and misinformation. From the users' perspective, PF is not really a database but merely a static, well-presented collection of data in PDF format. It is like a print publication in digital form, with quite simple navigation and very rudimentary search options.

Its bold claim on the opening page sets the wrong tone by saying that the "IF measures the frequency that a journal is cited by other journals".

 

It is a well-known fact that almost all journals receive the largest portion of their citations from themselves, as illustrated by the citations to Psychological Reports. Entries in this table of JCR are sorted by the number of citations received by the cited journal from the different citing journals.

 

A paraphrased version of the false information dispensed in the banner ad shown above touts the advantage of PF, claiming that it measures the true value of academic journals. PF also belittles JCR throughout the FAQ pages, using grossly misleading information about its competitor. This is more than odd when you realize that without the JCR data, PF could not exist. The PF blurb sounds like an excerpt from the script of a late-night infomercial.

 

The side-by-side comparison of the PF and JCR databases (erroneously referred to in the FAQ as the Impact Factor database) is misleading in a foxy way. When comparing the number of journals in the Social Sciences editions of the two databases, the FAQ claims that JCR covers >1,000 journals while PF covers 1,468 journals. One would get the impression that PF is almost 50% larger than JCR.

In reality, the JCR 2000 Social Sciences edition includes 1,697 journals. Legally and technically, that is indeed more than 1,000, but the claim is as bamboozling as saying that Snow White lived with more than 4 dwarfs. The correct information is readily available the moment one opens the JCR database. This is a cheap method of disinformation, especially when the PF database – as I will discuss below – uses tens of millions of JCR data points for calculating the prestige factor of all the journals in its social sciences edition, which I thoroughly analyzed.

 

The FAQ file answers the question of what the selection criteria are for including a journal in PF. The answer sounds good, but I have a shorter one. The criterion is that the journal be covered by the JCR database set. Why? Because PF lifts the mountain of source data from the various component files of the JCR database, massages the data, applies undisclosed algorithms, and comes up with a variety of scores, including its most touted prestige factor.


 

Out of the 1,468 journals covered by the Social Sciences edition of Prestige Factor, 1,467 were in one of the two JCR 2000 databases. The only one which I was unable to identify and match in JCR was a journal abbreviated by PF as Mag-Econ-Finan-Meth-I-C.

The PF FAQ harps on the advantage of PF's using longer journal abbreviations than JCR. Again, this is true, but it is not the whole truth. JCR also has the full title of every journal, plus the ISSN, the publisher, and the frequency, language, and country of publication. PF has none of them. Many of its abbreviations are identical to those in JCR, and some are almost impossible to decipher.

 

As for classifying journals into Sciences and Social Sciences and then into categories, PF has an explanation. I think the classification has more to do with lessening the spitting-image resemblance to the JCR journals in the identical categories than with any reasonable grouping of journals by subject. Make no mistake, the policy is laudable, but the practice certainly is not.


 

I struggle to understand, for example, how Patient Education and Counseling could land in the Information and Library Science category by either the first criterion (journals with the highest percentage of articles in the subject) or the second criterion (journals with the highest number of articles in the subject). Either the software needed some serious recalibration, or the experts should have shown more expertise and refrained from assigning this journal – or the journal Landscape and Urban Planning, among other oddities – to the Information and Library Science category.

 

These two journals, along with the Drug Information Journal – which is definitely not an information and library science journal just because the word information appears in its title – have prestige factor scores 3-4 times higher than Library Quarterly, Library Resources & Technical Services, and Library and Information Science Research, and leave in the dust many other respected journals in our profession.

 

In the unfair comparison table, PF also brags that its latest version is of 2001 vintage, with data for 1998, 1999, and 2000, while JCR utilizes data only for 1998 and 1999. It is the same discombobulating comparison strategy mentioned above. JCR does utilize data from the year 2000 – not for calculating the impact factor, but for calculating the immediacy index, which ranks journals by the ratio of citations received in 2000 to items published in 2000.

 

 

The voodoo factor in Prestige Factor

 

There is a good reason for not incorporating citations to the most current year's items into the JCR impact factor: only a tiny percentage of items are cited in their year of publication. For the JCR 2000 Social Sciences edition, according to my calculation, merely 1.56% of all the citations received in 2000 were for items published in 2000.

Even this low percentage does not reveal the fact that 437 of the 1,697 journals in the Social Sciences edition did not receive a single citation, and 253 journals received only 1 citation, in 2000 for items published in that same year. This puts in perspective how important PF's "innovation" of factoring the most current year's citation rate into the prestige factor really is.
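For a sense of scale, the 1.56% share can be converted into an approximate absolute count using the total quoted earlier. This is back-of-the-envelope arithmetic on the figures in the text, not a number taken from JCR itself.

```python
# Back-of-the-envelope arithmetic: converting the 1.56% same-year share
# into an approximate absolute count, using the total number of citations
# received in 2000 by the JCR Social Sciences journals quoted earlier.
total_cites_2000 = 1_121_127
same_year_share = 0.0156

same_year_cites = round(total_cites_2000 * same_year_share)
print(same_year_cites)   # roughly 17,490 same-year citations
```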

The PF algorithm decreases the prestige (by PF’s standards, that is) of such indeed prestigious journals in our profession as Library Trends or Library and Information Science Research, which had no same-year citations in JCR 2000.

 

The FAQ file uses a single example to demonstrate the superiority of PF over JCR. As a law school graduate, I found it ludicrous that PF ranks the prestige of the Indiana Law Journal (ILJ) ahead of the Stanford Law Review (SLR), and so would most lawyers, judges, law professors, and students. They would find the JCR impact factor, which ranks SLR #2 and ILJ #207 in the social sciences category, much more realistic. (It is not appropriate to compare scores in a category as broad and diverse as social sciences [instead of just law], but that is the smallest problem with PF.)

For starters, the U.S. Supreme Court cited SLR 8 times between 1996 and 1998. It did not cite ILJ at all in the same period – according to a recent article in the Indiana Law Journal, which certainly was not biased in SLR's favor.

 

According to my research in the Social Science Citation Index, ILJ published 869 items between 1975 and 2001 and received 2,281 citations for those items. SLR published 1,123 items in the same period, which received 8,470 citations. Although I did not include citations possibly received from journals analyzed by the Science Citation Index and/or the Arts & Humanities Citation Index, and I excluded cited-journal name variants which occurred fewer than 5 times or were ambiguous, my estimated 25-year impact factors reflect fairly well the real prestige difference between the two journals.
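The citations-per-item ratios implied by these counts are easy to verify:

```python
# Citations-per-item ratios over the period analyzed above.
ilj_items, ilj_cites = 869, 2281     # Indiana Law Journal, 1975-2001
slr_items, slr_cites = 1123, 8470    # Stanford Law Review, same period

ilj_rate = round(ilj_cites / ilj_items, 2)
slr_rate = round(slr_cites / slr_items, 2)
print(ilj_rate, slr_rate)   # 2.62 7.54
```

SLR averages roughly three times as many citations per published item as ILJ over the period, which is consistent with the prestige gap described above.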

 

Why, then, did PF choose these two journals as the example, and how could it arrive at such ludicrous prestige factor scores? The answer to the first question is simple: ILJ's seemingly dramatic improvement in "prestige" fit the agenda and the public relations rhetoric of PF.

The creators of PF touted the magic of their formula so much that they came to believe it themselves, and they accepted its ranking, which identified the Indiana Law Journal as the #1 journal by relative growth. They took at face value the numbers they grabbed from JCR. To their misfortune, the wrong number was exactly the one PF so proudly bragged about and incorporated into the prestige factor algorithm – the number of citations received in 2000 for items published in 2000.

 

JCR itself has this information wrong. The error did not affect the impact factor in the 2000 JCR edition, but it would affect the 2001 edition unless it gets corrected. Even in the JCR 2000 edition, it should have alerted the editors that ILJ achieved the #1 position by immediacy index among all 1,697 social science journals, head and shoulders above the perennially hottest journals, the Harvard Law Review and the Stanford Law Review, which are cited before they are even off the press. Even if ILJ had published a joint article by Cochran and the defense attorney in the fatal mauling case, painting Simpson and the dogs as the victims, plus dozens of letters to the editor in the next issue, such a rise from one year to the next would not be feasible for the Indiana Law Journal, and neither would the absolute number of total cites received in 2000.

 

Such errors really must be caught at the quality control stage; as I preach in my courses about database design, the software should flag numbers which are not within the plausibility range. Even to the naked eye, the data for ILJ are obviously absurd. You don't need to know law to recognize that something is fishy when a journal all of a sudden receives an unusually high number of citations, with an eye-popping percentage of them for items published in the very same year.

 

The real culprit this time was the Indiana Law Journal itself. In case you don't know, university law journals are edited by law students. In my 20 years of teaching practice, I have met 3-4 students who would have qualified to be editors already during their studies, but in law schools the majority of students become editors for a year or two and exercise full editorial power.

No wonder that Oliver Wendell Holmes in 1911 described student-edited law journals from the bench of the U.S. Supreme Court as "the work of boys" – a remark quoted in many articles which discuss the standing of law journals. Quite often these boys are the most self-assured on campus – second only to the bench warmers of the not-so-stellar football team.

The boys of the Indiana Law Journal apparently could not get right even the simple chronological and numerical designation of the serial. Volume 74 was published in 1999; still, the Winter issue appeared with the year 1998. This was not the culprit in the 2000 mishap, but it shows that such sloppiness is not an exception in the editorial office of this journal.

 

The absurd numbers behind the ludicrous ranking of the Indiana Law Journal in JCR, and consequently in PF, are due to another error by the editors, who designated volume 74, number 4 as published in 2000 instead of 1999. Most (if not all) of the papers citing articles from that issue used the wrong year, which in turn was picked up by the citation indexes, then by JCR, and then by PF – leading to an unprecedented immediacy index not only among law reviews but among all social science journals in JCR, and to the glorified status of ILJ in PF, which had already used the 2000 citation data from JCR to calculate the prestige factor.

 

To the credit of the editors, the real 2000 issues have the correct volume and issue numbers. It still may not qualify as adult work, however: in the table of contents of the first 2000 issue, the title of the article describing the history of the Indiana Law Journal has a typo. That article, by the way, is one of those which cite Justice Holmes' disparaging opinion about student-edited law reviews.

 

 

More voodoo acts and claims

 

The Social Sciences edition of the PF database has 1,468 journals, 1,259 of which are also in the Social Sciences edition of JCR 2000. So are the other 209 journals in PF not covered by ISI? No: 208 of them are in the Sciences edition of JCR 2000. The source of the underlying data used for calculating the prestige factor is mentioned nowhere in the PF documentation, but it is obviously the set of the two JCR databases.

In a sense, PF's silence about the sources of its data is understandable. It would take several hundred thousand dollars to subscribe to the 6,222 scholarly and professional journals covered by the entire PF database, and the company would have needed three years of subscriptions just to create the 2001 edition (as they refer to it).

I do not even venture to estimate how many hours of work would have been needed to process the more than 50 million citations. The common subset of 1,259 social science journals covered by both PF and JCR has 52,611 items and 887,906 citations received for the year 2000 alone. Science journals have far more citations per article than social science journals: the 5,584 journals in the JCR 2000 Sciences edition received nearly 16.5 million citations in 2000 alone. The task of processing three years of citation data for the 6,222 PF journals is more than daunting – unless such data is readily available and one helps oneself to it, sparing the lion's share of the work.

 

On the CD-ROM version of JCR, all the data required for calculating the impact factor, the immediacy index, and the cited and citing half-life measures of 7,129 journals are available – in a standard MS Access file format.

Some of what appears on screen in the normal use mode, such as the Journal Ranking List (shown in the upper part of the image here), can easily be saved in a comma-and-quote delimited format, which in turn can be imported into most database and spreadsheet programs. ISI offers this to make some of the data available for post-processing by regular users, and it is certainly appreciated by researchers.
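Such a comma-and-quote delimited export can be post-processed with a few lines of Python. The field names and values below are invented for illustration, since the actual layout of the Journal Ranking List export is not reproduced here.

```python
import csv
import io

# A mocked-up two-line excerpt in the comma-and-quote delimited style
# of the export; the field names and values are invented.
export = io.StringIO(
    '"Rank","Journal","Impact Factor"\n'
    '"1","HYPOTHETICAL J EXAMPLE","12.345"\n'
)

# csv handles the quoted fields, so each row becomes a name -> value map.
rows = list(csv.DictReader(export))
print(rows[0]["Journal"], float(rows[0]["Impact Factor"]))
```

From here the rows can be loaded into any database or spreadsheet program for ranking, filtering, or further calculation.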

Some of the other lists, such as the Source Data shown in the lower part of the image, are only viewable for mere mortals. You may print the data individually for each journal, but not en masse as you can with the Journal Ranking List.

 

The very informative Cited Journal Listing and Citing Journal Listing tables below can be viewed and printed in their entirety, and those who grew up in the DOS operating environment may do some tweaking to redirect the print output to a text file for interesting calculations not available in JCR, such as the self-citation rate of journals.
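One such calculation, the self-citation rate, can be sketched as follows; the citing-journal counts are invented for illustration, in the spirit of the Cited Journal Listing described above.

```python
# Invented citing-journal breakdown for one target journal.
cites_received = {
    "TARGET J": 480,    # self-citations
    "OTHER J A": 300,
    "OTHER J B": 220,
}

def self_citation_rate(target, table):
    """Share of a journal's citations that come from the journal itself."""
    return table[target] / sum(table.values())

print(round(self_citation_rate("TARGET J", cites_received), 2))   # 0.48
```

A high value like this toy 48% illustrates the point made earlier: most journals receive the largest single portion of their citations from themselves.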

 

Then there is the raw data on the CD-ROM (meant for the program, not for the users), which is not in an encoded, proprietary format but in plain text. The sea of numbers is not meant for the human eye and may be intimidating to a novice, but it is quite easily recognizable if you are familiar with the data elements used in JCR.

Even a novice can easily recognize the citation data (such as which journals cited the target journal, and how many times, over the past ten years) and parse it into a database or spreadsheet.

ISI may not have contemplated the abuse of its invaluable collection of citation data and left it pretty much unprotected. But just because the door of a Porsche is unlocked does not mean that you can take it for a joy ride, let alone repaint it, add little gizmos here and there, put a new license plate on it, and then offer it for licensing, er, rental.

 

While JCR makes public all the information and all the algorithms it uses to calculate its various measures, PF is as secretive as a group of tobacco industry executives regarding its sources, components, and algorithms. When it unveils a part of its algorithm, it reads like the abracadabra of a magician: highfalutin terms like mothered term, scrutinized keywords, and rescued keywords fill the page, making self-improvement books look unassuming by comparison.

 

Interestingly, PF regains its clarity when it comes to defining the Terms of Use and limiting what you can do with the data. It was particularly funny to see the legal document state that "a derivative or product based on Prestige Factor scores is prohibited". To me, the non-practicing legal eagle, PF itself seems to be clearly a derivative product of JCR. Obviously, PF did not think so.


 

The PF Terms of Use document also regulates legal proceedings, designating Santo Domingo in the Dominican Republic as the location for any arbitration. It is an odd location for a Canadian company doing business with predominantly European, Canadian, and U.S. companies. Yes, Santo Domingo is well known for quick verdicts in divorce cases, but hardly for copyright infringement suits. It seems that the lawyers of PF will have to pack their suits for some travel, not to Santo Domingo but to New York, where ISI filed its civil suit against the company. We shall see whether the federal court considers PF a derivative work and hence a copyright infringement.


 

For a while I could not fathom why the Prestige Factor name was chosen for something that would at best qualify as a novelty factor, a.k.a. the immediacy index, in a research exercise using JCR data. My linguistic interest helped me out. Prestige comes from the Latin word praestigiae, which meant delusion, illusion, and jugglers' trickery – a meaning labeled archaic by most dictionaries that still mention it. I think PF is a novel illusion of the 21st century, and you had better stay away from it.

 
