The following article was first published in Chemical Heritage, 18(3) (Fall 2000), pp. 43–45. It is reproduced here with permission.
How Laboratory Explosions Led Me to Document the Information Explosion
By Eugene Garfield
Eugene Garfield, founder and chairman emeritus of the Institute for Scientific Information (ISI) and publisher of The Scientist, was a pioneer in science information systems, particularly in the chemical sciences. ISI’s Index Chemicus, Science Citation Index, and series of Current Contents services all cover various facets of chemistry. In this autobiographical sketch, Garfield traces his personal evolution from laboratory chemist to information entrepreneur.
In 1949, as a young chemist with a recent B.S. from Columbia University, I spent my days in the lab at Evans R&D, a chemical consulting firm, measuring the viscosity of potential shampoo products. Two events signaled that I would eventually forsake the field of chemical research for my current career in scientific information. The first was a laboratory explosion, one in which a fellow lab worker was injured. Less dramatically, I was one day asked to summarize a meeting with a client who was investigating substitutes for natural products imported from Asia; someone had noticed that my résumé listed typing and the office skills I had learned in my previous job as a sales clerk at LaSalle University. My success in summarizing the meeting was an early sign of my aptitude for extracting the important nub of information from the larger picture.
While still at Evans, I received a call from my cousin Sid Bernhard, who was working on his doctorate in chemistry at Columbia. Sid’s professor, the eminent physical organic chemist Louis P. Hammett, had an opening for a lab assistant. So I gladly left Evans for Hammett’s lab, where I was assigned the task of synthesizing a long series of esters. These were needed to test theories of acid-base catalysis.
One day, after months of seemingly endless lab work, I discovered a hall closet outside the lab in Havemeyer Hall where decades of graduate students had deposited thousands of synthesized compounds. When I discovered that some were those I had already produced, I decided to search the collection before doing another synthesis from scratch. I soon learned the value of searching the literature—for the same reason. Access was easy: Hammett was editor of the McGraw-Hill chemistry book series, and his excellent personal library included a complete set of Chemical Abstracts, published by the American Chemical Society (ACS). But my second laboratory explosion prompted me to leave Hammett’s lab and pursue a different career path. Relying on my stenographic and typing skills, I applied for a post as chemical secretary to a research director at the Ethyl Corporation. My interview took place at the 75th anniversary meeting of the ACS in New York City in March 1951.
After the interview I stumbled by accident into the sessions of the Division of Chemical Literature, where I met James W. Perry. He soon introduced me to Sanford Larkey of the Welch Medical Library at the Johns Hopkins University. Within a few months Larkey employed me as a research assistant on the Welch Medical Project, which dealt with machine methods for indexing for the Army Medical Library. When Larkey asked Hammett for a reference, Hammett replied: "Garfield is not particularly imaginative but a very hard-working fellow"—just what Larkey wanted. I have often stated that my years at the Welch Medical Library (1951–53) laid the foundations for the establishment of the Institute for Scientific Information (ISI).
At Johns Hopkins I became steeped in the details involved in producing Chemical Abstracts and other abstracting services. I served as a volunteer abstractor of Spanish pharmacology papers. As the Welch Medical Project chemist, I became familiar with a variety of chemical information activities and met most of the information pioneers who were the backbone of the American Documentation Institute, now the American Society for Information Science (ASIS). Since I had worked at the Welch Project on the chemical nomenclature used in Medical Subject Headings (MeSH), I understood the need for new approaches to retrieving chemical information, and I was mentally prepared to meet further challenges in that arena.
After leaving the Welch Project in June 1953, I attended the Columbia University School of Library Science from 1953 to 1954. During that time I wrote my primordial paper, "Citation Indexes for Science," published in 1955 in Science. A year later I also published a paper on an experimental citation index for searching chemical patents. This first patent citation index was based on a file of several thousand patents supplied to me by Marge Courain of Merck and Company in Rahway, New Jersey, who had been a classmate in library school.
In the summer of 1954 I went to work as a consultant for Smith, Kline and French Laboratories in Philadelphia. Within a year I began the biweekly service Management DocuMation Preview, later changed to Current Contents of Management publications. This Current Contents service preceded the weekly service started in 1956 that became Current Contents of Pharmaco-Medical, Chemical and Life Sciences. Miles Labs was the first company to contract for this product; others soon followed, paying $1,500 for 25 copies per year. The impact of the Current Contents services (seven in all) has essentially been ignored by those who discuss major advances in the history of scientific information processing. Its simplicity and unconventional format left little for theoreticians to discuss. And its timing was in fact revolutionary: it had been unheard of to deliver indexed information weekly.
In 1958 Robert A. Harte of Merck, who chaired a committee of the Pharmaceutical Manufacturers Association, negotiated a contract with my company (then Eugene Garfield Associates, Information Engineers) to index and code all new steroid compounds reported in the literature. The data were to be delivered to the U.S. Patent Office. Donald Andrews was to lead a group of patent examiners who would test the feasibility of conducting patent searches using IBM punched cards. The scanning and coding work on this contract led me to recognize that it would be possible to locate and identify newly reported compounds algorithmically. That recognition eventually led me to propose the Index Chemicus in 1959 and to launch it in 1960. The first issue was dedicated to my mentor at Smith, Kline and French, Ted Herdegen, who died that year.
We took a fundamentally new approach to handling chemical information. Four of the seven elements of that approach relied on recognizing the linguistic nature of chemistry.
First, as early as 1958 I recognized that a systematic chemical name could be algorithmically converted to a molecular formula and subsequently to a line notation.
Second, I realized that synthetic chemistry journals almost universally called out the new chemical compounds by providing their molecular formulas. Therefore it was not necessary to convert all compounds to systematic names in order to create a molecular formula index.
Third, Chemical Abstracts missed many intermediary compounds that were not identified by molecular formulas; by covering them, Index Chemicus provided a small but important added value. The focus on intermediates was essential for patent purposes.
Fourth, we recognized how essential the graphical presentation of structural information was to the specialist in organic chemistry.
A fifth and not the least advantage was rapid turnaround time. In those days Chemical Abstracts was quite late in producing its molecular formula indexes, and we took the revolutionary step of producing them first monthly and then weekly. We also received all foreign journals by airmail and stuck to a rigid production schedule.
Sixth, based on our experience with Current Contents and the steroid coding project for the Patent Office, we knew that Bradford’s law of scattering—which holds that the more one strives for complete coverage of the literature, the more journals one needs to examine at increasing distances from the center of a given field—applied in chemistry. But the converse, later called Garfield’s law of concentration, holds that covering the journals in the center is far more productive. To this day, fewer than 100 journals produce over 95% of the papers reporting new chemical compounds, even though hundreds more are scanned for them. Relying on this law of concentration allowed us to compete effectively with Chemical Abstracts and its coverage of thousands of journals.
A seventh advantage of Index Chemicus was its rapid coverage of foreign journals. For example, David Jordan, an American organic chemist, indexed all the airmailed Japanese journals weekly, at a time when Japanese scientists published most of their output in Japanese. (They now publish largely in English.) Chemical Abstracts was far behind because it relied on volunteers for all languages, including English. Its indexing system added further delays.
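The law of concentration mentioned in the sixth point can be illustrated with a short calculation. The Python sketch below is only an illustration under stated assumptions: the journal names and paper counts are invented, the function name `journals_needed` is hypothetical, and nothing here reproduces ISI's actual journal selection procedure. It ranks journals by the number of papers they contribute and reports how many top-ranked journals are needed to reach a target share of the total.

```python
from typing import Dict, List, Tuple


def journals_needed(paper_counts: Dict[str, int],
                    target_share: float) -> Tuple[int, List[str]]:
    """Return how many top-ranked journals reach target_share of all papers.

    paper_counts maps a journal name to its number of relevant papers.
    target_share is a fraction in (0, 1], for example 0.95 for 95%.
    """
    if not 0 < target_share <= 1:
        raise ValueError("target_share must be in (0, 1]")
    total = sum(paper_counts.values())
    if total == 0:
        raise ValueError("no papers counted")

    # Rank journals from most to least productive.
    ranked = sorted(paper_counts.items(), key=lambda item: item[1], reverse=True)

    covered = 0
    chosen: List[str] = []
    for name, count in ranked:
        covered += count
        chosen.append(name)
        if covered >= target_share * total:
            break
    return len(chosen), chosen


# Invented counts for illustration only; not data from the article.
hypothetical_counts = {
    "Journal A": 500, "Journal B": 300, "Journal C": 120,
    "Journal D": 40, "Journal E": 20, "Journal F": 10,
    "Journal G": 5, "Journal H": 3, "Journal I": 1, "Journal J": 1,
}

n, core = journals_needed(hypothetical_counts, 0.95)
print(n, core)  # 4 of the 10 invented journals reach the 95% target
```

With a skewed distribution like this invented one, a small core covers most of the output, while full coverage requires scanning every remaining journal for a handful of papers. That is the trade-off between Bradford's scattering and the law of concentration.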
From 1956 to 1960 I implored Chemical Abstracts to modify their approach to indexing and also to adopt citation indexing, in line with the points listed above—as a group in the Philadelphia section of the ACS, chaired by Max Gordon, had proposed. But they decided to continue on their traditional path. I nevertheless maintained nothing but the friendliest relations with the people at Chemical Abstracts, though I must say that I felt great animosity towards the National Science Foundation for preferentially supporting their work simply because they were nonprofit.
An integral part of this brief history is my work in chemical linguistics, which began officially at the University of Pennsylvania when I signed up in 1955 for a Ph.D. program. This culminated in my 1961 dissertation, "An Algorithm for Translating Chemical Names to Molecular Formulas," a topic recently referred to as "one of the earliest points of departure in computer handling of chemical information."
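To give a flavor of what translating a chemical name into a molecular formula involves, here is a toy Python sketch. It is not the dissertation's algorithm, which this article does not describe. It handles only a few unbranched, acyclic names made of one stem and one suffix, and the tables `STEMS` and `SUFFIXES` and the function `name_to_formula` are hypothetical names chosen for illustration. The idea shown is simply that a systematic name can be split into word parts, each of which contributes a known number of atoms.

```python
# Stem -> number of carbon atoms in the chain.
STEMS = {"meth": 1, "eth": 2, "prop": 3, "but": 4, "pent": 5, "hex": 6}

# Suffix -> (hydrogen offset relative to 2n, number of oxygen atoms).
SUFFIXES = {
    "ane": (2, 0),    # alkane,  CnH(2n+2)
    "ene": (0, 0),    # alkene,  CnH(2n), one double bond
    "yne": (-2, 0),   # alkyne,  CnH(2n-2), one triple bond
    "anol": (2, 1),   # alkanol, CnH(2n+2)O
}


def _term(symbol: str, count: int) -> str:
    """Format one element of a formula, omitting a count of 1."""
    if count == 0:
        return ""
    return symbol if count == 1 else f"{symbol}{count}"


def name_to_formula(name: str) -> str:
    """Translate a simple stem+suffix name into a molecular formula."""
    cleaned = name.strip().lower()
    # Try longer suffixes first so the most specific ending wins.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if not cleaned.endswith(suffix):
            continue
        stem = cleaned[: -len(suffix)]
        if stem not in STEMS:
            continue
        carbons = STEMS[stem]
        if suffix in ("ene", "yne") and carbons < 2:
            raise ValueError(f"{name!r} needs at least two carbon atoms")
        h_offset, oxygens = SUFFIXES[suffix]
        hydrogens = 2 * carbons + h_offset
        # Carbon first, hydrogen second, then oxygen.
        return _term("C", carbons) + _term("H", hydrogens) + _term("O", oxygens)
    raise ValueError(f"cannot parse {name!r} with this toy vocabulary")


print(name_to_formula("ethanol"))  # C2H6O
print(name_to_formula("propene"))  # C3H6
```

A name outside the small vocabulary, such as "benzene", raises a `ValueError` rather than returning a guess. Real nomenclature adds locants, branches, rings, and many more word parts.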
From the launch of Index Chemicus in 1960 to the present, ISI has introduced many innovative features and changes, with no support whatsoever from the government. Another ISI service was the Index Chemicus Registry System (ICRS), which pioneered the use of Wiswesser Line Notation (WLN), the system invented by William Wiswesser that uses the Roman alphabet and numerals to describe compounds in a linear notation that can be typed on an ordinary typewriter. WLN enabled substructure searching well before Chemical Abstracts launched such searching. Every major pharmaceutical company subscribed to ICRS and began using WLN for their internal files. ISI made substructure searching of the open literature a reality.
To be a pioneer is to struggle, and launching Index Chemicus was no exception, although the unique struggle to keep this product alive is not well known. Industry backed us with paid subscriptions and moral support. But even from the outset, the $2,000 annual commitment from each of the 12 sponsoring firms was far from sufficient to cover costs. What began as a simple molecular formula index that would have cost us about $25,000 per year to produce quickly evolved into a full-blown graphical abstracting service. I had underestimated how long it would take for other firms and universities to adopt this product. In the early 1960s our red ink caused all four of my vice presidents to leave and form a competing company, which eventually failed. Any sensible corporate executive in an established company would have agreed with them and abandoned Index Chemicus.
My obsessive attachment to Index Chemicus may be explained as foolish loyalty, commitment, and stubbornness, or as fear of failure. In the end, thanks to the income from Current Contents, Index Chemicus and Science Citation Index survived and are now part of an integrated system that includes citation indexing links to and from chemical literature and chemical patents.
Perhaps the best way to end this short history is to say that ISI’s chemical information services, including Index Chemicus, are alive and well—as are the competitive products that they inspired. As Emerson said, "Invention breeds invention," and perhaps one of the key legacies of ISI’s entry into chemical information is that our nontraditional approach accelerated change and prompted the development of a whole new generation of chemical information products and services.
A version of this article was presented at the American Society for Information Science (ASIS) Meeting on Historical Perspectives on Knowledge Dissemination, Washington, D.C., 1 November 1999.