Environmental health citation patterns: mapping the literature 2008-2010.

OBJECTIVE
This research seeks to understand the publication types and ages cited most often in environmental health literature and the most commonly cited journal titles.


METHODS
From the 43,896 items cited in Environmental Health Perspectives and the Journal of Environmental Health during 2008-2010, 2 random samples were drawn: First, 1,042 items representing all citations were analyzed with respect to publication type, age, and Internet link. Second, the cited journal name and citation age were recorded for 1,038 items culled from only citations to journal articles. All journal titles were classified into Bradford zones and assigned subject disciplines.


RESULTS
Journal articles (n = 891, 85.5%) were the most heavily cited publication type. Cited items' publication years ranged from 1951 to 2010. Close to half (49.1%) of all cited items were 5 or fewer years old when cited. Sixteen journal titles (3.9%) accounted for 32.5% of all cited journal articles. The 3 most common subject disciplines ("Public, Environmental & Occupational Health," "Environmental Sciences," and "Toxicology") accounted for 21.6% of all unique journal titles and 45.3% of all citations.


CONCLUSIONS
Environmental health citation patterns differ from other public health disciplines in terms of cited publication types, cited journals, and age of citations.


INTRODUCTION
Environmental health is ''the art and science of protecting against environmental factors that may adversely impact human health or the ecological balances essential to long-term human health and environmental quality'' [1]. Making up an estimated 10%-21% of the total public health workforce, environmental health professionals work in food safety, air and water quality, sanitation, toxicology, emergency preparedness, and other occupations [2]. Research literature relevant to environmental health practice falls within a number of disciplinary domains, including health and medicine, the natural and physical sciences, and engineering.
Previous analyses of the environmental health journal literature identified topic trends [3,4], the degree to which environmental health topics were being covered in medical journals [3], and the geographic distribution of environmental health research in Europe [4]. The current citation analysis focuses instead on the types of materials being cited in the environmental health journal literature. It follows up on the work of Rethlefsen and Wallis, who mapped the literature of public health more broadly in an analysis of citations from the American Journal of Public Health [5]. In addition to journals, books, and other published sources, the current study identifies commonly cited sources of gray literature in environmental health and the extent to which Internet sources are relied on in formal environmental health research. For librarians who support environmental health programs, this study contributes to an evidence base to guide collection management decisions in a time of reduced budgets and increasing subscription prices. For environmental health researchers and practitioners, this study sheds light on influences and potential biases in this realm of knowledge.

METHODOLOGY
The researchers first selected two environmental health journals, Environmental Health Perspectives (EHP) and the Journal of Environmental Health (JEH), for the study.
Supplemental Table 4 is available with the online version of this journal. Librarians with responsibility for collection development in environmental health should note that environmental health professionals cite materials in a large range of science, health sciences, and public health disciplines.
EHP has the third-highest impact factor in the ''Public, Environmental & Occupational Health'' category of the 2010 Journal Citation Reports (JCR) Science Edition and is the top environmental health-themed title in that category. In addition, EHP is the official journal of the National Institute of Environmental Health Sciences (NIEHS), one of the National Institutes of Health (NIH). Its mission is to ''reduce the burden of human illness and disability by understanding how the environment influences the development and progression of human disease'' [6]. The flagship NIEHS journal, EHP, is open access and has a Chinese-language counterpart. JEH is the official journal of the National Environmental Health Association (NEHA), an association of environmental health professionals, including sanitarians, food safety professionals, and hazardous substances specialists, amongst others. Whereas EHP is geared toward researchers and academics, JEH, ranked 117 in the ''Public, Environmental & Occupational Health'' category of the 2010 JCR Science Edition, is aimed at environmental health professionals working in state and local governments. Both titles are in the essential core for environmental health sciences in the Core Public Health Journals Project Version 2.0 [7].
The researchers used a standard protocol established by the Mapping the Literature of Public Health group for this citation analysis, hereafter referred to as the Mapping group [8]. This protocol reflects the base protocol developed by MLA's Nursing and Allied Health Resources Section (NAHRS) for their mapping the literature of allied health and nursing projects, with modifications first used by Rethlefsen and Wallis in their American Journal of Public Health citation patterns study [5,9].
All articles with bibliographical references during 2008-2010 from both EHP and JEH were considered for inclusion. In EHP, research, commentary, and reviews were included; all other materials, including correspondence and editorials, were excluded. In JEH, features; international perspectives; columns from the Centers for Disease Control and Prevention (CDC), Environmental Protection Agency, and Agency for Toxic Substances and Disease Registry; policy statements; review articles in the 2010 ''Inside the Profession'' column; special reports; and guest columns were included. All other article types, including correspondence, were excluded. The researchers excluded articles primarily to reduce potential citation bias, especially in correspondence and editorials. After manual review, a final total of 985 citing articles yielded 43,896 cited items, 37,477 (85.4%) of which were journal articles. Each cited item and each cited journal article were given a unique number for later identification.
Because this study had a dual purpose, to examine both the age and type of cited items and the major journals cited, the researchers drew 2 samples. The first sample came from the overall pool of 43,896 items and provided data on types of cited item and the age of cited items at publication. The second sample came from the pool of 37,477 cited journal articles and provided insight into the most commonly cited journal titles and the scope of cited disciplines. Using the Mapping group's protocol, the researchers used an online sample size generator to determine a sample size that had a 95% confidence level and a +/-3% confidence interval. For the first sample, 1,042 items from the 43,896 cited items pool were needed; for the second sample, 1,038 items from the 37,477 cited journal articles pool were required. To pull citations for each sample, an online random number generator was used. Because the 2 samples were not mutually exclusive, cited journal articles could appear in both.
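The sample sizes above can be checked with a short calculation. The study does not say which online generator was used, so the following is only a sketch that assumes the common formula for estimating a proportion with a finite population correction, a z value of 1.96 for 95% confidence, and the most conservative proportion of 0.5. The function names and the seeded `random.sample` draw are illustrative stand-ins, not the researchers' tools.

```python
import math
import random


def sample_size(population, z=1.96, margin=0.03, proportion=0.5):
    """Sample size for a proportion, with finite population correction (assumed formula)."""
    # Size needed if the pool of cited items were effectively infinite
    n0 = (z ** 2) * proportion * (1 - proportion) / margin ** 2
    # Shrink it to account for the finite pool
    return math.ceil(n0 / (1 + (n0 - 1) / population))


def draw_sample(population, seed=0):
    """Pick distinct item numbers from 1..population (stand-in for the online generator)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, population + 1), sample_size(population)))


print(sample_size(43896))  # pool of all cited items
print(sample_size(37477))  # pool of cited journal articles
```

Under these assumptions the function returns 1,042 and 1,038, which agrees with the sample sizes reported above.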
For each item in the cited item sample, publication type, cited item date, cited item age, and uniform resource locator (URL) (yes/no) data were collected in an Excel spreadsheet. Publication type was stratified into four categories: journal article, book, government document, and miscellaneous. Government documents included reports and major publications only; government serials were coded as journal articles, with the exception of monographic or statistical series such as the National Center for Health Statistics' Vital and Health Statistics Series. Public laws, the Code of Federal Regulations, or government web pages that could also be considered government publications were coded as miscellaneous. The miscellaneous category was also used for other web content, news items, meeting abstracts, and more. The researchers also chose to collect details beyond the standard protocol about the sources of miscellaneous, government document, and book items. Cited item age was calculated by subtracting the cited item publication date from the date of the journal article citing the item. Works in press and electronic prints ahead of publication were counted as same-year publications. Lastly, items with digital object identifiers (DOIs), with e-pages, and from online-only journals but lacking a URL in the citation were noted. Cited item age and publication type data were analyzed using JMP 9.0 statistical software for cross-tabulation and the chi-square statistic with a P-value threshold of 0.05.
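The age rule described above is simple enough to state as a function. This is a minimal sketch that assumes ages are computed from publication years; the function and parameter names are hypothetical and do not come from the protocol.

```python
def cited_item_age(citing_year, cited_year=None, in_press=False):
    """Age in years of a cited item at the time it was cited."""
    # Works in press and electronic prints ahead of publication
    # are treated as published in the same year as the citing article.
    if in_press:
        return 0
    return citing_year - cited_year


print(cited_item_age(2009, 2003))           # an item cited 6 years after publication
print(cited_item_age(2010, in_press=True))  # an in-press item counts as age 0
```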
For the cited journal articles sample, journal title, publication year, and publication age were collected for each cited article. Publication age was determined as in the cited items sample. The most recent journal title was used for all cited journal articles where possible. When a journal split into multiple parts after citation, such as the Journal of Toxicology and Environmental Health, which split into Parts A and B in 1998, the older title was used. Citations with obvious errors or without full journal title information were verified using PubMed, Google Scholar, and the Science Citation Index where necessary.
Journal titles identified in the cited journal articles sample were also analyzed in Bradford zones. The Bradford analysis is a standard methodology for identifying core journals in a field [10]. Journal titles are ranked in a list according to how frequently they are cited. The total list of all citations is divided into equal thirds, called zones. Titles falling within Zone 1 are the most frequently cited journals. Zones 2 and 3 titles are cited less frequently. According to Bradford's Law of Scattering, one can expect a relatively small number of highly influential journals in a field (Zone 1), a greater number of moderately influential journals (Zone 2), and an even greater number of marginally influential journals (Zone 3) [10]. This is a law of diminishing returns. A researcher has the most to gain from accessing Zone 1 titles. Access to Zones 2 and 3 titles is also beneficial, but progressively less so.
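The ranking and zoning steps can be sketched in a few lines. The journal names and counts below are invented for illustration, and the boundary rule (a title joins the zone in which its first citation falls) is one assumption among several possible ones; as the Results note, a boundary can land in the middle of a group of equally cited titles, and cut-offs may then have to be set by judgment.

```python
from collections import Counter


def bradford_zones(cited_titles):
    """Assign each journal title to Zone 1, 2, or 3 by cumulative citation thirds."""
    counts = Counter(cited_titles)
    total = sum(counts.values())
    # Rank by citation count (descending); break ties alphabetically
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    zones = {}
    cumulative = 0
    for title, n in ranked:
        # Zone is decided by how many citations precede this title in the ranking
        zones[title] = min(3, (3 * cumulative) // total + 1)
        cumulative += n
    return zones


# Hypothetical sample of 18 citations spread over 8 titles
sample = (["Journal A"] * 6 + ["Journal B"] * 3 + ["Journal C"] * 3
          + ["Journal D"] * 2 + ["Journal E", "Journal F", "Journal G", "Journal H"])
zones = bradford_zones(sample)
for zone in (1, 2, 3):
    print(zone, [title for title, z in zones.items() if z == zone])
```

In this invented example each zone holds 6 citations, but the zones contain 1, 2, and 5 titles respectively, which is the scattering pattern described above.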
Because the Mapping group has used a sample instead of the full data set, journal titles were also categorized into subjects using the assigned subjects for each title in the 2010 JCR Science Edition and, if no title entry existed in the JCR Science Edition, the JCR Social Science Edition. For recently ceased titles, older versions of the JCR Science Edition were consulted to determine an older subject discipline categorization where possible. For the thirty-five titles where no subject categorization existed in the JCR Science or Social Science editions, similar journals were used as patterns to assign subject disciplines. The number of total titles and total cited journal articles per category were counted.

RESULTS
Overall cited items sample (n=1,042)
As in the total cited items pool, journal articles (n=891) were the most heavily cited publication type (85.5% in the sample versus 85.4% in the total pool). Miscellaneous items were the next most cited publication type (n=61, 5.9%), followed closely by government publications (n=54, 5.2%) (Table 1). Books were the least commonly cited (n=36, 3.5%). Seventy-three items (7.0%) had a URL, though an additional 24 (2.3%) included a DOI or were e-only content, for a total of 9.3% of items (n=97) with some type of reference to an online presence.
Cited items' publication years ranged from 1951 to 2010, with a mean year of 2001, median of 2003, and mode of 2006. Close to half (49.1%) of all cited items were 5 or fewer years old when cited, though older materials, up to more than 50 years old, continued to be cited with some frequency. Figure 1 shows a classic long tail distribution of cited items' publication dates. Table 2 shows cited items' age by type of publication. Because some of the expected values in the cross-tabulation are small, a chi-square statistic is suspect. There does appear, however, to be a trend toward miscellaneous cited items being younger than other types of cited items. Books were the most uniformly cited across all 5 age categories, not showing the sharp drop-off after 10 or 15 years of age that the other categories did.
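The caution about the chi-square statistic rests on expected cell counts: a common rule of thumb asks for expected counts of at least 5 in most cells of a cross-tabulation. The sketch below shows how expected counts are derived from row and column totals. The observed counts are invented purely for illustration and are not the values in Table 2; the function name is hypothetical.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Invented counts: rows are two publication types, columns are three age bands
observed = [
    [90, 60, 30],  # a heavily cited type
    [6, 3, 1],     # a rarely cited type
]
expected = expected_counts(observed)
small_cells = sum(1 for row in expected for value in row if value < 5)
print(expected)
print(small_cells, "of", sum(len(row) for row in expected), "cells have an expected count below 5")
```

When a rarely cited publication type is spread over several age bands, as in this toy table, its expected counts quickly fall below 5, which is why the statistic is treated with caution.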
To get a better picture of the types of gray literature used by environmental health researchers and professionals, basic data about the source of government documents and miscellaneous materials were examined. The collected data showed a large range of agencies and organizations as publishers, and types of citations varied widely. Frequently cited government agencies included the Environmental Protection Agency; CDC and its subsidiaries, the Agency for Toxic Substances and Disease Registry and the National Center for Health Statistics; National Institute for Occupational Safety and Health; Food and Drug Administration; US Geological Survey; and the National Oceanic and Atmospheric Administration, amongst others. Other organizations with frequent citations included the Health Effects Institute and the World Health Organization. Many local and international data sets and governmental reports were also cited. Federal laws and regulations, such as the Occupational Safety and Health Act of 1970, made up many of the miscellaneous citations.
Cited journal articles sample (n=1,038)
From the cited journal articles sample (n=1,038), 3 items previously erroneously identified as journal titles were excluded from the final analysis after researching the titles. The remaining 1,035 items from the cited journal articles sample yielded 409 unique journal titles. Sixteen journal titles (3.9%) accounted for approximately the top third (32.5%) of all cited journal articles. These could be considered the Zone 1 titles in a Bradford analysis (Table 3). Because this study used a sample, drawing concrete conclusions about the relative importance of Zone 2 and Zone 3 titles is difficult. For instance, Zone 2 could end in the midst of journal titles cited 2 times; arbitrarily, the researchers chose to consider Zone 2 as titles cited between 3 and 9 times, accounting for the next 30.0% of all citations (Table 4, online only). Zones 1 and 2 consist, then, of 62.5% of the cited journal articles.
Even within the top Zone 1 cited titles, journals' disciplines varied. Environmental sciences, public health, general medicine, toxicology, and even basic sciences were represented. This same trend occurred when examining journals' disciplines across the full range of cited titles. Utilizing all Science Citation Index (SCI) and Social Sciences Citation Index (SSCI) subject discipline categorizations across all 409 cited journal titles, 98 different subject disciplines were represented. The 3 most common subject disciplines, both by number of individual journal titles cited and by number of articles cited within those titles, were "Public, Environmental & Occupational Health" (n of titles=50 (7.9%); n of articles=320 (19.1%)); "Environmental Sciences" (n of titles=46 (7.3%); n of articles=278 (16.6%)); and "Toxicology" (n of titles=40 (6.4%); n of articles=161 (9.6%)). These 3 subject disciplines accounted for 21.6% of all unique journal titles and 45.3% of all citations.
Several subject disciplines had a large number of unique journal titles cited, but not a correspondingly large number of total citations. For example, 10 "Psychology" journals were cited, but those 10 titles together accounted for only 11 cited articles. Table 5 shows the top 15 most cited subject disciplines by frequency of article citations.
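Because a journal can carry more than one subject category, the per-discipline tallies count some titles and articles more than once. The sketch below illustrates that counting with placeholder journals; the names, category assignments, and counts are invented and do not reflect any real title's classification.

```python
from collections import Counter


def tally_by_category(journals):
    """Count titles and cited articles per subject category, once per assigned category."""
    titles = Counter()
    articles = Counter()
    for info in journals.values():
        for category in info["categories"]:
            titles[category] += 1
            articles[category] += info["cited_articles"]
    return titles, articles


# Placeholder journals with invented category assignments
journals = {
    "Journal A": {"categories": ["Toxicology", "Environmental Sciences"], "cited_articles": 12},
    "Journal B": {"categories": ["Toxicology"], "cited_articles": 3},
    "Journal C": {"categories": ["Psychology"], "cited_articles": 1},
}
titles, articles = tally_by_category(journals)
print(dict(titles))
print(dict(articles))
print(sum(titles.values()), "title assignments for", len(journals), "journals")
```

In this toy data, 3 journals produce 4 title assignments and 16 cited articles produce 28 article assignments, so category percentages are best read against the total number of assignments rather than the number of unique titles or articles.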

DISCUSSION
Cited publication types
Environmental health citation patterns appear to differ somewhat from other public health disciplines, particularly in terms of the ratio of serials to other items cited. In this study, 85.5% of citations were to journal articles. This is far more comparable with previous studies of hard sciences and health sciences disciplines, which have shown that anywhere from 77.0% (polymer science) to 93.0% (biomedical research and clinical medicine) of citations are to journal articles [11,12]. The greatest similarity lies with chemistry citation patterns (87.0% journal article citations in 2000) [12], overall natural science and engineering citation patterns (82.0%) [12], and neurosciences, another multidisciplinary field (82.0%) [13]. It should be noted, however, that Lariviere et al.'s study [12] relied solely on SCI, SSCI, and Arts and Humanities Citation Index data, which have been shown to underrepresent web and other miscellaneous citations [5]. Regardless, the percentage of cited journal articles is considerably higher than in public health in general and other public health specialties, which have reported 63.0%-66.0% of citations from journal articles [5,14-17]. It is possible, though, that the general trend over time is toward increasing reliance on journal articles. Burtis and Taylor observed such a trend in the health education specialty. Their 2010 estimate of the proportion of journal articles [17].
Citations to other publication types were fairly evenly spread between miscellaneous, books, and government documents in this study. Counting both miscellaneous and government document categories together, gray literature accounts for more than 11% of citations. Though this is not as high a percentage as in other public health citation analyses, it still shows that librarians working with environmental health practitioners and researchers should be familiar with some of the major sources of gray literature in this discipline. As shown above, many government agencies, in particular, create reports and other materials relevant to environmental health topics. Over 9% of environmental health citations contained URLs or DOIs or were to e-only publications, a slightly higher percentage than was reported for public health in general for 2003-2005 [5], perhaps reflecting an increase in the popularity of the DOI for citation purposes.

Age of citations
Nearly half of all citations were 5 or fewer years old and slightly more than three-quarters were 10 or fewer years old, repeating a pattern shown in other citation analyses. Rethlefsen and Wallis found that 50.0% of all citations fell within the previous 5 years [5]; Schloman found that 66.2% of all citations were 9 or fewer years old (a sliding age scale was not used in that study, so the real percentage may be higher) [16]; and Rethlefsen found that 83.0% of citations were 10 or fewer years old [15], with a peak in citation between 2 and 5 years after publication. Small numbers of non-journal articles cited in this study, however, make comparisons of cited age by publication type with other studies suspect. Here, miscellaneous citations have the highest concentration of newer citations, and books did not tend to be significantly older than other materials; previous studies found books to be the oldest cited materials [5,16]. Two-thirds of citations to books and government documents were 10 or fewer years old, compared to a study in a similar discipline, toxicology, which found only 58.0% of book and government document citations cited within 9 years [19]. Likewise, two-thirds of journal articles cited were 10 or fewer years old.

Cited journals
A series of recent citation analyses of the Archives of Environmental and Occupational Health (formerly, Archives of Environmental Health), a Zone 1 title, gives some additional insight into citation patterns in environmental health and its related discipline, occupational health. As part of a larger analysis, Smith traced how the 5 most popularly cited articles published in the journal from 1975 to 2005 were cited over time [20]. For all 5 articles, citations peaked 8 years after publication, and most citations were received within the first 10-15 years. Individual articles' citedness varied widely, however; the 4 articles falling more squarely into environmental health received between 44.8% and 77.0% of all citations within 10 years of publication. The more occupational health-themed article received only 12.6% of total citations within 10 years [20]. Looking at the 7 most popularly cited articles from 1961 to 1974, Smith found that only 1 of the articles had more than 50% of its citations within 10 years and that maximum citation density occurred between 10 and 15 years [21]. Interestingly, in tandem with the current study, this finding points to a heavier reliance on more current materials now than in the past. Older materials remain important, however. Libraries wishing to limit holdings by date should keep at least 15 years' worth of environmental health titles to ensure that most needed articles are available. This may not be as critical in the electronic age, where most current subscriptions to online journals would generally include the past 15 years of content. Purchasing back files may only be necessary for research-heavy institutions.
Conforming to Bradford's Law of Scattering, a very few journals (n=16, 3.9%) in this study account for a third of all citations. This is similar to studies in overall public health [5,15] and public health-related disciplines, including tropical medicine [22], occupational health [23], and epidemiology [24]. Though few journals represent most citations and thus few journal titles may satisfy the needs of most environmental health collections in libraries, the number of cited journals (n=409) shows how far the literature of environmental health disperses. This dispersion may be increasing over time. Lariviere et al. note that the concentration of citations is decreasing over time; for example, in 2005, 33% of journal articles in medical disciplines accounted for 80% of all citations, versus 24% in 1990 [25].

Disciplines
Also similar to previous studies was the wide variety of disciplines represented by the cited journal articles. Gehanno and Thirion found that more than half of all occupational health journals were in medical or medical specialty titles [23]. Looking at epidemiology, Hasbrouck et al. also found that over half came from clinical medicine titles, noting that general medicine and oncology titles made up the bulk of those citations [24]. Environmental health, like occupational health and epidemiology, is a public health discipline shown to pull from a wide range of sources. Tarkowski, examining Europe's environmental health and occupational health research output, found that 711 journals published European environmental health research between 1995 and 2005, primarily focused on environmental exposures, pollution, and environmentally caused illnesses [4]. In 1992, McCunney et al. found that general medical journals increased their publication of environmental health articles from 1975-1990, particularly noticing an upswing in topics like radioactivity, water pollutants, and food contamination [3]. Indeed, in this study, 45% of citations were to natural and physical sciences journals, and 32% were to medicine and medical specialty journals. Because all subject designations listed in the SCI and SSCI were counted for each title, there was substantial overlap. Regardless, both medicine and natural sciences play an important part in the study of environmental health.
Here, Zone 1 titles include a general medical title (Lancet) and general science title (Proceedings of the National Academy of Sciences of the United States of America [PNAS]), as well as several public health titles without a specific environmental health focus. However, nearly all journals in Zone 1 are unique to environmental health when compared to previously published public health citation analyses [5]. Only three titles (American Journal of Epidemiology, American Journal of Public Health, and Lancet) appear on other Zone 1 lists. The Zone 1 list corresponds fairly closely with the Essential Core in the Core Public Health Journals Project's Environmental Health Sciences 2.0 list. Eight titles appear on both Zone 1 and the Essential Core list, with an additional three Zone 1 titles listed in the Research Level Core [7]. Not represented in either the Essential or Research Level Cores, but represented in Zone 1, are Environmental Science & Technology, here the fourth most commonly cited title; American Journal of Epidemiology; PNAS; Lancet; and American Journal of Public Health. All Zone 1 titles are fully indexed in MEDLINE, except for PNAS and Science of the Total Environment, both of which are selectively indexed in MEDLINE. PubMed picks up many additional citations from both titles through PubMed Central.
One Zone 1 title, Mutation Research, poses an interesting problem for collection development. Currently, a Mutation Research set subscription has four parts, but each individual part can be purchased separately by a library. PubMed, however, lists all articles from three of the parts (excluding DNA Repair) under the generic title, Mutation Research. Therefore, articles from both the older, pre-split version of Mutation Research and its successors are all cited in the same way, making it hard to determine which Mutation Research subscription should be purchased. It was beyond the scope of this study to determine from which part each cited article came, but the Core Public Health Journals Project's Environmental Health Sciences Essential Core recommends Mutation Research: Genetic Toxicology and Environmental Mutagenesis [7].

Limitations
One of the major limitations of this study is inherent in all citation analyses that try to establish core lists of journals for a specialty. As noted by Smith during his search for a "core" list of occupational health journals, no single method of establishing a core journals list can exist in isolation [26]. Authors have proposed many different methods: use of expert-reviewed lists, citation analyses using major citation index data, citation analysis by hand as in this study, in-house journal use, cost per library use, tracing of citation classics, and one of the most frequently used, impact factors or other ranking systems like SCImago [19,26,27]. Even similar methods may produce different results, especially when using expert review. For instance, one study of prevention research centers found that researchers named ninety-nine different journals as influential in their field, thirty-four of which were named more than once [28]. Smith looked at ten different studies with "core" occupational health journals and found that only four titles overlapped on every study [26]. Indeed, one wonders if core journals really exist [26]; they certainly cannot be evaluated in a vacuum.
Using sampling instead of the complete list of citations has its own challenges. Though it is an established methodology [5,29,30], the smaller number of citations makes delineations between Bradford zones tenuous, particularly between Zones 2 and 3. In this study, the researchers tried to overcome this by emphasizing the important subject disciplines for environmental health, as well as including Zones 1 and 2 lists. Nevertheless, it is possible that in a comprehensive study, journal titles may emerge in different zones. The mapping methodology, because it begins with one to five journals from which the citations are drawn, is by itself a sample of the representative literature in any given field. Using multiple methods for collection development, such as the Core Public Health Journals Project's expert-reviewed lists and local usage analyses, is important [31].
One of the journals analyzed in this study, EHP, has been published as an open access journal since 2004. It is uncertain whether or not EHP's open access status has any bearing on its influence in the field of environmental health. However, open access articles in the sciences have shown a consistent advantage over non-open access articles in terms of how often they are cited. Gargouri et al. found that the citation advantage could not be explained away by authors and journals making only the best quality articles available through open access. In fact, the citation advantage persisted whether an article's open access status was an individual decision or due to a policy mandate, such as that indicated by the NIH public access policy [32]. In the future, researchers should consider adapting the literature mapping methodology used here or in the original NAHRS protocol [9] to better assess the impact of open access publishing and public access policies on citation trends.

CONCLUSION
Environmental health, like the larger field of public health, spans a wealth of different disciplines. Unlike public health and many of its other subdisciplines with previously published bibliometric analyses, environmental health depends heavily on both the literature of the natural and physical sciences and medicine. This is especially noticeable in the far greater ratio of journal articles to other types of materials cited, which is quite similar to chemistry and other sciences. The subject disciplines represented by the cited journal titles further demonstrate this pattern: nearly half of cited articles came from titles classified as, or cross-listed as, natural and physical sciences titles. The wide range of disciplines represents the multipronged nature of environmental health: the study of the physical environment (air, soil, water); its hazards and pollutants; and of course, the impact of both the physical environment and its hazards and pollutants on the human body's diverse systems.
Because the percentage of non-journal articles cited is relatively small, and even more so because a large percentage of those types of publications are freely available governmental publications, librarians supporting environmental health practitioners and researchers may wish to focus spending on serials. Monographs are very little used by this group. Librarians should, however, keep up to date with major publications from environmental health-related government agencies, both national and local, as well as consider cataloging important online resources and other gray literature.
To the researchers' knowledge, this is the only citation analysis done in the area of environmental health. Though environmental health literature has been studied previously, the research has been to determine productivity and reach [3,4]. It has also been studied in tandem with occupational health [20,33], which is a related, but distinct, discipline. For evidence of this, one need only look at the ''core'' journals established in Smith's two-part study of occupational health and medicine journals; only one of the five titles established as core to occupational health using multiple methodologies, Occupational and Environmental Medicine, is represented on the Zone 1 list for environmental health [26,27]. This study adds to the body of literature on public health information use and its subdisciplines. Environmental health literature reflects quite different patterns of information use than other areas of public health practice and research.