Access of primary and secondary literature by health personnel in an academic health center: implications for open access

Purpose: The research sought to ascertain the types and quantity of research evidence accessed by health personnel through PubMed and UpToDate in a university medical center over the course of a year in order to better estimate the impact that increasing levels of open access to biomedical research can be expected to have on clinical practice in the years ahead. Methods: Web log data were gathered from the 5,042 health personnel working in the Stanford University Hospitals (SUH) during 2011. Data were analyzed for access to the primary literature (abstracts and full-text) through PubMed and UpToDate and to the secondary literature, represented by UpToDate (research summaries), to establish the frequency and nature of literature consulted. Results: In 2011, SUH health personnel accessed 81,851 primary literature articles and visited UpToDate 110,336 times. Almost a third of the articles (24,529) accessed were reviews. Twenty percent (16,187) of the articles viewed were published in 2011. Conclusion: When it is available, health personnel in a clinical care setting frequently access the primary literature. While further studies are needed, this preliminary finding speaks to the value of the National Institutes of Health public access policy and the need for medical librarians and educators to prepare health personnel for increasing public access to medical research.


Implications
• Policy revision is necessitated, as the NIH public access policy's current one-year embargo is incongruent with health professionals' access of information.
• Increasing public access to biomedical research will lead to more frequent consultation of evidence among health personnel.
• Medical librarians and educators need to prepare health personnel to work in clinical environments with increased access to research.

INTRODUCTION
The prospect that new technologies and policies will allow health personnel to increase the use of research evidence in clinical practice creates the potential for improving the health care of individuals and populations [1,2] and facilitating lifelong learning for health personnel [3,4]. In the last decade, an international movement to increase free online access to journals across all disciplines, including medicine, has made a greater proportion of the scholarly literature freely available outside of universities and research institutions. Today, roughly 20% of journal articles are freely available within a year of publication [5]. Among the factors helping to increase access has been the creation and growth of open access scholarly publishers, such as BioMedCentral and Public Library of Science (PLoS). From Nature to the New England Journal of Medicine, major publishers are now also exploring open access options [6,7]. Additionally, open access policies that require authors to post public copies of their publications have been instituted by universities, such as the University of California, San Francisco [8], and by major sources of research funds in the United States, United Kingdom, Canada, and Europe [9][10][11][12]. In 2008, the US National Institutes of Health (NIH) public access policy was enacted to ensure that the public, including health personnel, have access to NIH-funded research results within one year of publication [13]. The open access copies (which must be peer-reviewed final drafts, if not the published version) are deposited in PMC (previously known as PubMed Central), an archive operated by the US National Library of Medicine (NLM). PMC currently contains 2.5 million articles and is accessed by more than 700,000 users daily [14]. In 2011, the manuscripts of 250,000 publications were deposited in PMC [15], more than triple earlier annual deposit estimates [16,17]. The NIH also reports a 75% policy compliance rate on behalf of current NIH-funded authors and their publishers [15]. All indicators point toward a future, if still a good number of years off, in which the vast majority of materials in scholarly biomedical journals will be publicly available.
Given that these various efforts to increase free access to biomedical research have affected only a small portion of the literature to date, it is difficult to understand what universal free access to the published literature would mean for clinical practice [18,19]. The lack of data to suggest the extent to which health personnel will actually take advantage of universal free access to scholarly journal articles poses a challenge for librarians and educators as they try to prepare for this new environment. Additionally, it is hard to assess and defend the NIH public access policy without a means to judge the value of open access that it and other policies increasingly provide.
In general, studies of the information-seeking behavior of health personnel tend to use self-reported data. For example, researchers have reported that health personnel use a range of information resources [20], with a strong preference for PubMed and UpToDate [21], as well as Google [22]. Additionally, studies report that health care providers have a strong predilection for point-of-care resources to provide pre-appraised information [23,24]. Using a combination of survey and interview data, O'Keefe has reported similar findings and a desire by physicians to have access to primary research evidence for patient care [25].
Using a more objective approach, researchers have investigated direct use of PubMed via web log analysis. For example, Herskovic and Dogan each studied PubMed and described users' search queries, citation hit counts, and views of abstracts [26,27]. Both of these studies found high use of PubMed and provided valuable data related to user search patterns; however, they provided little or no data related to the types of literature accessed (e.g., review articles, clinical trials, etc.). Herskovic concluded that users' PubMed search habits resemble general web search strategies but acknowledged, as a limitation of his study, that he was unable to track click-throughs to article results in PubMed [26]. Dogan noted that the size of a set of results influences the decision of whether or not to click on results [27]. Although valuable, these findings relate to use of only a single information resource, PubMed, and thus might have limited value in the health care environment where, as noted above, a wide variety of information resources are typically used [20][21][22][23][24], including secondary literature resources such as UpToDate [2]. Additionally, these results are based on studies of short-term behavior, such as searches in a single day or month, and do not restrict results to health professionals [26,27]. Most importantly, these studies do not provide details on use of the primary literature; they do not identify article publication types, publication dates, and sources accessed. These data would be of considerable value for librarians in selecting materials for their collections and for policy makers prioritizing materials for public access.
The study reported here seeks to improve understanding of primary literature accessed by those engaged in clinical care. It examines the annual access patterns of health personnel in a university medical center where the biomedical research literature indexed in PubMed, as well as the secondary literature resource UpToDate, is easily accessible. While an academic medical center represents a special case of a clinical environment, one likely to be at the high end of expected clinical use of research, it nonetheless offers a useful starting point in assessing the extent to which clinicians access the research literature when it is readily available. Four questions were the focus of this study: (1) With what frequency do health personnel find and view abstracts and articles when they have relatively complete access to this literature? (2) [...]
[...] approximately 100 Stanford medical students were undertaking clinical clerkships at SUH during the period in which data were gathered. Non-health personnel, such as patients, are also present in SUMC hospitals and can request access to the SUMC network; however, this group's use of the network, and more specifically the studied resources, was believed to be negligible. Although the data collection techniques did not permit the access patterns of specific individuals to be identified, it seemed reasonable to conclude that health personnel were the primary users of biomedical information in the hospitals.
Individuals at SUH can gain access to the primary literature through Stanford's Lane Medical Library and Knowledge Management Center. To gauge the types and frequency of research literature accessed, this study collected and stored anonymized web logs of traffic that was generated physically within the hospitals on the Stanford University network and that passed through Lane Library online systems from January 1, 2011, through December 31, 2011. Data were collected for use of Lane's online systems on hospital computers (both desktops and laptops), which are available throughout both facilities and are connected to the Stanford network. Use on mobile devices within the hospitals, use from outside SUH, and use that bypassed the Lane Library website (in particular, direct access to PubMed) were not included.
While on the Stanford network at SUH, individuals are able to connect with Lane Library's electronic resources from Lane Library's web pages, which include several clinically focused specialty information resource portals, and from within SUH's electronic health record system. All of these entry points to Lane Library resources were tracked for the study. Since the focus of the study was access to online resources, no attempt was made to measure access to print literature or to track efforts to obtain unavailable full text of primary literature by using services such as interlibrary loan.

Information resources evaluated
PubMed and UpToDate were selected for study based on their prevalence in clinical care [30,31], their ability to provide health personnel with research evidence, and their high level of use at SUH. PubMed is a free search interface providing access to MEDLINE, NLM's premier bibliographic database, and containing references to more than 21 million articles in the life sciences, with a concentration on biomedicine [32]. In 2009, PubMed was searched over 600,000,000 times [33]. UpToDate provides clinically focused biomedical research summaries that are written by physicians and referenced with links to primary research articles. UpToDate has more than 600,000 users worldwide [34] and is available in 17% of US hospitals [35]. These resources are heavily used at Stanford as well. For example, in 2011 UpToDate was accessed more than 100,000 times, whereas Clin-eGuide, Clinical Evidence, and Five-Minute Clinical Consult were only accessed approximately 2,000 times combined. Traffic to the Google and Google Scholar search tools via the Lane Library was also initially selected for review, because of the general popularity of these resources [36]. However, traffic reported through the library web logs for these 2 resources was minimal; it accounted for less than 2% of total visits and clicks to primary literature. This may be because health personnel can access these resources directly rather than going through the library's website.

Measures
UpToDate and PubMed both use a unique identifier (the PubMed ID, or PMID) to link out to primary literature. Using this identifier for articles retrieved via PubMed and UpToDate, we were able to identify detailed bibliographic information for each article accessed, including year of publication, publication type, journal title, and article title (Figure 1). By identifying each article's publication types, we were also able to determine governmental funding of an article when that information was available. Article information was extracted using the National Center for Biotechnology Information's Entrez programming utilities [37] on March 15, 2012, when the majority of articles accessed would most likely be indexed for MEDLINE.
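As an illustration of the kind of metadata extraction described above, the following minimal sketch parses a record shaped like the PubMed XML that the Entrez utilities return and pulls out the fields named in the text. It is not the study's actual code. The sample record, the function name `parse_articles`, and the rule of flagging funding from "Research Support" publication types are assumptions made for illustration; no network request is made.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written record in the general shape of PubMed XML.
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>12345678</PMID>
      <Article>
        <Journal>
          <JournalIssue><PubDate><Year>2011</Year></PubDate></JournalIssue>
          <Title>Example Journal of Medicine</Title>
        </Journal>
        <ArticleTitle>An example trial</ArticleTitle>
        <PublicationTypeList>
          <PublicationType>Journal Article</PublicationType>
          <PublicationType>Randomized Controlled Trial</PublicationType>
          <PublicationType>Research Support, N.I.H., Extramural</PublicationType>
        </PublicationTypeList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""

NIH_PREFIX = "Research Support, N.I.H."
US_GOV_PREFIX = "Research Support, U.S. Gov't"


def parse_articles(xml_text):
    """Return one dict of bibliographic fields per PubmedArticle element."""
    root = ET.fromstring(xml_text)
    records = []
    for article in root.iter("PubmedArticle"):
        types = [pt.text or "" for pt in article.iter("PublicationType")]
        nih = any(t.startswith(NIH_PREFIX) for t in types)
        other_gov = any(t.startswith(US_GOV_PREFIX) for t in types)
        records.append({
            "pmid": article.findtext("MedlineCitation/PMID"),
            "year": article.findtext(".//PubDate/Year"),
            "journal": article.findtext(".//Journal/Title"),
            "title": article.findtext(".//ArticleTitle"),
            "publication_types": types,
            # Assumption: funding is inferred only from publication types.
            "nih_funded": nih,
            "us_gov_funded": nih or other_gov,
        })
    return records


if __name__ == "__main__":
    for record in parse_articles(SAMPLE_XML):
        print(record)
```

Because funding is inferred from indexing terms alone, an article that has not yet been fully indexed would appear unfunded in a sketch like this, which is consistent with the decision described above to extract metadata some months after the access period ended.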
Usage data were collected via a proxy server, which permitted the isolation of hospital traffic. The proxy server recorded a user's session in a standard web log format for later processing and analysis. These web logs contained a unique identifier for each session, but no user-specific identification. Information requests, filtered from proxy logs, generated the following data: a timestamp, a unique session string, the requested resource, and referring page. Using this information, overall use of PubMed and UpToDate could be tracked using the appropriate entrance uniform resource locator (URL) associated with a session and a timestamp. Session data collection ended when a participant closed the browser or the browser was idle for more than sixty minutes. All data extracted from proxy logs were stored in a relational database for reporting purposes (footnote 1).
Three measures were used to determine use and access: (1) visits to PubMed and UpToDate, (2) views of abstracts, and (3) views of primary literature articles. A "visit" was defined as a user clicking on a link to enter PubMed or UpToDate. Multiple clicks to enter the same resource within a single user session were counted as a single visit. An abstract "view" was defined as a click in PubMed of an article title and in UpToDate as a click on a reference link. A "view" of a primary literature article was defined as a click on a full-text link. In UpToDate, full-text links to literature were labeled "Check for full text availability" or "PubMed." In PubMed, links to literature took the form of a publisher icon or an OpenURL link resolver button located beside the abstract view of the article. In both PubMed and UpToDate, literature links contain PMIDs, which can be used to identify the specific article accessed. Multiple clicks to the same article within a single user session were counted as only a single view, as is commonly accepted in web log analysis [38].
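The counting rules above (repeat entries and repeat clicks within one session count once) can be sketched as follows. This is a simplified illustration, not the study's pipeline: the record layout `(session_id, event, target)`, the event labels, and the function name `count_measures` are hypothetical, and session boundaries (including the sixty-minute idle rule) are assumed to have been resolved already when session identifiers were assigned.

```python
from collections import Counter

# Hypothetical, simplified log records: (session_id, event, target).
# "enter" targets are resource names; other targets are PMIDs.
LOG = [
    ("s1", "enter", "PubMed"),
    ("s1", "enter", "PubMed"),     # repeat entry in the same session
    ("s1", "abstract", "111"),
    ("s1", "fulltext", "111"),
    ("s1", "fulltext", "111"),     # repeat click on the same article
    ("s2", "enter", "UpToDate"),
    ("s2", "abstract", "222"),
    ("s2", "fulltext", "111"),     # same article, different session
]


def count_measures(records):
    """Count visits and views, collapsing repeats within a session."""
    unique = set(records)  # one entry per (session, event, target)
    totals = Counter(event for _, event, _ in unique)
    return {
        "visits": totals["enter"],
        "abstract_views": totals["abstract"],
        "article_views": totals["fulltext"],
    }


if __name__ == "__main__":
    print(count_measures(LOG))
```

Deduplicating on the full `(session, event, target)` triple means the same article viewed in two different sessions still counts twice, which matches the per-session definition given in the text.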

Analysis
We compiled descriptive statistics (views, publication date, publication type, journal title, and article title) of the articles accessed.

RESULTS
In 2011, SUH health personnel visited UpToDate and PubMed a total of 157,580 times (Table 2). UpToDate was visited 110,336 times, more than twice as often as PubMed (47,244 visits). A visit to PubMed typically led to viewing 2.69 abstracts and 1.68 research articles. Visits to UpToDate led to clicking through its hyperlinked bibliographies to the primary research literature only 2,474 times, accounting for only 3% of the 81,851 research articles viewed during the year from these 2 platforms. In some cases, articles were viewed more than once. For example, 6 articles had 20 or more viewings. The most commonly viewed article (34 views) was an early 2011 release of a case report on heart disease [39], which was available from Elsevier's ScienceDirect website to non-subscribers for $31.50. Only 1 of these 6 articles (17%), a clinical trial (21 views) [40], was freely available.
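The ratios in this paragraph follow from the reported counts, as the short check below shows. The variable names are illustrative. The per-visit figure assumes that every article view not reached through UpToDate came from a PubMed visit, which is an assumption of this sketch rather than a statement from the study.

```python
# Counts reported in the text for 2011.
pubmed_visits = 47_244
uptodate_visits = 110_336
article_views = 81_851
uptodate_clickthroughs = 2_474

total_visits = pubmed_visits + uptodate_visits
visit_ratio = uptodate_visits / pubmed_visits
uptodate_share = uptodate_clickthroughs / article_views

# Assumption: all remaining article views originated from PubMed visits.
pubmed_article_views = article_views - uptodate_clickthroughs
articles_per_pubmed_visit = pubmed_article_views / pubmed_visits

if __name__ == "__main__":
    print(f"total visits: {total_visits:,}")
    print(f"UpToDate-to-PubMed visit ratio: {visit_ratio:.2f}")
    print(f"UpToDate share of article views: {uptodate_share:.1%}")
    print(f"articles per PubMed visit: {articles_per_pubmed_visit:.2f}")
```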
Among the viewed articles, the most popular type was the review, which summarizes and draws conclusions from a wide range of research on a single topic (Table 3). We examined views of US government-funded articles for articles published between 2008 (the start date of the NIH public access policy) and 2011. SUH personnel viewed 6,917 articles funded by US government agencies and published in this time period, or 8% of all articles viewed. Of these, 5,763 (7% of all articles viewed) were funded by NIH, and 1,154 (1%) were funded not by NIH but by other government agencies, such as the Departments of Education, Energy, Defense, and Justice (Table 5).
The vast majority of viewed articles were published within the past decade, with 16,187 (20%) published in 2011, the year of the study (Table 6). Articles were accessed from 4,967 journals. Thirty-nine journal titles accounted for the first quartile of article views, 137 titles for the second quartile, 375 titles for the third quartile, and 4,407 titles for the fourth quartile. Table 7 (online only) lists the first quartile of journals.
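One way to derive such a quartile breakdown is to rank journals by views and cut the cumulative total into four equal parts. The sketch below does so on made-up counts. The function name `quartile_title_counts` and the rule for a journal that straddles a boundary (it is assigned to the quartile in which its views begin) are assumptions of this illustration, not the study's documented procedure.

```python
def quartile_title_counts(views_by_journal):
    """Return the number of journal titles in each quartile of total views.

    Journals are ranked from most to least viewed. Each journal is placed
    in the quartile where its first view falls in the cumulative ranking.
    """
    counts = sorted(views_by_journal.values(), reverse=True)
    total = sum(counts)
    titles_per_quartile = [0, 0, 0, 0]
    if total == 0:
        return titles_per_quartile
    cumulative = 0
    for views in counts:
        quartile = min(3, (4 * cumulative) // total)
        titles_per_quartile[quartile] += 1
        cumulative += views
    return titles_per_quartile


# Hypothetical view counts for eight journals (100 views in total).
EXAMPLE_VIEWS = {
    "A": 40, "B": 20, "C": 15, "D": 10,
    "E": 5, "F": 4, "G": 3, "H": 3,
}

if __name__ == "__main__":
    print(quartile_title_counts(EXAMPLE_VIEWS))
```

In this toy data a single heavily used title fills the first quartile while five lightly used titles share the last, the same long-tailed shape reported above.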

DISCUSSION
Despite the frequently mentioned barriers to accessing evidence in clinical settings, namely lack of time [4] and poor information retrieval skills [41], SUH health personnel took frequent advantage of the published literature. Although UpToDate was used more frequently than PubMed, which is consistent with previous studies [25,42,43], health personnel did access primary literature using both PubMed and UpToDate, indicating that UpToDate alone is not sufficient to support information seeking for patient care. Of course, because of their inherently different content and structure, a visit to UpToDate cannot be directly compared with a view of a scholarly article. This study attests, however, to the degree to which both UpToDate and the primary literature form integral parts of clinical practice among health personnel. It indicates that health personnel find having access to both primary and secondary sources of considerable value, which in turn has implications for library collection development and the training of health personnel in information literacy skills.
Footnote 1: This study focused on describing access to primary and secondary information resources and did not attempt to measure the utility of the information accessed or its application to patient care.
SUH health personnel accessed 57 of the 151 publication types assigned by NLM, suggesting that access to a wide range of publication types is desirable in patient care. Twenty-three percent of the views were to article types associated with the practice of evidence-based medicine (EBM) (clinical trials, comparative studies, and meta-analyses), indicating that article selection might be influenced by the desire to practice EBM. EBM is the "conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients" [44] and is linked with the promotion of individualized care and best practices [1]. Current best evidence has been defined as "clinically relevant research, often from the basic sciences of medicine, but especially from patient-centered clinical research" [44], making the use of clinical trials, particularly randomized controlled trials, a popular choice for practicing EBM [45].
More than 10,000 clinical trials were accessed, with close to 50% (4,753) of those qualifying as the gold-standard randomized controlled trials. Clinical trials allow the reader to home in on a single factor, whether in treatment or patient characteristics. In 2011, NIH invested $3.5 billion to fund clinical trials research [46,47]. Clearly, the value of these trials to health personnel, and thus the public at large, is increased as this research is accessed in clinical settings.
The preponderance of review studies among the viewed articles (29% of the articles selected) suggests that clinicians need review articles despite easy access to summaries and reviews of the literature in UpToDate.
Our findings also demonstrate that health personnel access articles funded by US government agencies other than NIH. This speaks to the importance of legislative initiatives, such as the recently introduced Fair Access to Science and Technology Research Act, intended to extend the NIH public access policy to other federal agencies that sponsor research [48].
The observed multiple views of articles suggest small pockets of common interest in a given study among the health personnel. As noted, of the 6 articles that were viewed 20 or more times, only 1 (16.7%) was available to the public without a paid subscription. This roughly approximated the overall state of public access to the scholarly literature, which was calculated to be just over 20.0% for the entire scholarly output and 18.6% specifically for medicine in 2009 [5].
Twenty percent of the articles accessed by health personnel in 2011 were published in 2011, supporting previous findings that health personnel tend to access the most current literature in patient care [49]. Given this tendency, the NIH public access policy, which allows publishers to impose a one-year delay between publication and public access through PMC, could be problematic and could create a disparity between health personnel who work in medical centers affiliated with a well-funded library and those who do not. Since the adoption of the NIH public access policy in 2008, there have been calls for adjusting this requirement from twelve to six months to match the pace of scientific research and the policies of the European Research Council [10], the Wellcome Trust [...] The results of the current study strongly support reducing this waiting period to six months or less.
That 20% of the accessed journals accounted for 88% of the accessed articles somewhat aligns with the 80/20 principle, which has been adopted as a means of better understanding the utilization of library collections [50]. However, the data collected here also suggest that a wide range of journals are consulted by those involved in clinical care, including journals from the professional domains of medicine, nursing, social work, and occupational health. Similar to reported reading habits of researchers, these health personnel accessed a wide range of works, based on search strategies and targeted areas of interest [51]. Our study results underline how important it is for libraries to provide access to highly cited titles, but the results also indicate the need to maintain a broad library collection to satisfy the diverse needs of health personnel.
As exemplar stewards of knowledge [52], librarians routinely monitor information access by surveying patrons, tracking online journal use, and so on to inform the maintenance and the evolution of library collections. Although valuable, these data are generally at the resource level and lack the granularity of the article-level access that is essential to evaluating the impact of science policy, such as the NIH public access policy. Therefore, we suggest that librarians consider modifying current practices to examine information use at the article level via web logs to better understand patron use habits and to inform science policy based on solid evidence. For example, use of web logs enables the determination of the publication dates of accessed articles, which can inform policy makers' decisions related to open access embargoes. Additionally, web log analysis can find information about governmental funding sources, which can inform the spread of public access policies to government agencies beyond NIH.
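To make the embargo example concrete, the sketch below estimates what share of article views occurred while an article would still have been inside an embargo window of a given length. It is a hypothetical illustration only: the view records, the function names, and the use of whole calendar months are assumptions, and the study itself reports publication year rather than per-view article age.

```python
from datetime import date


def months_between(published, accessed):
    """Whole calendar months from publication month to access month."""
    return (accessed.year - published.year) * 12 + (accessed.month - published.month)


def share_within_embargo(views, embargo_months):
    """Fraction of views made fewer than `embargo_months` after publication."""
    if not views:
        return 0.0
    inside = sum(
        1
        for published, accessed in views
        if 0 <= months_between(published, accessed) < embargo_months
    )
    return inside / len(views)


# Hypothetical (publication date, access date) pairs taken from web logs.
VIEWS = [
    (date(2011, 1, 15), date(2011, 3, 2)),    # 2 months after publication
    (date(2010, 11, 1), date(2011, 6, 20)),   # 7 months
    (date(2009, 5, 1), date(2011, 2, 10)),    # 21 months
    (date(2011, 4, 1), date(2011, 4, 28)),    # 0 months
]

if __name__ == "__main__":
    for window in (12, 6):
        share = share_within_embargo(VIEWS, window)
        print(f"{window}-month embargo: {share:.0%} of views inside the window")
```

Comparing the result for a twelve-month window with that for a six-month window gives a rough sense of how many views a shorter embargo would bring within reach of readers who lack subscription access.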

Limitations of the study
The data analyzed in this study represent traffic through UpToDate and PubMed via Lane Library on the SUH network. Because health personnel's access to information that did not originate from or pass through Lane Library's online resources or that originated outside the hospital was not counted, article views may be underreported. Future studies should consider designing and implementing mechanisms to capture a broader array of user data.
This study focuses on health personnel accessing research evidence and could not draw conclusions as to the amount of reading done or to the utility of the accessed information for application to patient care. For example, although the collected data indicated that a particular clinical trial was accessed, we were unable to determine if that study was read or applied in any fashion to patient care. Additionally, the methodology was unable to provide records with regard to the access of individuals or groups, making it impossible to identify if perhaps a single user or a particular group was responsible for particular information access patterns. As a further step in this direction, several of the authors of this study are undertaking a randomized controlled trial with physicians to assess changes in their information-seeking behaviors when provided with relatively complete access, leading to more detailed analysis of the value and contribution this information makes to their clinical care [53].
As a limitation of the study's methodology, it was not possible to determine whether or not an article was in PMC, and therefore freely available, at the time of its access. This inability stems from the fact that the study retrospectively queried for article metadata to determine the presence of PMCIDs. For example, an article accessed in January 2011 might or might not have been in PMC at that time, even though a PMCID was present in March 2012. Future researchers should consider this a factor in their study designs and check for the presence of PMCID immediately after access so as to be able to accurately determine the PMC status of the accessed article.
In addition, the results may overestimate article views for clinical care, as use sometimes may have been for research purposes instead of patient care. However, SUH's mission statement begins by asserting its dedication to the care of patients [28], which we propose would be similar to the missions of most health care facilities. Related to health personnel's tendency to access current information, it is important to consider the potential impact of PubMed's design, which by default is set to return current results first. This default may bias use toward more recent literature.

CONCLUSION
At SUH, health personnel use PubMed and UpToDate to varying degrees, with UpToDate being used more than twice as often as PubMed in clinical care. Although both resources serve as gateways to journal articles, UpToDate was rarely used in this capacity, whereas PubMed was frequently used to access journal articles. When accessing journal articles from either resource, health personnel tended to access review articles and clinical trials. Accessed journal articles also tended to be published within the last ten years and were drawn from a large number of journals across a variety of disciplines. These findings have implications for library collection development and science policy, specifically related to embargo periods, such as those included in the NIH public access policy. While there are limitations to this study, this is a first step in investigating information access by health personnel. We hope that it will inspire other similar assessments of information use to build a more reliable estimate of health personnel access to the biomedical literature in order to inform science policy and collection development.