Copyright
This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose

Cataloging On-Line Health Information: A Content Analysis of the NC Health Info Portal

1 School of Information and Library Science, University of North Carolina, Chapel Hill
2 Lineberger Cancer Center, University of North Carolina, Chapel Hill

Address for correspondence
For additional information, contact Catherine Blake at Email: cab-lake/at/email.unc.edu

Abstract

The unrelenting increase of health information on the World Wide Web has
resulted in an urgent need for portals that provide consumers with trustworthy
health information. In response to this need, the National
Library of Medicine launched the Go Local initiative, which extends MedlinePlus by providing consumers with links
to local health services, programs and providers. NC Health Info (www.nchealthinfo.org) is the first NIH-funded Go Local portal. Our goal is to gain insight into the nature of interactions that
occur during the cataloging process of online health information resources. We
conducted a content analysis of annotations made by catalogers
on the NC Health Info portal between January 2000 and September 2004. Our
analysis of 2369 online information resources revealed challenges
with establishing the navigational, geographical and topical content
of an on-line resource. Our analysis provides insights into the mechanisms
that catalogers use to overcome those challenges and thus will
be of value to future Go Local portal development.

Keywords: Consumer Health Information, Go Local, NC Health Info, MedlinePlus, Annotation, Cataloging online material

Introduction

Distributing information using the World Wide Web has never been easier. Although
access to information can empower a consumer to make informed
choices regarding their health care, the quantity of information often
leaves consumers feeling inundated.

The National Library of Medicine (NLM) plays an active role in the provision of health information to consumers. MedlinePlus, which was launched in October 1998, typifies the NLM’s commitment to providing trustworthy, well-organized health information [1]. The MedlinePlus criteria include (1) the quality, authority and accuracy of content; (2) the primary purpose of the Web page (i.e., educational and not to sell a product or service); (3) the availability and maintenance of the Web page; and (4) special features, for example providing content that is accessible to persons with disabilities (footnote 1).

The Go Local initiative (www.nlm.nih.gov/medlinplus/golocal.html) augments information in MedlinePlus with local health services, programs, and health care providers. The first Go Local portal was funded in August 1999 at the University of North Carolina, Chapel
Hill as a joint project between the Health Sciences Library and
the School of Information and Library Science. The next portal was created
at the University of Missouri, and since then Go Local portals have been initiated in Alabama, Arizona, California, Indiana, Maryland, Massachusetts, Michigan, Ohio, Texas, Tribal Four Corners (Arizona, Colorado, New
Mexico and Utah), and Wyoming.

Several studies have explored the trustworthiness of health information
on the WWW [2–4]. A cataloger on the NC Health Info portal fulfills two roles. The
first is to ensure that the information resource (which we refer
to as a web page) is trustworthy.

Figure 1

The second role, in addition to establishing trustworthiness, is to
assign terms from the Go Local controlled vocabulary. Although several projects have developed search
engines specifically designed for health information, the controlled
vocabularies in both MedlinePlus and the Go Local portals still play an important role in enabling a user to identify health
information on the web.

In this paper, we characterize the communication patterns that occur between catalogers as they assign terms from the Go Local controlled vocabulary to each web page that satisfies the quality criterion shown in Figure 1.

Materials and Methods

The NC Health Info project provided a snapshot of the trustworthy web pages that their team of catalogers had collected between 1 January 2000 and 29 September 2004. Of particular interest is the “note” field in the database, which the catalogers began using in April 2002. Each web page has one or more notes, which we also refer to as annotations (footnote 2). Annotations are either substantive messages between catalogers or non-substantive system messages; in both cases they are part of the cataloging process and are not intended for the public.

Figure 2

The example database records shown in Figure 2

In general, the kappa statistic is used to report inter-rater reliability [5]; however, studies of on-line web pages have shown that researchers
rarely report inter-rater reliability [3]. The example in Figure 2

In contrast to the kappa statistic, a qualitative analysis can provide
insight into the challenges faced by catalogers as they assign terms from
the Go Local controlled vocabulary to each web page that satisfies the quality criterion. We
used content analysis to characterize the nature of disagreements, such
as the discussions shown in Figure 2.

Table 1 captures the content, format and function facets that we considered during
the content analysis. These facets and categories were developed
from our initial pilot study that comprised a random sample of 464 web
pages (20%) from the web pages in the NC Health Info database. Removing
non-substantive annotations yielded 371 substantive messages. Two
of the authors (LL and DW) characterized each of those 371 annotations, and Table 1 shows the categories that emerged. Once inter-rater reliability was established
between the two authors, they labeled the remaining pages using
the eleven categories shown in Table 1. The categories in each facet are not mutually exclusive, so any given
substantive annotation can have multiple categories assigned. Every
annotation has at least one category for each facet.
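
As an illustration of how inter-rater reliability could be checked for a coding scheme such as this one, the sketch below computes Cohen's kappa, (p_o - p_e) / (1 - p_e), separately for each category. Each category is treated as a present/absent decision because the categories within a facet are not mutually exclusive. This paper does not state which reliability measure was used, so the per-category approach, the function names, and the four example annotations are assumptions made purely for illustration; only the category names are taken from the content facet.

```python
from typing import Dict, List, Set


def cohen_kappa(rater_a: List[bool], rater_b: List[bool]) -> float:
    """Cohen's kappa for two raters making a yes/no decision on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: both say "yes" by chance plus both say "no" by chance.
    p_a = sum(rater_a) / n
    p_b = sum(rater_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both raters gave the same constant answer; kappa is undefined here,
        # and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


def kappa_per_category(
    codes_a: List[Set[str]], codes_b: List[Set[str]], categories: List[str]
) -> Dict[str, float]:
    """Compute kappa for each category from multi-label codes per annotation."""
    return {
        category: cohen_kappa(
            [category in codes for codes in codes_a],
            [category in codes for codes in codes_b],
        )
        for category in categories
    }


# Hypothetical codes assigned by two raters to the same four annotations.
ll_codes = [{"topical"}, {"topical", "navigational"}, {"geographical"}, {"navigational"}]
dw_codes = [{"topical"}, {"topical"}, {"geographical"}, {"navigational", "topical"}]

kappas = kappa_per_category(ll_codes, dw_codes, ["topical", "navigational", "geographical"])
for category, kappa in kappas.items():
    print(f"{category}: kappa = {kappa:.2f}")
```

Reporting one kappa per category, rather than a single pooled value, makes it easier to see which categories the raters interpret differently.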
The annotations for web page 46 (shown in Figure 2

Results

Catalogers added 2788 distinct web pages to the NC Health Info portal between
January 2000 and September 2004. The following content analysis includes 2369 of
the web pages. There were 10,462 annotations for these 2369 pages, of
which 2301 (22%) captured interactions between catalogers. The
number of substantive annotations per web page ranged between
one and eight. For example, there are six substantive annotations associated
with the web page shown in Figure 2.

Content Analysis – Pilot Study

The content analysis from the pilot study indicated that most of the annotations
related to establishing the topical scope of a website (n=192) and
website navigation (n=147). Most annotations (n=266) took
the form of a statement, while 109 were
posed as questions and 97 as answers. As for functions, 99 logged the
cataloger actions, 29 were reminders for the cataloging team, 181 were
requests for other catalogers to take an action or provide information, and 174 were
messages exchanging ideas and reaching consensus on solving
a particular problem arising from the cataloging process.

The pilot study indicated that 97 of the annotation fields comprised at
least one round of discussion with regard to properly cataloging the
website. Such consensus building is necessary to avoid low levels of inter-rater
reliability with respect to the final catalog decision. This
finding suggests that software tools that support collaboration between
catalogers would enable catalogers to reach consensus in on- or off-line
environments.

Content Analysis – Full Study

The content, format, and function facets capture the nature of the interactions between catalogers who worked on the NC Health Info portal from January 2002 through September 2004. We consider only entries from 2002 onward, when the catalogers first started to use the optional annotation field.

Figure 3

“This site contains links to many services on their home-page. Because
we have separate records for “Birthing Center” and “Rehabilitation Services”, I think separate records
for these other services should be created - especially since some
services encompass both Harris Regional AND Swain County Hospital. What
do you think? Then, I’ll change this record to “Hospital - Health
facilities.”

The content facet also captured the catalogers’ decisions regarding the geographical scope of an online resource (n=365, 14.8%). For example, on May 30, 2002 a cataloger stated that “they say they treat patients from western North Carolina...how do I capture that for the county? …”. The remaining 229 (9.3%) of the 2471 content annotations referred to miscellaneous content (the frequency differs between facets because the categories in each facet are not mutually exclusive).

Figure 4
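
The percentages reported for the content facet can be reproduced with a short calculation. The sketch below is illustrative only: the counts are those reported in this paper (the topical and navigational counts are taken from the Conclusion), while the variable and function names are hypothetical. It shows that the denominator is the total number of category assignments (2471), which exceeds the 2301 substantive annotations because an annotation may receive more than one category.

```python
from typing import Dict

# Content-facet category assignments as reported in the paper.
CONTENT_COUNTS: Dict[str, int] = {
    "topical": 1165,
    "navigational": 712,
    "geographical": 365,
    "miscellaneous": 229,
}

SUBSTANTIVE_ANNOTATIONS = 2301  # annotations capturing cataloger interactions


def facet_percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each category's share of all assignments in a facet, in percent."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("facet has no category assignments")
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total_assignments = sum(CONTENT_COUNTS.values())
percentages = facet_percentages(CONTENT_COUNTS)
# Assignments beyond one per annotation arise from multi-category annotations.
extra_assignments = total_assignments - SUBSTANTIVE_ANNOTATIONS

print(f"total content assignments: {total_assignments}")
print(f"assignments beyond one per annotation: {extra_assignments}")
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

Run as written, the calculation gives the 14.8% (geographical) and 9.3% (miscellaneous) shares quoted in the text.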
Figure 5

Discussion

Although catalogers were not required to annotate their decision-making
process, more than half of the web pages in the NC Health Info portal
had at least one annotation (1263 out of 2369). This suggests that catalogers
found annotating web pages useful during the cataloging process, and
leads us to recommend the inclusion of annotation in systems designed
to support this important user population.

To understand whether the annotation behavior of catalogers changed over
time, we compared the number of annotations for six different catalogers
from April 2002 to September 2004 (data not shown). Our intuition
behind this analysis was that the number of annotations made by an individual
cataloger would decrease as their familiarity with the cataloging
process increased. Such an analysis might indicate the time required
to train a new cataloger. Contrary to our expectations, experienced
catalogers continued to provide annotations throughout the cataloging
process.

Annotations enabled the catalogers to form consensus around the meaning
of an existing information source, an activity that is not new in medicine. Scientists
who conduct systematic reviews develop extraction worksheets
that capture their consensus building activities [6]. Multiple reviewers independently extract information using the
guidelines, then resolve differences. This process enables the group
to establish group norms and verify the accurate extraction of information
from each article.

Similarly, consensus building is an important consideration in recent efforts in bioinformatics to annotate scientific articles with terms from the Gene Ontology (www.geneontology.com). In both the systematic review and bioinformatics examples, scientists have developed hierarchies of evidence that reflect the annotator’s confidence in the final category assignment. In a systematic review, the stated study design reflects the level of evidence, while in bioinformatics, scientists have developed a set of evidence codes (footnote 3), including “inferred from assay” and “inferred
from genetic interaction” to measure their confidence. The annotations
provided by the NC Health Info catalogers also reflect the cataloger’s
confidence in the final annotation. This serves as a
surrogate for levels of evidence in this new area of web page annotation
until accepted levels of evidence are developed.

The large number of annotations related to topical scope suggests that additional conversations are required to define the boundary of an online information resource. In this paper, we have hidden the complexity of annotating online information resources by referring to each resource as a web page, which implies an individual page. However, catalogers
do not catalog every individual web page on a site; rather
they catalog an entire site, or a sub-set of pages within a site. The
number of annotations indicates that defining these boundaries is very
challenging.

The format facet provides insight into the nature of interactions that take place between catalogers. A statement does not require an immediate response, but a question suggests that an interactive dialogue between catalogers is imminent. We are currently extending our analysis to explore interactions between catalogers.

Conclusion

This analysis is the first to explore the nature of interactions that occur
between catalogers as they manually add terms from the Go Local vocabulary to online health information. The content analysis of 2301 annotations revealed that catalogers discuss the topical (n=1165) and navigational (n=712) scopes of a web page more frequently than the geographical scope (n=365). Annotations were most often in the format of a statement (n=1528) rather than a question (n=467) or an answer (n=384). Catalogers made annotations as reminders to themselves or other catalogers (n=1102), to reach consensus (n=671), to log an action (n=546), and to issue a request (n=145).

Two of the challenges faced by catalogers are specific to an on-line environment. The first concerns the web page boundary: for example, when should the cataloger assign a topic to an entire web site, and when should
they assign topics to subdomains? The second issue concerns the dynamic
nature of on-line information compared to traditional information
resources. Currently catalogers review web pages in the NC Health Info
portal every six months. However, a cataloger need not review an unchanged
web page, and should conduct a review if the page has changed
within the six-month period. Thus, an information system that detected
change within a web page would aid in the allocation of resources for
the review task.

None of the individuals who worked on the NC Health Info project was required
to annotate their cataloging process; yet they provided annotations
for more than half of the web pages. This suggests catalogers find
annotations useful. Many of the catalogers on this project were students, so
these annotations have long-term implications with respect to
preserving organizational memory.

The NLM established the Go Local initiative to provide consumers with information about health services, programs, and
health care providers in their local community. As health
care providers and health care consumers continue to use the online
environment to disseminate and access information, the need for portals
that provide high quality information will also increase. Studies
such as ours, which characterize the challenges faced during the cataloging
process, are the first step towards the development of information
systems that support these important user communities.

Acknowledgments

Thanks to the NC Health Info project for providing the data for this study, particularly D. Duffie, C. Silbajoris and V. Ellington for earlier discussions, and to B. Hilligoss for providing technical support. This work is supported in part by a gift from Microsoft.

Footnotes

1 Complete criteria are available from www.nlm.nih.gov/medline-plus/criteria.html
2 This research was conducted as part of the Annotation of Structured Data research team in the School of Information and Library Science at the University of North Carolina at Chapel Hill (ils.unc.edu/annotation).
3 A complete list of evidence codes is available from the Gene Ontology URL http://www.geneontology.org/GO.evidence.shtml

References

1. Miller N, Lacroix E, Backus J. MedlinePlus: building and maintaining the National Library of Medicine’s consumer health Web service. Bulletin of the Medical Library Association. 2000;88(1):11–7.
2. Silberg WM, Lundberg GD, Musacchio RA. Assessing, controlling, and assuring the quality of medical information on the Internet: Caveant lector et viewer--Let the reader and viewer beware. JAMA. 1997;277:1244–5.
3. Gagliardi A, Jadad AR. Examination of instruments used to rate quality of health information on the internet: chronicle of a voyage with an unclear destination. BMJ. 2002;324:569–73.
4. Eysenbach G, Powell J, Kuss O, Sa E-R. Empirical Studies Assessing the Quality of Health Information for Consumers on the World Wide Web: A Systematic Review. JAMA. 2002;287(20):2691–2700.
5. Sagaram S, Walji M, Meric-Bernstam F, Johnson C, Bernstam E. Inter-observer Agreement for Quality Measures Applied to Online Health Information. In: MEDINFO 2004; 2004. p. 1308–12.
6. Alderson P, Green S, Higgins JPT, editors. Cochrane Reviewers’ Handbook 4.2.2 [Updated March 2004]. Chichester, UK: John Wiley & Sons, Ltd; 2004.