BMJ. Mar 9, 2002; 324(7337): 577–581.
PMCID: PMC78995

Breast cancer on the world wide web: cross sectional survey of quality of information and popularity of websites

Funda Meric, assistant professor, Elmer V Bernstam, assistant professor, Nadeem Q Mirza, research investigator, Kelly K Hunt, associate professor, Frederick C Ames, professor, Merrick I Ross, professor, Henry M Kuerer, assistant professor, Raphael E Pollock, professor, Mark A Musen, associate professor, and S Eva Singletary, professor

Abstract

Objectives

To determine the characteristics of popular breast cancer related websites and whether more popular sites are of higher quality.

Design

The search engine Google was used to generate a list of websites about breast cancer. Google ranks search results by measures of link popularity—the number of links to a site from other sites. The top 200 sites returned in response to the query “breast cancer” were divided into “more popular” and “less popular” subgroups by three different measures of link popularity: Google rank and number of links reported independently by Google and by AltaVista (another search engine).

Main outcome measures

Type and quality of content.

Results

More popular sites according to Google rank were more likely than less popular ones to contain information on ongoing clinical trials (27% v 12%, P=0.01), results of trials (12% v 3%, P=0.02), and opportunities for psychosocial adjustment (48% v 23%, P<0.01). These characteristics were also associated with a higher number of links as reported by Google and AltaVista. More popular sites by number of linking sites were also more likely to provide updates on other breast cancer research, information on legislation and advocacy, and a message board service. Measures of quality such as display of authorship, attribution or references, currency of information, and disclosure did not differ between groups.

Conclusions

Popularity of websites is associated with type rather than quality of content. Sites that include content correlated with popularity may best meet the public's desire for information about breast cancer.

What is already known on this topic

Patients are using the world wide web to search for health information

Breast cancer is one of the most popular search topics

Characteristics of popular websites may reflect the information needs of patients

What this study adds

Type rather than quality of content correlates with popularity of websites

Measures of quality correlate with accuracy of medical information

Introduction

Recent surveys show that 40-54% of patients access medical information via the internet and that this information affects their choice of treatment.1-5 Although the quality of medical information on the world wide web has been an area of increasing concern,6-11 the factors that contribute to popularity of websites have not been systematically studied.

Understanding the determinants of website popularity has implications for clinicians and medical centres that recognise the need to provide information about themselves via the internet. Website designers who understand the information needs of the public can attract visitors to their site. Knowing what patients are investigating on the web may help clinicians to educate themselves and their patients.

Two measures of website popularity are “click popularity” and “link popularity.”12 Click popularity is the frequency with which users have visited (clicked on) a site.13 Although some search engines, such as Direct Hit, measure click popularity, this information is not publicly available for a large number of websites. Furthermore, click popularity is subject to artificial marketing manipulations.14 Link popularity, which is less susceptible to manipulation,15 relies on links from sites to other sites rather than on statistics about usage. High link popularity is thought to dramatically increase traffic to a site.16 Link popularity, sometimes referred to as “peer review popularity,” has been proposed as an objective way of identifying high quality websites.17-19 Google ranks results of searches by using a proprietary link popularity algorithm that takes into account the number of links and the “importance” of the linking sites.15,17

Breast cancer is one of the most common health related search topics among users of the internet.20 Previous studies have evaluated use of the internet by women with breast cancer and the quality of selected sites.10,21,22 A recent study found information about breast cancer on the web to be more complete and accurate than for other conditions.22 We are not aware, however, of work that attempts to determine what makes some sites more popular than others. The purpose of our study was to identify the determinants of link popularity of websites about breast cancer and to test the hypothesis that more popular sites are of higher quality.

Methods

Selection of websites

We used the search term “breast cancer” on Google (www.google.com, accessed 19 Oct 2000) to generate a list of sites. We examined the first 200 of approximately 600 000 English language sites. Of these, 185 (93%) were accessible, but one was excluded as its content was only peripherally related to breast cancer.

Determination of popularity

Because there is no standard way to assess link popularity, we used three different measures: Google rank and number of links reported by Google and by AltaVista (on www.altavista.com). Of the top 200 sites returned by Google, we defined the first 100 (Google rank 1-100) as “more popular” and the second 100 (Google rank 101-200) as “less popular.” We obtained the number of links in Google and AltaVista by entering each site's uniform resource locator (URL) into the search string “link:URL”.

Google provided the number of linking sites for 162 sites and AltaVista for 148 sites. We excluded from analysis any sites for which the number of links was not available. The median number of links was 51 according to Google and 21 according to AltaVista. We considered sites with a number of links greater than the median to be “more popular” and sites with fewer links to be “less popular.” To assess whether popular sites were displayed by multiple search engines, we repeated the search on four search engines often used by patients: Yahoo (categorical), Excite, AltaVista, and Infoseek.4
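The median split described above can be sketched in a few lines of Python. This is only an illustration: the site names and link counts are hypothetical, and leaving sites that fall exactly on the median unassigned is an assumption, since the text defines only "greater than" and "fewer".

```python
from statistics import median


def split_by_median(link_counts):
    """Split sites into more and less popular groups around the median link count.

    link_counts maps a site name to its number of incoming links, or to None
    when the search engine reported no count (such sites are excluded).
    """
    known = {site: n for site, n in link_counts.items() if n is not None}
    if not known:
        raise ValueError("no link counts available")
    cutoff = median(known.values())
    more_popular = sorted(site for site, n in known.items() if n > cutoff)
    less_popular = sorted(site for site, n in known.items() if n < cutoff)
    # Assumption: sites exactly at the median are left out of both groups.
    return cutoff, more_popular, less_popular


# Hypothetical link counts, not data from the study.
example_counts = {
    "site-a": 120,
    "site-b": 51,
    "site-c": 8,
    "site-d": None,  # count unavailable, so the site is excluded
    "site-e": 300,
    "site-f": 14,
}

cutoff, more_popular, less_popular = split_by_median(example_counts)
print(cutoff, more_popular, less_popular)
```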

Evaluation of websites

A breast oncologist (FM) evaluated the sites within four weeks of the original search. Links within each site were pursued until all medical information about breast cancer was evaluated. A median of four pages (range 1-11) were evaluated for each site. Type and quality of content were recorded. Affiliation was determined on the basis of the information provided by the site. Sites were divided into professional (government, universities, major medical centres), non-profit organisation, and commercial (all others).

We assessed quality of content by criteria known as the “JAMA benchmarks”6: display of authorship of medical content; source (attribution or references); date of update; and disclosure of ownership, sponsorship, advertising policies, or conflicts of interest. We also documented whether each site displayed its webmaster's email address or a Health on the Net (www.hon.ch/) seal. Health on the Net is a non-profit foundation with an eight point code of conduct for sites providing health information.23 Sites that comply with the Health on the Net code are allowed to display the seal, but continued compliance is not systematically enforced.
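As a rough illustration, the four JAMA benchmarks can be tallied per site as a simple count. The class and field names below are hypothetical and do not describe the study's actual data collection. The threshold of three benchmarks mirrors the grouping into higher and lower quality sites used in the Results.

```python
from dataclasses import dataclass


@dataclass
class SiteQuality:
    """Presence or absence of each JAMA benchmark for one website."""
    authorship: bool   # authorship of medical content is displayed
    attribution: bool  # sources or references are given
    currency: bool     # date of update is shown
    disclosure: bool   # ownership, sponsorship, advertising, or conflicts disclosed

    def benchmarks_met(self) -> int:
        """Number of the four benchmarks that the site satisfies (0 to 4)."""
        return sum([self.authorship, self.attribution,
                    self.currency, self.disclosure])

    def is_higher_quality(self) -> bool:
        """Higher quality is taken here as meeting at least three benchmarks."""
        return self.benchmarks_met() >= 3


# Hypothetical site: everything except a date of update.
example_site = SiteQuality(authorship=True, attribution=True,
                           currency=False, disclosure=True)
print(example_site.benchmarks_met(), example_site.is_higher_quality())
```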

Analysis

We used Pearson χ² analysis to compare more popular and less popular websites. We performed separate analyses for each of the three measures of link popularity. We considered groups to be significantly different if P≤0.05 in at least two of three analyses.
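A minimal Python sketch of this analysis for a 2×2 comparison follows. It assumes no continuity correction (the text does not say whether one was applied), and the function names are illustrative. The counts are those reported in the Results for discussion of breast reconstruction by Google rank (15/57 v 8/68). The three P values passed to the decision rule are the ones reported for that same topic.

```python
from math import erfc, sqrt


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-squared test, without continuity correction, for the table
        [[a, b],
         [c, d]]
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper tail probability is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


def differs_significantly(p_values, alpha=0.05, needed=2):
    """Decision rule from the text: P <= alpha in at least `needed` analyses."""
    return sum(p <= alpha for p in p_values) >= needed


# Breast reconstruction by Google rank: 15 of 57 more popular sites
# and 8 of 68 less popular sites discussed the topic.
chi2, p_value = pearson_chi2_2x2(15, 57 - 15, 8, 68 - 8)
print(round(chi2, 2), round(p_value, 3))

# P values reported for the three popularity measures on this topic.
print(differs_significantly([0.037, 0.002, 0.018]))
```

Under these assumptions the first comparison gives a P value close to the reported 0.037. The other reported comparisons may not reproduce as closely with this uncorrected test.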

Results

Website characteristics

Table 1 lists the characteristics of the top 184 accessible sites returned by Google. Twenty seven (15%) provided medical facts on the site as well as through links to other sites, and 125 (68%) had medical facts displayed at the website only. Table 2 shows the medical facts displayed in this second group of websites.

Table 1
Characteristics of breast cancer websites evaluated (n=184). Values are numbers (percentages)
Table 2
Medical facts contained in breast cancer websites (n=125).* Values are numbers (percentages)

Table 3 shows indicators of quality. Of the 184 sites, 105 (57%) displayed some evidence of authorship, but only 32 (17%) displayed the name, qualifications, and institutional affiliation of the author. Sixteen (9%) sites had all four JAMA benchmarks (authorship, references, currency, and disclosure), 48 (26%) had three, 68 (37%) had two, 43 (23%) had one, and 9 (5%) had none. Forty five sites (25%) displayed a disclaimer that the information provided should not substitute for consultation with a physician.

Table 3
Quality of medical content (n=184). Values are numbers (percentages)

A Health on the Net seal was displayed on 27 (15%) sites. Commercial sites were more likely than sites of professional groups or of organisations to display the seal—21/84 (25%) v 3/36 (8%) v 3/64 (5%) (P=0.001). None of the sites with a seal actually complied with all eight Health on the Net criteria or with all four JAMA benchmarks.

Of the 184 sites, 12 (7%) contained inaccurate medical statements. Commercial sites contained inaccurate statements more often than did sites of professional groups or of organisations—11/84 (13%) v 1/36 (3%) v 0/64 (P=0.004). Three (16%) of 19 commercial sites that displayed the Health on the Net seal contained inaccurate statements. Higher quality sites (at least three JAMA benchmarks) were less likely to contain inaccurate information than lower quality sites (fewer than three JAMA benchmarks)—1/64 (2%) v 11/120 (10%) (P=0.047) (figure). None of the 16 sites that met all four JAMA benchmarks contained inaccurate information.

Determinants of popularity

Type of content differed significantly between more popular and less popular websites (table 4). Sites that were more popular by at least two of three popularity measures were more likely to contain information about ongoing clinical trials, results of randomised clinical trials, results of other breast cancer research, information on legislation and advocacy, and information on opportunities for psychosocial adjustment and to allow interaction through a message board service.

Table 4
Content of breast cancer websites by popularity. Values are numbers (percentages) unless stated otherwise

We then evaluated differences in the topics of medical facts presented between the more popular and less popular sites (table 2). This analysis was carried out only for the 125 sites that displayed medical information. More popular sites were more likely to discuss breast reconstruction: 15/57 (26%) v 8/68 (12%) by Google rank (P=0.037), 15/51 (29%) v 5/57 (9%) by number of links in Google (P=0.002), and 16/50 (32%) v 6/48 (13%) by number of links in AltaVista (P=0.018). They were also more likely to discuss psychology topics such as depression: 11/51 (21%) v 1/57 (2%) by number of links in Google (P=0.001) and 11/50 (22%) v 1/48 (2%) by number of links in AltaVista (P=0.002).

More popular and less popular websites did not differ in any of the quality measures studied (table 4). Furthermore, the presence of inaccurate information did not differ between more popular and less popular sites.

Evaluation of popularity measures

We evaluated the concordance between our measures of popularity. The median number of linking sites as measured by Google and AltaVista was significantly higher for sites that were more popular by Google rank than for less popular sites (Google: 82 v 21, P<0.001; AltaVista: 48 v 10, P<0.001). The number of links as measured by Google strongly correlated with the number of links as measured by AltaVista (Pearson coefficient 0.806, P<0.001).
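For readers unfamiliar with the Pearson coefficient used above, the following Python sketch shows how it is computed from paired link counts. The two lists are hypothetical values chosen for illustration. They are not the study's data, so the result will not match the reported 0.806.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)


# Hypothetical link counts for the same five sites from two search engines.
google_links = [82, 21, 150, 9, 51]
altavista_links = [48, 10, 70, 3, 21]

r = pearson_r(google_links, altavista_links)
print(round(r, 3))
```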

We hypothesised that link popularity would correlate with a site being displayed by multiple search engines. Of the top 184 accessible sites displayed by Google, AltaVista displayed 24 (13%), Yahoo displayed 41 (22%), Infoseek displayed 58 (32%), and Excite displayed 84 (46%). Indeed, more popular sites were more likely than less popular sites to be displayed by multiple search engines (table 5).

Table 5
Display by other search engines on the basis of website link popularity. Values are numbers (percentages) unless stated otherwise

To assess the correlation between click popularity and link popularity, we evaluated the top 10 sites returned by Direct Hit, which ranks sites based on click popularity and duration of visits. Nine of the 10 sites were in the top 200 by Google rank, eight were among the more popular sites by Google rank, and five were among the top 20 by Google rank.

Discussion

To meet the demand for health information on the web, it is important to identify the factors that influence popularity of websites. Our results show that type rather than quality of content determines popularity. To our knowledge, ours is the first study to assess the popularity as well as the quality and accuracy of health related websites.

We found that many breast cancer websites do not comply with the JAMA benchmarks, but we found higher compliance than previously reported.10,11 This may reflect an improvement in quality of websites over the past few years or a difference between search engines used in the studies.

Since accessibility and ranking of websites vary with the search engine used, the overlap between Google and other search engines of only 13-46% is not surprising. We found that “more popular” sites were more likely to be displayed by multiple search engines. If a site is not displayed, it is unlikely to be visited by users of the internet; thus the more popular sites by our measures of link popularity should indeed be the more popular sites among users of the internet. Our finding that eight of the top 10 sites according to Direct Hit were among the more popular sites by Google rank also supports this assertion.

Our results confirm those of an earlier study that found no correlation between measures of quality and link popularity.24 We may have selected higher quality sites by examining the top 200 of about 600 000 sites returned by Google. Significant differences in quality might have emerged if we had increased our sample size or compared the top 100 sites with sites of lower popularity, such as those ranked 1000-1100. Using less popular sites, however, would not have allowed us to correlate multiple measures of link popularity, as most sites would have no incoming links.

One limitation of our study is that we performed multiple comparisons. Another limitation is that a single reviewer (FM) assessed quality and accuracy. To mitigate this, we used objective criteria whenever possible. For example, we used the presence or absence of authorship information rather than author authority. Accuracy is inherently subjective, so our results should be confirmed by studies using a panel of experts. Multiple, non-expert reviewers, however, may not be better than a single expert reviewer.

In one survey, only 14% of patients expressed uncertainty about the accuracy of medical information on the web.4 We found that higher quality sites contain more accurate information. Objective measures of quality may help lay users to assess online health information.

Self regulation has been advocated as a way of maintaining the quality of online medical content. Our finding that sites displaying a Health on the Net seal did not comply with the Health on the Net code emphasises the limitation of self regulation. It remains the responsibility of the medical community to ensure adequate quality of online medical content, to educate the public regarding quality measures, and to direct patients to sites of known quality.

Link popularity, which can be assessed automatically, has been proposed as an indirect measure of quality.19 This is analogous to citation analysis, a somewhat controversial approach to measuring quality in the printed literature. Although link popularity may identify sites of interest, it does not correlate with quality of content. The growing number of users of the internet searching for health information indicates an unmet need for information. Understanding what patients are looking for online may help us meet their need for health information.

Figure
Number of accurate and inaccurate websites, based on number of JAMA benchmarks met. A website was considered inaccurate if it contained one or more inaccurate statements

Acknowledgments

We thank Valerie Natale and Stephanie Deming for editorial assistance and Herbert Kaizer and Soumya Raychaudhuri for critical reading of the manuscript.

Footnotes

Funding: Supported in part by Grant LM06594 from the National Library of Medicine (EVB).

Competing interests: None declared.

References

1. Metz JM, Devine P, DeNittis A, Stambaugh M, Jones H, Goldwein J, et al. Utilization of the internet by oncology patients to obtain cancer related information. Proc Am Soc Clin Oncol. 2001;20:395a. (abstract 1575).
2. Yakren S, Shi W, Thaler H, Agre P, Bach PB, Schrag D, et al. Use of internet and other information resources among adult cancer patients and their companions. Proc Am Soc Clin Oncol. 2001;20:398a. (abstract 1589).
3. Helft PR, Hlubocky FJ, Gordon EJ, Ratain MJ, Daugherty CK. Hope and the media in advanced cancer patients. Proc Am Soc Clin Oncol. 2000;19:633a. (abstract 2497).
4. O'Connor JB, Johanson JF. Use of the web for medical information by a gastroenterology clinic population. JAMA. 2000;284:1962–1964.
5. Health on the Net. Survey on the evolution of internet use for health purposes: raw data for the survey February-March 2001. www.hon.ch/Survey/FebMar2001/ (accessed 14 Jan 2002).
6. Silberg WM, Lundberg GD, Musacchio RA. Assessing, controlling, and assuring the quality of medical information on the internet: Caveant lector et viewor—let the reader and viewer beware [editorial]. JAMA. 1997;277:1244–1245.
7. Jadad AR, Gagliardi A. Rating health information on the internet: navigating to knowledge or to Babel? JAMA. 1998;279:611–614.
8. Bichakjian CK, Schwartz JL, Wang TS, Hall JM, Johnson TM, Sybil Biermann J. Melanoma information on the internet: often incomplete—a public health opportunity? J Clin Oncol. 2002;20:134–141.
9. Price SL, Hersh WR. Filtering web pages for quality indicators: an empirical approach to finding high quality consumer health information on the world wide web. Proc AMIA Symp 1999:911-5.
10. Shon J, Musen MA. The low availability of metadata elements for evaluating the quality of medical information on the world wide web. Proc AMIA Symp 1999:945-9.
11. Hoffman-Goetz L, Clarke JN. Quality of breast cancer sites on the world wide web. Can J Public Health. 2000;91:281–284.
12. What is popularity: thewritemarket.com. www.thewritemarket.com/search/popularity.htm (accessed 14 Mar 2001).
13. Kerber R. Direct hit uses popularity to narrow internet searches. Wall Street Journal. 1998;232(July 2):B4. , Op 1.
14. Search engines take a quantum leap: 19 out of 20 now use link popularity to determine relevancy. www.webseed.com/page1007.html (accessed 14 Mar 2001).
15. Brin S, Page L. The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems. 1998;30:107–117.
16. Linkpopularity.com. www.linkpopularity.com/ (accessed 21 Dec 2001).
17. Google. http://google.com/ (accessed 14 Mar 2001).
18. Rumsey E. Peer-review popularity vs. dotcom popularity. www.lib.uiowa.edu (accessed 7 Feb 2002).
19. Eysenbach G, Diepgen TL. Towards quality management of medical information on the internet: evaluation, labelling, and filtering of information. BMJ. 1998;317:1496–1500.
20. Lacroix E-M. Health topics most hit March 2000. www.nlm.nih.gov/pubs/staffpubs/lo/medlineplus/sld013.htm (accessed 27 Jan 2001).
21. Bateman M, Rittenberg CN, Gralla RJ. Is the internet a reliable and useful resource for patients and oncology professionals: a randomized evaluation of breast cancer information. Proc Am Soc Clin Oncol. 1998;17:419a. (abstract 1616).
22. Berland GK, Elliott MN, Morales LS, Algazy JI, Kravitz RL, Broder MS, et al. Health information on the internet: accessibility, quality, and readability in English and Spanish. JAMA. 2001;285:2612–2621.
23. Health On the Net Foundation. HON code of conduct (HONcode) for medical and health web sites: principles. www.hon.ch/HONcode/Conduct.html (accessed 25 May 2001).
24. Sandvik H. Health information and interaction on the internet: a survey of female urinary incontinence. BMJ. 1999;319:29–32.

Articles from BMJ : British Medical Journal are provided here courtesy of BMJ Group