Background and Context

Observational data have suggested a strong association between a number of dietary factors and chronic disease risk [1–5]. However, randomized controlled trials (RCTs) designed to assess the efficacy of these dietary factors with respect to health outcomes have yielded, for the most part, negative results (fiber and colon cancer [6], vitamin E and cardiovascular disease [7], vitamin E and lung cancer [8], β-carotene and cancer [8], β-carotene and coronary events [9], vitamin C and cardiovascular disease [10], and folate and cardiovascular disease [11]).

The outcomes of these trials were both disappointing to the health care community and confusing to the general public. The trials were expensive to conduct, and in some cases they identified adverse effects of nutrient supplementation [8]. The discrepancies raised serious questions about the current approach to determining whether the evidence base is adequate to justify launching a large-scale RCT with hard endpoints as the outcome measure.

Deciding which specific nutrient–disease association to evaluate further in human intervention trials is challenging. Apart from the expected impact of a nutritional intervention on public health, and the feasibility and logistics of conducting a trial, additional critical components need to be factored into the decision. These pertain to the maturity and reliability of the relevant evidence base: the strength of the data supporting a potential nutrient–disease association, the biological plausibility of the association, the reliability of existing data, and the likelihood of bias and systematic errors affecting the interpretation of the available data.

The evidence base is formed by the interplay of various translational paths, in which an initial hypothesis-forming observation supports subsequent research and is eventually “translated” to interventions for preventing or treating human disease. It is possible that nutrient–disease associations for which RCT and observational data are concordant have a more extensive and mature evidence base than associations for which the data are discordant. Therefore, further understanding of the translational paths that shape the evidence base in each topic is of interest.

Figure 1 describes alternative scenarios for the possible cascade of translational events. The simplistic model in Figure 1a posits a linear progression from an initial laboratory experiment through a succession of research studies that eventually lead to observational studies and then to RCTs in humans. Anecdotal observations, however, suggest a much more circuitous path, such as that in Figure 1b, in which there is no clear succession of study types. To some extent, such patterns of information flow can be assessed with citation analysis, which is a qualitative and quantitative representation of citation relationships among publications.

Shown are two hypothetical translational paths from a seminal observation to intervention studies in humans. The two paths are depicted as citation graphs comprising circles and arcs that connect the circles. Circles stand for articles, and arcs denote citation relationships between articles. Arcs start from the citing paper and point to the cited paper. Circles are also color coded according to “study type,” namely in vitro studies, animal studies, observational studies in humans, and randomized controlled trials in humans. In the left panel, a hypothetical first (seminal) observation starts from an in vitro study in the lab (lowest left white node) and eventually gets translated to RCTs in humans through a succession of research in animal disease models and epidemiological studies in humans. Each design builds on the findings of the previous one, and there is a temporal succession: in particular, RCTs are performed after the observational studies, as they are motivated, designed, and launched based on observational data. The right panel illustrates a much more circuitous path. The first observation about a nutrient–disease association is made in humans. This spurs a complex network of interrelated research activity. Related hypotheses are tested in subsequent explorations using observational data, studies in animal models, and RCTs.

Figure 1

Simplistic (a) and more complex (b) hypothetical translational paths connecting a seminal observation to eventual RCTs in humans. T = time; RCT = randomized controlled trial.
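To make the distinction between the two panels of Figure 1 concrete, the following minimal Python sketch represents a citation graph as a set of articles plus (citing, cited) pairs and checks whether it is compatible with the linear model of Figure 1a. It is an illustration only, not the analysis used in this report: the `Article` class, the `STAGE` ordering, the function name, the article identifiers, and the publication years are all hypothetical, and the arc direction (citing paper to cited paper) is an assumed convention.

```python
from dataclasses import dataclass

# Assumed stage order for the linear model of Figure 1a (illustrative labels).
STAGE = {"in_vitro": 0, "animal": 1, "observational": 2, "rct": 3}


@dataclass(frozen=True)
class Article:
    article_id: str
    study_type: str  # one of the keys in STAGE
    year: int        # publication year (made-up values below)


def follows_linear_progression(articles, citations):
    """Return True if every citation points back to an earlier-or-equal
    stage and to an article published no later than the citing one.

    `citations` is a list of (citing_id, cited_id) pairs.
    """
    by_id = {a.article_id: a for a in articles}
    for citing_id, cited_id in citations:
        citing, cited = by_id[citing_id], by_id[cited_id]
        if STAGE[cited.study_type] > STAGE[citing.study_type]:
            return False  # e.g., an animal study citing an observational study
        if cited.year > citing.year:
            return False  # a citation cannot point forward in time
    return True


# Hypothetical path resembling Figure 1a: lab -> animal -> observational -> RCT.
linear_articles = [
    Article("a1", "in_vitro", 1990),
    Article("a2", "animal", 1993),
    Article("a3", "observational", 1997),
    Article("a4", "rct", 2002),
]
linear_citations = [("a2", "a1"), ("a3", "a2"), ("a4", "a3")]

# Hypothetical path resembling Figure 1b: the seminal observation is in humans.
circuitous_articles = [
    Article("b1", "observational", 1990),
    Article("b2", "animal", 1993),
    Article("b3", "rct", 1995),
    Article("b4", "observational", 1998),
]
circuitous_citations = [("b2", "b1"), ("b3", "b1"), ("b4", "b3"), ("b4", "b2")]

print(follows_linear_progression(linear_articles, linear_citations))          # True
print(follows_linear_progression(circuitous_articles, circuitous_citations))  # False
```

The check is deliberately strict: a single arc from an earlier-stage design to a later-stage one (here, the animal study citing the human observation) is enough to classify the path as non-linear.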

We hypothesize that differences in the observed flow of information (as captured by citations that are received or made among publications) through the various translational paths, and in the content of the propagated information, are associated with concordance or discordance in the results of observational studies and RCTs. For example, a limited evidence base and information flow may indicate inconsistency of study results and thus may be associated with topics where RCTs and observational studies disagree. Conversely, a large evidence base with higher information flow may indicate consistency of findings and general agreement between RCTs and observational studies. Of course, these are not one-to-one relationships; it is conceivable that profound inconsistencies and disagreements between studies could lead to considerable discourse among investigators, which in turn would increase information flow. We set out to empirically explore our hypothesis by analyzing and comparing characteristics of the citation networks for two nutrient–disease associations: one where the two research designs generally agree and one where they disagree.
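As a rough illustration of what "citations received or made" could look like in practice, the sketch below counts both quantities for each article and computes the density of a small citation network, meaning the fraction of possible directed citation links that are present. The function names, the article labels, and the use of density as a summary of information flow are assumptions made for this example, not measures specified in the text above.

```python
from collections import Counter


def citation_counts(citations):
    """Count citations made and received per article.

    `citations` is a list of (citing_id, cited_id) pairs.
    """
    made = Counter(citing for citing, _ in citations)
    received = Counter(cited for _, cited in citations)
    return made, received


def network_density(n_articles, citations):
    """Fraction of the n * (n - 1) possible directed links that exist."""
    if n_articles < 2:
        return 0.0
    return len(set(citations)) / (n_articles * (n_articles - 1))


# Hypothetical three-article network: B cites A; C cites both A and B.
example_citations = [("B", "A"), ("C", "A"), ("C", "B")]
made, received = citation_counts(example_citations)

print(made["C"], received["A"])               # 2 2
print(network_density(3, example_citations))  # 0.5
```

Comparing such summaries between a topic where designs agree and one where they disagree would be one simple way to contrast their citation networks, subject to the caveat already noted that heavy citation can also reflect controversy.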