NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Viswanathan M, Siega-Riz AM, Moos MK, et al. Outcomes of Maternal Weight Gain. Rockville (MD): Agency for Healthcare Research and Quality (US); 2008 May. (Evidence Reports/Technology Assessments, No. 168.)

  • This publication is provided for historical reference only and the information may be out of date.



Outcomes of Maternal Weight Gain.



In this chapter, we document the procedures that the RTI International-University of North Carolina Evidence-based Practice Center (RTI-UNC EPC) used to develop this comprehensive evidence report on outcomes of maternal weight gain. The team was led by a senior health services researcher (Meera Viswanathan, PhD, Study Director), a senior epidemiologist (Anna Maria Siega-Riz, PhD, RD, Scientific Director), and a senior nurse-researcher (Merry-K Moos, FNP, MPH, co-Scientific Director).

We first describe our strategy for identifying articles relevant to our five key questions (KQs), our inclusion and exclusion criteria, and the process we used to abstract relevant information from the eligible articles and generate our evidence tables. We also discuss our criteria for grading the quality of individual articles and for rating the strength of the evidence as a whole. Finally, we explain the peer-review process.

Literature Review Methods

Inclusion and Exclusion Criteria

Our inclusion and exclusion criteria are documented in Table 1. As noted in Chapter 1, this systematic review focuses on outcomes of maternal weight gain with respect to the 1990 recommendations from the Institute of Medicine (IOM).1 Largely for that reason, we limited our searches to articles published in 1990 and thereafter. We also restricted our searches to developed countries so that we could have data generally relevant for maternal weight gain and health outcomes in the United States.

Table 1. Inclusion/exclusion criteria for gestational weight gain.


We excluded studies that (1) were published in languages other than English (given the available time and resources); (2) did not report information pertinent to the key clinical questions; (3) had fewer than 40 subjects for randomized controlled trials (RCTs) or nonrandomized cohorts with comparisons or fewer than 100 subjects for case series; and (4) were not original studies.

For KQ 1, 2, 3, and 4, we required that the reported association between maternal weight gain and health outcomes accounted for prepregnancy body mass index (BMI) or weight, either through stratified univariate analysis or multivariate analysis.
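The screening rules described above can be sketched as a simple eligibility check. This is an illustration only, not a tool used in the review: the `Study` record, its field names, and the design labels are hypothetical, and Table 1 (not reproduced here) may contain criteria that this sketch omits.

```python
from dataclasses import dataclass

# Minimum sample sizes stated in the text; the design labels are assumed names.
MIN_SUBJECTS = {"rct": 40, "cohort": 40, "case_series": 100}


@dataclass
class Study:
    """Hypothetical record of one candidate article."""
    year: int
    language: str
    developed_country: bool
    original_study: bool
    design: str                      # "rct", "cohort", or "case_series"
    n_subjects: int
    key_questions: frozenset         # KQ numbers the study informs
    accounts_for_prepregnancy_bmi_or_weight: bool


def exclusion_reasons(study):
    """Return the list of reasons a study fails screening (empty if eligible)."""
    reasons = []
    if study.year < 1990:
        reasons.append("published before 1990")
    if study.language != "English":
        reasons.append("not published in English")
    if not study.developed_country:
        reasons.append("not conducted in a developed country")
    if not study.original_study:
        reasons.append("not an original study")
    if not study.key_questions:
        reasons.append("no information pertinent to the key questions")
    minimum = MIN_SUBJECTS.get(study.design)
    # Assumption: designs without a stated threshold are not screened on size.
    if minimum is not None and study.n_subjects < minimum:
        reasons.append(f"sample size below {minimum} for {study.design}")
    # Simplification: the BMI/weight requirement is flagged whenever the
    # study informs any of KQ 1 through 4.
    if study.key_questions & {1, 2, 3, 4} and not study.accounts_for_prepregnancy_bmi_or_weight:
        reasons.append("KQ 1-4 association does not account for prepregnancy BMI or weight")
    return reasons
```

A study with an empty list of reasons would pass to the next stage; returning every reason rather than a yes/no answer makes it easier to tabulate exclusions afterward.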

Literature Search and Retrieval Process

Databases. We applied multifaceted search strategies, designed to capture current and valid research on the KQs, to four standard electronic databases: MEDLINE®, Cochrane Collaboration resources, Cumulative Index to Nursing and Allied Health Literature (CINAHL), and Embase. We also hand-searched the reference lists of relevant articles to make sure that we did not miss any relevant studies. We consulted with our Technical Expert Panel (TEP) about any studies or trials that were under way or not yet published.

Search terms. Based on the inclusion/exclusion criteria above, we generated a list of Medical Subject Heading (MeSH) search terms (Table 2 and Appendix A).* Our TEP also reviewed these terms to ensure that we were not missing any critical areas, and this list represents our collective decisions as to the MeSH terms used for all searches.

Table 2. MEDLINE® search strategy and unduplicated results for February 2007.


Our searches in MEDLINE® produced 715 unduplicated records. Searches in other databases yielded 190 new records from CINAHL and 4 from Embase. Similar searches in Cochrane did not produce any new citations. Following an update on October 3, 2007, and additional searches for KQ 5, we ultimately identified 1,082 unduplicated records. In addition, peer reviews suggested 3 new citations that met our inclusion criteria.

Figure 2 presents the yield and results from our searches, which we conducted from February through October 3, 2007. Beginning with a yield of 1,085 articles, we retained 150 articles that we determined were relevant to address our KQs and met our inclusion/exclusion criteria (Table 1). We reviewed titles and abstracts of the articles against the basic inclusion criteria above; we retained relevant articles, all published after our search cutoff date of January 1990, and used them as appropriate in the discussion in Chapter 4.

Figure 2. Disposition of articles for gestational weight gain. Abbreviation: KQ: key question.

Article selection process. Once we had identified articles through the electronic database searches, review articles, and reference lists, we examined abstracts of articles to determine whether studies met our criteria. Each abstract was independently, dually reviewed for inclusion or exclusion, using an Abstract Review Form (Appendix B).* If one reviewer concluded that the article should be included in the review, we retained it.
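The dual-review rule above is inclusive: a single "include" vote is enough to carry an abstract forward. A minimal sketch follows, assuming each abstract has exactly two independent votes; the function name and the identifiers in the usage example are hypothetical.

```python
def retained_abstracts(votes):
    """Return IDs of abstracts kept after dual review.

    `votes` maps an abstract ID to a pair of booleans, one per reviewer,
    where True means "include". An abstract is retained if either
    reviewer voted to include it.
    """
    return [abstract_id for abstract_id, (first, second) in votes.items()
            if first or second]


# Usage with made-up identifiers: only A2 is dropped, because both
# reviewers voted to exclude it.
example_votes = {"A1": (True, False), "A2": (False, False), "A3": (True, True)}
print(retained_abstracts(example_votes))
```

Retaining on either vote favors sensitivity at the abstract stage and defers the stricter judgment to full-text review.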

Of this entire group of 1,085 articles, 479 required full review. For the full article review, one team member read each article and decided whether it met our inclusion criteria, using a Full Text Inclusion/Exclusion Form (Appendix B). Reasons for article exclusion are listed in Appendix D.
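The counts reported in this section can be checked with simple arithmetic. In the sketch below, the input figures come from the text; the differences are derived by subtraction and are not reported in the text, so Figure 2 may break them down differently. The variable names are illustrative.

```python
# Counts stated in the text.
medline, cinahl, embase, cochrane = 715, 190, 4, 0
after_update_and_kq5 = 1082      # unduplicated records after October 3, 2007
peer_review_additions = 3
full_text_reviewed = 479
retained = 150

# Derived values (not reported in the text).
initial_yield = medline + cinahl + embase + cochrane
added_later = after_update_and_kq5 - initial_yield
total_articles = after_update_and_kq5 + peer_review_additions
not_advanced_to_full_text = total_articles - full_text_reviewed
not_retained_after_full_text = full_text_reviewed - retained

print(initial_yield)                  # 909
print(added_later)                    # 173
print(total_articles)                 # 1085
print(not_advanced_to_full_text)      # 606
print(not_retained_after_full_text)   # 329
```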

Literature Synthesis

Development of Evidence Tables and Data Abstraction Process

The senior staff who conducted this systematic review jointly developed the evidence tables. We designed the tables to provide sufficient information to enable readers to understand the studies and to determine their quality; we gave particular emphasis to essential information related to our KQs. We based the format of our evidence tables on successful designs that we have used for prior systematic reviews.

We trained abstractors by having them abstract several articles into evidence tables and then reconvening as a group to discuss the utility of the table design. The abstractors repeated this process through several iterations until they decided that the tables included the appropriate categories for gathering the information contained in the articles.

Three junior epidemiologists (Sunni Mumford, SM; Andrea Deierlein, MS, MPH; and Julie K. Knaack, MPH, RD, LDN) shared the task of initially entering information into the evidence tables. Senior staff reviewed the articles and edited all initial table entries for accuracy, completeness, and consistency. Abstractors reconciled all disagreements concerning the information reported in the evidence tables. The full research team met regularly during the article abstraction period and discussed global issues related to the data abstraction process.

The final evidence tables are presented in their entirety in Appendix C. Studies are presented in the evidence tables alphabetically by the last name of the first author. A list of abbreviations and acronyms used in the tables appears at the beginning of that appendix.

Quality Rating of Individual Studies

The evidence for this systematic review is based almost entirely on observational studies. This fact presents a challenge for rating individual studies. Quality rating forms for RCTs have been validated and in use for several years; a similarly well-validated form for observational studies does not exist.

Thus, as a parallel effort, we developed a form to rate observational studies.35 This form, which can be used to rate the quality of a variety of observational studies, was based on a review of more than 90 AHRQ systematic reviews that included observational studies; we supplemented this review with other key articles identifying domains and scales.36,37 We structured the resultant form largely on the basis of the domains and subdomains suggested by Deeks and colleagues;36 we then adapted it for use in this systematic review (Appendix B).*

The form currently includes review of nine key domains: background, sample selection, specification of exposure, specification of outcome, soundness of information, followup, analysis comparability, analysis of outcome, and interpretation. Each of these domains was further evaluated on aspects of quality of the study design or reporting that would influence the reader's perception of internal validity of the journal article (Table 3). We note that variations in reporting could result in different scores for studies drawing from the same sample.

Table 3. Scoring algorithm for subdomains and overall quality rating for individual studies.


As described in Table 3, we combined these elements to generate overall scores. We set the default as fair and then focused on the threshold required for good and poor studies; the algorithm is also described in Table 3. Fair studies, therefore, include studies that were predominantly fair (four to nine fair ratings on domains) and could not be rated either good (fewer than five good ratings for subdomains) or poor (fewer than three poor ratings for subdomains). Studies with more than five good ratings for domains that also received one or two poor ratings were downgraded to fair quality.
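One possible reading of the overall rating rule can be sketched as follows. Because Table 3 is not reproduced here, the thresholds are assumptions inferred from the paragraph above: a study is poor with three or more poor ratings, good with at least five good ratings and no poor ratings, and fair otherwise. The text leaves the case of exactly five good ratings ambiguous, and it refers to both domains and subdomains; this sketch treats each of the nine domains as carrying a single rating.

```python
from collections import Counter

GOOD_MIN = 5   # assumed: at least five good ratings needed for "good"
POOR_MIN = 3   # assumed: three or more poor ratings yields "poor"
VALID_RATINGS = {"good", "fair", "poor"}


def overall_quality(domain_ratings):
    """Combine per-domain ratings into an overall rating, with fair as default."""
    counts = Counter(domain_ratings)
    unknown = set(counts) - VALID_RATINGS
    if unknown:
        raise ValueError(f"unrecognized ratings: {sorted(unknown)}")
    if counts["poor"] >= POOR_MIN:
        return "poor"
    # One or two poor ratings downgrade an otherwise good study to fair.
    if counts["good"] >= GOOD_MIN and counts["poor"] == 0:
        return "good"
    return "fair"
```

For example, under these assumptions a study with six good, two fair, and one poor rating would come out as fair, which matches the downgrade described in the text.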

Key methodological concerns in this literature relate to the source of information on weight gain and the timing of measurement of weight gain. Studies that relied solely on self-reported pregravid and final pregnancy weights suffer from well-documented issues of recall bias. In addition, women tend to misreport their weight, and this bias varies by weight status38 and ethnicity.39 The timing of weight measurement (for pregravid weight and final weight) can vary depending on the design of the study; when the timing is unreported, total weight gain during pregnancy cannot be assumed to have been measured at similar time points for all women within the study, which introduces further bias. Our rating algorithm, therefore, paid special attention to the source of data on gestational weight gain and the timing of measurement. Studies that relied solely on recalled prepregnancy and total pregnancy weight were rated poor on that domain, but if they defined their gestational weight variable clearly (providing details on the timing of measurement for pregravid and final weight measurements) and either checked for the biological plausibility of pregravid weight status or explained how outliers were dealt with, they could receive an overall fair rating (assuming that they received fewer than three poor ratings overall).

Strength of Available Evidence

Our scheme follows the criteria applied in an earlier RTI-UNC EPC systematic review of systems for rating the strength of a body of evidence.40 That system has three domains: quality of the research (as evaluated by the quality rating algorithm described above), quantity of studies (including number of studies and adequacy of the sample size), and consistency of findings. Two senior staff members assigned grades by consensus.

We graded the body of literature for each KQ and present those ratings as part of the discussion in Chapter 4. The possible grades in our scheme are as follows:


Strong: The evidence is from studies of sound design (good quality); results are both clinically important and consistent with minor exceptions at most; results are free from serious doubts about generalizability, bias, or flaws in research design. Studies with negative results have sufficiently large samples to have adequate statistical power.


Moderate: The evidence is from studies of sound design (good quality), but some uncertainty remains because of inconsistencies or concern about generalizability, bias, research design flaws, or adequate sample size. Alternatively, the evidence is consistent but derives from studies of weaker design (fair quality).


Weak: The evidence is from a limited number of studies of weaker design (fair or poor quality). Studies with strong design (good quality) either have not been done or are inconclusive.


No evidence: No published literature.

External Peer Review

As is customary for all evidence reports and systematic reviews done for AHRQ, the RTI-UNC EPC requested review of this report from a wide array of individual outside experts in the field, including our TEP, and from relevant professional societies and public organizations. AHRQ also requested review from its own staff. We sent 20 invitations for peer review: 6 TEP members, 6 relevant organizations, and 8 individual experts. Reviewers included clinicians (e.g., obstetrics and gynecology, women's health/general health), representatives of federal agencies, advocacy groups, and potential users of the report.

We charged peer reviewers with commenting on the content, structure, and format of the evidence report, providing additional relevant citations, and pointing out issues related to how we had conceptualized and defined the topic and KQs. We also asked them to complete a peer review checklist. We received comments from 11 of the invited peer reviewers in addition to comments from AHRQ staff. The individuals listed in Appendix E* gave us permission to acknowledge their review of the draft. We compiled all comments and addressed each one individually, revising the text as appropriate.



Appendixes and evidence tables cited in this report are provided electronically at http://www​​/pub/evidence​/pdf/admaternal/admaternalapp.pdf.

