
Myers ER, McCrory DC, Mills AA, et al. Effectiveness of Assisted Reproductive Technology. Rockville (MD): Agency for Healthcare Research and Quality (US); 2008 May. (Evidence Reports/Technology Assessments, No. 167.)

  • This publication is provided for historical reference only and the information may be out of date.


Effectiveness of Assisted Reproductive Technology.


6 Conclusions

Ovulation Induction without Assisted Conception (Question 2)

I. General Issues

Despite screening 181 full-text articles for eligibility, we are limited in our ability to draw conclusions about most of the topics discussed under Question 2. Several methodologic issues were consistently seen in our review.

First, there were relatively few randomized trials compared to the overall volume of literature. Although this is obviously a problem not limited to studies of ovulation induction, or reproductive medicine in general, there are several unique barriers to conducting appropriately designed studies in this field; these barriers are discussed in detail in the “Future Research” chapter, above.

Second, the majority of the studies do not provide data on live birth rates or other obstetric outcomes. Although there is ongoing debate about the most appropriate primary outcome for studies in infertility,539 live birth per couple is widely considered both methodologically and clinically appropriate and important. Although surrogate outcomes such as ovulation and pregnancy may require smaller sample sizes or shorter duration trials, the intuitively appealing link between surrogates and the ultimate outcome of live birth is not always borne out when ultimately tested.548 For example, increased ovulation rates with metformin compared to clomiphene have been observed in some randomized trials, but as discussed in the Results chapter, do not translate into increased live birth rates.

Third, the size of individual studies was almost universally too small to detect clinically important differences in pregnancy and live birth rates. Given that live birth is a dichotomous outcome, large sample sizes will be necessary; the largest study, the PPCOS study, enrolled over 200 women per arm to establish a 15 percent absolute difference in live birth rates. There does not appear to be consensus on what should be the minimal clinically important difference; given that there are frequently tradeoffs between live birth rate and the risk of multiple gestation or other complications, this difference may vary with different treatments in different patient populations. Again, this should be a high priority for future research, one which should ideally involve clinicians, policymakers, and patients, using rigorous methods for estimating preferences for different outcomes. One of the few studies to use standard methods for quantifying patient preferences found that women were willing to take on an increased risk of short-term complications and multiples in order to increase their absolute live birth rate by 5 percent,549 a difference which would require very large (> 1000 subjects) trials to determine.

A corollary of the sample size issue is that studies which do have sufficient power to detect differences in live birth rates are highly unlikely to have the power to detect clinically important differences in less common outcomes such as multiple gestation, pregnancy complications, and short-term complications of treatment such as OHSS. As others have pointed out,36 the lack of a statistically significant difference in an outcome is not the same as demonstration of equivalence, especially given that the confidence intervals for these less common outcomes are almost always quite wide. Studies specifically designed and powered to detect differences in other important clinical outcomes, or greater consensus on study design issues to reduce heterogeneity and improve the precision and reliability of meta-analytic methods, are needed.

One strength of the literature on ovulation induction and superovulation is that the majority of trials, especially more recent trials,550 involve randomization to a treatment arm and continued treatment on that arm for a specified period of time. This is important from both a statistical36 and clinical viewpoint, since most treatments are continued for several cycles. One goal of protocol design in clinical trials is to reflect clinical practice as much as possible. Study designs that randomize couples to a single treatment cycle of a treatment strategy generally do not reflect typical practice and may miss differences in cumulative rates of outcomes that are not detectable after a single cycle.

II. Ovulation Induction in Anovulatory Women

Based on our review, there are several aspects of interventions for ovulation induction in women with PCOS for which there is either strong evidence, promising evidence from single studies worth confirming with additional trials, or evidence of short-term benefit needing confirmation of long-term safety.

Clomiphene is an effective first-line therapy for women with PCOS. Metformin is, at best, no more effective, and, based on a large multi-center trial, less effective than clomiphene alone. Potential explanations for the disparity between the findings of the two randomized trials published to date, such as genetic variability in responses to the different agents, are worth further investigation. The effect of both drugs on spontaneous abortion rates should be investigated in properly designed trials.

Although a statistically significant effect is not observed in individual studies, meta-analyses do demonstrate a significant increase in pregnancy rates in clomiphene-resistant women treated with metformin. Whether these results translate into improved live birth rates should be confirmed in larger studies, although the lower overall birth rate in this population will require large studies.

Pre-treatment with oral contraceptives, co-treatment with n-acetyl-cysteine, and co-treatment with dexamethasone all resulted in large and statistically significant increases in pregnancy rates in combination with clomiphene in clomiphene-resistant anovulatory women, along with increased multiple gestation rates. These findings warrant further investigation, particularly if multiple gestations can be avoided.

Use of laparoscopic cautery, followed by ovulation induction if necessary, results in similar pregnancy and live birth rates, with significantly lower multiple gestation rates, compared to immediate gonadotropin use in clomiphene-resistant women. The addition of metformin may result in further improvements in pregnancy and live birth rates. There are no data on the long-term sequelae of laparoscopic ovarian cautery, and long-term followup studies to assess the risk of pelvic adhesions, premature ovarian failure, or early menopause are warranted.

III. Superovulation in Ovulatory Women

The available literature does not allow any conclusions about the relative efficacy of different estrogen inhibitors, although 5 mg of letrozole appears to be superior to 2.5 mg. Pooled data show significantly higher pregnancy rates with gonadotropins compared to estrogen inhibitors, but data are too limited to draw conclusions about live birth rates. There is a trend towards higher rates of multiple pregnancy and OHSS with gonadotropins compared to estrogen inhibitors, but the small number of events, even in pooled studies, prevents definite conclusions.

There do not appear to be substantial differences in pregnancy rates between different gonadotropin preparations. Higher doses increase the risk of multiples and OHSS without significant improvement in pregnancy rates. The addition of GnRH antagonists to superovulation protocols may increase both pregnancy rates and twin gestation rates. Further studies adequately powered for the outcome of live birth per couple are needed.

Hysteroscopic resection of endometrial polyps noted on ultrasound prior to IUI increases pregnancy rates.

Assisted Conception: IVF and ICSI (Question 3)

I. General Issues

There are several consistent issues with the majority of studies reviewed for Question 3, many of which are shared with trials of ovulation induction and superovulation and most of which have been identified by other authors,36,538,550 including variation in definition of endpoints, especially related to pregnancy, lack of concealment of treatment allocation, and lack of blinding where it is feasible. Three issues deserve particular attention.

Sample size is a recurrent problem. Very few of the studies reviewed for this Question had a priori sample sizes for pregnancy or live birth - most used surrogate markers, such as number of oocytes retrieved in a given cycle. Given a baseline live birth rate per cycle of IVF in the United States of 34 percent,10 an alpha of 0.05, and a power of 80 percent, approximately 1100 subjects would be needed per arm to demonstrate a 5 percent absolute improvement in live birth rates, 320 to show a difference of 10 percent, and 135 to show a difference of 15 percent. Only two of the 237 articles included under Question 3 had more than 300 subjects per arm. On the other hand, failure to detect a significant difference is not the same as demonstrating equivalence or non-inferiority - equivalence studies generally are designed so that the lower 95 percent bound of the new intervention is within some pre-specified level, and, as a rule, require more subjects than superiority studies. For example, if the point estimates for live birth rates of two different arms in a study were 34 percent and 39 percent, a sample size of 1200 subjects per arm would be required to conclude that the second intervention was no more than 5 percent worse than the first; 390 subjects per arm would be required to conclude that there was no more than a 10 percent difference. Very few of the studies we identified had adequate power to declare equivalence or non-inferiority. Even one of the largest studies, a trial of double embryo transfer versus single embryo transfer followed by frozen-thawed transfer with 330 subjects per arm,365 which was explicitly designed and powered as an equivalence study, was unable to demonstrate that the lower bound of the difference between the two interventions was not more than 10 percent.
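The superiority sample sizes quoted above can be approximated with a standard two-proportion calculation. The following is a minimal sketch, not the method used in this report: the report does not state its formula, the sidedness of the test, or whether a continuity correction was applied, so the function name `subjects_per_arm` and the unpooled normal approximation are assumptions made for illustration. The output will be of the same order as the quoted figures but need not match them exactly.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_arm(p_control, p_treatment, alpha=0.05, power=0.80,
                     two_sided=True):
    """Approximate subjects per arm to detect p_control vs p_treatment.

    Assumption: unpooled normal approximation for two independent
    proportions, equal allocation, no continuity correction.
    """
    if p_control == p_treatment:
        raise ValueError("the two proportions must differ")
    std_normal = NormalDist()
    # Critical value for the chosen significance level.
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    # Quantile corresponding to the desired power.
    z_beta = std_normal.inv_cdf(power)
    # Sum of the binomial variances of the two arms.
    variance = (p_control * (1 - p_control)
                + p_treatment * (1 - p_treatment))
    effect = p_treatment - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


if __name__ == "__main__":
    baseline = 0.34  # per-cycle live birth rate cited in the text
    for improvement in (0.05, 0.10, 0.15):
        n_two = subjects_per_arm(baseline, baseline + improvement)
        n_one = subjects_per_arm(baseline, baseline + improvement,
                                 two_sided=False)
        print(f"+{improvement:.0%}: two-sided {n_two}, one-sided {n_one}")
```

Whatever the exact convention, the required sample size grows roughly with the inverse square of the absolute difference, which is why halving the detectable difference from 10 percent to 5 percent roughly quadruples the number of subjects needed.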

A second, related issue is the inferences frequently drawn by study authors about relative safety. If almost none of the studies had the power to detect an absolute difference of 10 percent (or, at a baseline of 34 percent, a relative risk of 1.29) for a live birth outcome, the power to detect differences in outcomes that are a fraction of live births, such as multiple pregnancies or complications such as OHSS, is even lower. For the most part, it is almost impossible to estimate relative safety based on single trials.
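The relative risk quoted above follows directly from the baseline rate and the absolute difference. As a worked check, using the symbols \(p_{\text{baseline}}\) and \(\Delta\) (introduced here for illustration only) for the 34 percent baseline and the 10 percent absolute difference:

```latex
\[
\mathrm{RR}
  = \frac{p_{\text{baseline}} + \Delta}{p_{\text{baseline}}}
  = \frac{0.34 + 0.10}{0.34}
  = \frac{0.44}{0.34}
  \approx 1.29
\]
```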

Another issue relates to the duration of the intervention. The vast majority of the studies reviewed randomized subjects to only a single cycle of the interventions being investigated. Although this facilitates translating results most frequently reported on a per-cycle basis to a per-subject basis, it may not reflect the clinical scenario likely to be most relevant. If an intervention would be used clinically in subsequent cycles if a pregnancy does not result, then, ideally, the intervention should be continued in the same couple for some pre-specified amount of time or number of cycles in trials of that intervention. Alternatively, if embryos are cryopreserved for use in subsequent cycles, the results of those frozen-thawed transfers should be included in the reported cumulative rates. Cumulative results were much more common in studies of ovulation induction compared to IVF.

II. The Female Partner

A. Methods for down-regulation. Despite the issues described immediately above, there is reasonable evidence regarding certain aspects of IVF/ICSI.

We did not identify clear evidence of the superiority of any specific protocol involving GnRH agonists. In the setting of endometrial preparation for frozen-thawed embryo transfer, two relatively large studies had conflicting results regarding the benefit of adding an agonist; further research is needed.

Although only one individual study comparing GnRH agonists to antagonists found a significant difference in pregnancy or live birth rates (in favor of agonists), formal meta-analysis shows a significantly lower pregnancy and live birth rate with the use of antagonists; antagonists do result in significant decreases in gonadotropin requirements, and a significant decrease in the risk of OHSS.

Pretreatment with an oral contraceptive to assist with scheduling GnRH antagonist cycles resulted in decreases in pregnancy rates in all three identified studies; this reduction was statistically significant in one.

B. Methods for ovarian stimulation. Again, most individual studies were underpowered. Pooled results of individual trials suggest that hMG is superior to rFSH in long protocol GnRH agonist regimens, with higher multiple pregnancy rates, and that the addition of rLH to rFSH improves live birth rates in poor responders. Based on differences in the amount of gonadotropin required, there may be economic advantages to some formulations, but formal economic evaluations ultimately will require more precise estimates of effect.

C. Methods to trigger oocyte maturation. Timing of hCG administration for follicular maturation is important for optimizing live birth rates - delays of 48 hours after one ultrasound threshold (at least 3 follicles of at least 17 mm) resulted in significant decreases in live births. The optimal timing and threshold have not been determined. There does not appear to be any difference in pregnancy or live birth rates, or other major outcomes, between rhCG and uhCG, although injection site reactions are more common with uhCG. In cycles using a GnRH antagonist for pituitary down-regulation, use of hCG is superior to use of a GnRH agonist.

D. Methods for oocyte retrieval. Choice of analgesia for oocyte retrieval does not appear to affect pregnancy rates. Variability in outcome measures makes between-study comparisons difficult regarding specific techniques. Techniques involving some form of sedation result in lower intraoperative pain, but this does not appear to adversely affect overall patient perceptions and satisfaction.

E. Methods for endometrial preparation for frozen-thawed embryo transfer. There is insufficient evidence to determine the optimal method for endometrial preparation for frozen-thawed embryo transfer.

F. Methods for embryo transfer. Pre-transfer irrigation does not improve pregnancy or live birth rate and, based on an intent-to-treat analysis of the one study identified, significantly reduces both rates. There is no evidence that type of provider changes outcomes. Although pre-treatment with antibiotics significantly lowers measurable bacterial contamination, this does not translate into improved pregnancy or live birth rates.

Ultrasound-guided embryo transfer consistently results in substantially improved (40 percent relative increase) pregnancy and live birth rates compared to various “clinical touch” methods. The consistency of this finding and the size of the effect are striking considering that the majority of interventions evaluated in this review do not show significant differences.

G. Methods for luteal support. Some form of luteal support is necessary with IVF, since both progesterone and hCG result in improved pregnancy rates compared to no treatment. Although there is no detectable difference between oral progesterone and the various formulations of vaginal progesterone, both result in lower pregnancy and live birth rates compared to intramuscular progesterone. The addition of estrogen to progesterone may improve outcomes, although additional larger studies are needed to confirm these findings. Finally, adding stimulation with a GnRH agonist to progesterone and estrogen in patients down-regulated with a GnRH antagonist improves live birth rates.

H. Other adjuncts. Based on the available evidence, vasoactive agents such as nitroglycerin, beta-agonists, or l-arginine do not improve pregnancy or live birth rates in either first-time or poor prognosis IVF patients. Low-dose aspirin also does not appear to have any effect. The NSAID piroxicam significantly improved pregnancy and live birth rates in a general IVF population, and further studies of NSAIDs are warranted. Randomized trials of intercessory prayer and acupuncture showed benefit, but there are remaining methodological questions which need to be addressed.

Dexamethasone and growth hormone both improved pregnancy and live births in women over 40 undergoing IVF; the growth hormone findings are consistent with earlier studies showing a benefit in poor responders. Metformin reduced the incidence of OHSS and showed evidence of improvement in pregnancy and live birth rates in women with PCOS undergoing IVF. In women with endometriosis, pre-ART surgical management does not improve outcomes, but pretreatment with a GnRH agonist for several months prior to IVF improves pregnancy and live birth rates. Other surgical interventions shown to improve outcomes are hysteroscopic removal of endometrial lesions and surgical removal or occlusion of hydrosalpinges.

I. Methods for prevention of OHSS. One study published since the most recent Cochrane review found no benefit for intravenous albumin in preventing OHSS, in contrast to previous studies and the Cochrane review. This may be due to the low event rate observed in this study.

III. The Embryo

A. Methods for fertilization. IVF results in much higher birth rates within 90 days than watchful waiting in eligible patients, although cumulative pregnancy rates were similar in one trial comparing IVF to IUI and stimulated IUI. There is no evidence of benefit for ICSI compared to IVF in patients with non-male factor infertility. Technical aspects of the fertilization procedure, such as media and equipment used, may have significant impact on outcomes.

B. Culture methods. There is insufficient evidence to draw any inferences regarding the effect of culture media on pregnancy or live birth rates.

C. Methods for selection. The addition of a zygote cleavage score to embryo quality scoring based on morphology did not result in improved pregnancy or live birth rates. Preimplantation genetic screening resulted in lower overall pregnancy and live birth rates in women 37 and older.

D. Preparation for transfer. Assisted hatching improves pregnancy and live birth rates in couples with previous IVF failure, but there is insufficient evidence to draw inferences about benefits in other groups.

E. Timing of transfer. The available evidence suggests that zygote transfer is, at best, no better than day 3 transfer and may result in worse pregnancy and live birth rates. Transfer on day 2 may produce better outcomes compared to day 3 in women with poor ovarian response, based on one large trial; pooled meta-analysis results suggest better pregnancy rates, but not necessarily live birth rates, in cycles where ICSI is used. Finally, blastocyst transfer results in better live birth rates than day 3 transfer, especially in patients with a good prognosis. The disadvantage of delaying transfer is a reduction in the number of embryos available for transfer and for cryopreservation, and the increased risk of monozygotic twinning.551

F. Number to transfer. Although double embryo transfer results in higher pregnancy and live birth rates compared to single embryo transfer, multiple rates - almost all twins - are consistently higher. Strategies involving alternative methods for pituitary down-regulation, or involving multiple cycles with fewer embryo transfers per cycle, appear to result in similar live birth rates with fewer multiples.

Longer Term Outcomes (Question 4)

I. General Issues

Our review of the current evidence on fetal and maternal outcome raises several important issues which need to be considered in interpreting the existing literature, and in planning future research.

A. Study design. First, although we found several consistent associations that should be considered by patients, clinicians, and policymakers in making decisions about various aspects of infertility, it is important to remember that the overwhelming majority of the literature consists of observational studies. The most common design was a modified cohort study, where all of the women exposed to a particular treatment were compared to a sample, either random or matched for known confounders, and the incidence of the outcomes compared. We also identified several population-based cohort studies, where all infertility patients were compared to all other pregnant women and their infants in a given geographic area. Case-control studies, in which all of the subjects with a given outcome are selected along with a matched or unmatched sample of subjects without the outcome, were much less common, and were, appropriately, primarily used for less common outcomes, such as cancer and specific congenital abnormalities. Although these study designs are valid and well-established tools for epidemiologic research, it is important to remember the strong potential for unmeasured confounding, especially when examining the association between a clinical treatment and the outcomes of interest. All of the reasons for using caution when interpreting the results of observational studies reporting clinical benefits apply to observational studies of adverse outcomes. Ideally, data from randomized trials would be used, but, given the relative rarity of many important outcomes relative to the number of women treated or number of children, and the consistently small sample size chosen for most randomized trials in this field, pooling of data is likely to be required.

B. Appropriate controls. For many of the outcomes discussed under this Question, any association between a specific treatment and that outcome could be either a true causal association, or an association between the underlying reason for the treatment and the subsequent outcome. In many cases, associations that were significant when infertility patients were compared to the general population weakened quantitatively when other infertility patients, or women with a prolonged time to conception, were used as controls. Although identifying such women may be difficult in many situations, failure to consider the appropriateness of the control group could easily lead to misinterpretation of study results.

C. The “moving target.” In a field where there are few barriers to rapid change in practice, it is highly likely that in many cases even the best study of a long-term outcome may have little benefit for current clinical practice. This is certainly true of outcomes likely to occur 10 or more years after treatment, such as cancers, but may well be true of shorter time intervals as well. Changes in indications, in the types of patients considered appropriate or inappropriate for a given treatment, and changes in aspects of the treatment itself that might affect these outcomes can render results irrelevant for current patients. For outcomes such as cancer, information can still be helpful if it helps target preventive efforts; however, for many shorter-term outcomes, particularly those related to pregnancy and early childhood, even very strong and consistent associations may be due to factors which are no longer present.

D. Generalizability to the United States. The majority of studies we identified were performed outside the United States. The extent to which differences among infertility patients in factors such as race/ethnicity, socioeconomic status, and education affect observed associations is unclear.

With these caveats, we will summarize the results of the review for this Question.

II. Short-term Fetal Outcomes

A. Spontaneous abortion. Spontaneous abortion, defined as loss of the entire pregnancy (rather than loss of one or more fetuses with survival of at least one fetus), does not appear to be more common after assisted reproduction after adjusting for known risks; observed differences between different methods appear to be related to differences in the patient population to which the methods are applied. Loss of the entire pregnancy is less common for twins than for singletons after multiple embryo transfer; this is the first of many outcomes we reviewed where the relative risk estimate for a given outcome was consistently higher when the comparison was between IVF singletons and spontaneous singletons, rather than IVF twins and spontaneous twins.

B. Ectopic pregnancy. Similarly, although ectopic pregnancy is more common after assisted reproduction than after spontaneous conception, and variations are observed between different methods of ART, most of the difference in risk appears to be attributable to characteristics of the mother and/or embryo rather than to specific procedures.

C. Maternal screening for fetal chromosomal abnormalities. The best available evidence suggests that false positive results for maternal testing for chromosomal abnormalities after assisted reproduction are more likely for second trimester serum screening, resulting in an increased false positive rate with combined screening strategies that incorporate both modalities. Although some of this increased risk appears to be due to differences in the distribution of maternal age, the risk remained elevated in one large study even after adjustment. Further research is needed to determine the clinical implications of this finding.

D. Preterm delivery. Preterm delivery is approximately twice as likely in women pregnant with singleton pregnancies after infertility treatment compared to spontaneous singleton pregnancies. The evidence is most consistent for ART, but the risk was similar in a large study of women pregnant after ovulation induction alone. The proportion of these deliveries that is due to early delivery indicated by maternal or fetal complications versus spontaneous preterm delivery is unclear, as is the potential benefit of preventive strategies such as progesterone in this population. Conversely, in the majority of studies, the risk of preterm birth in IVF twins compared to spontaneous twins is either not elevated, or elevated to a lesser degree compared to the risk seen in ART singletons compared to spontaneous singletons. However, even though the relative risk of preterm delivery is lower for ART twins compared to spontaneous twins, the higher baseline risk for preterm delivery among twins means that the absolute number of preterm twin deliveries in ART pregnancies is large.

E. Low birth weight. Much of the elevated risk of low birth weight is due to the increased risk of preterm birth. However, studies that examined gestational age-specific weights found an increased risk of small-for-gestational age infants among singleton, but not twin, pregnancies after infertility treatment.

III. Maternal Pregnancy Outcomes

Women pregnant after infertility treatment are at increased risk for disorders potentially related to abnormal implantation, including preeclampsia, placenta previa, and placental abruption. The extent to which specific treatments or underlying maternal/embryonic characteristics contribute to this risk is unclear. Gestational diabetes risk may also be increased, although this association is weaker and less consistent. Finally, although data on psychological outcomes during pregnancy are quite limited, the data that are available suggest that women pregnant after infertility treatment have outcomes as good as, and perhaps better than, women pregnant after spontaneous conception.

The consistent association between fetal and maternal outcomes related to implantation is biologically plausible and is a promising area for future research.

IV. Infant Outcomes - Birth to 1 Year

A. Congenital anomalies. Risks for major congenital anomalies are increased after infertility treatment, but much of this risk appears to be related to maternal and/or paternal characteristics, including a history of subfertility or infertility. Given the relative rarity of specific birth defects or syndromes, identifying an association between a specific exposure and subsequent risk is difficult.

B. Other outcomes. In the neonatal period, although there is evidence of an increased risk for adverse outcomes (including cerebral palsy), especially among singletons, it is unclear to what extent this is due to the observed increased preterm delivery rate after ART (a major risk factor for many adverse outcomes), or is instead independently associated with infertility and/or infertility treatment. Large-scale studies that control for gestational age and birth weight are needed. In later infancy, there is a significantly increased hospitalization rate among children born after IVF/ICSI compared to the general population, but rates are similar when compared to children born to couples with a history of treated and untreated subfertility.

V. Child Outcomes - Beyond 1 Year

A. Physical outcomes. Children born after assisted reproduction have an increased risk of hospitalization and surgery compared to general population controls. At least some of this risk is likely related to the underlying condition causing infertility, rather than to the treatment itself. Finally, there does not appear to be an increased risk of childhood cancers in children of women who received infertility treatments.

B. Neurodevelopmental outcomes. The available evidence suggests that there is not an increase in the risk of adverse neurodevelopmental outcomes in children born after infertility treatment that is not associated with the underlying condition of infertility or the well-established increased risk of prematurity and SGA. The available evidence on learning and other developmental outcomes is reassuring, but larger studies across a wider population are needed.

VI. Maternal Long-Term Outcomes

A. Cancers. In general, infertility treatments involving ovarian stimulation do not appear to be associated with an increased risk of breast cancer, although non-significantly elevated risks were seen 20 years after exposure in one study, suggesting that continued monitoring is warranted. Ovarian cancers are even more strongly associated with an infertility diagnosis than breast cancer; use of ovulation stimulating drugs does not appear to increase the risk above baseline levels in this patient population. As with breast cancer, increasing risk with increased duration of treatment cannot be ruled out with confidence. There is no available evidence suggesting an increased risk of other cancers with either infertility or infertility treatment. Available data on the incidence of preinvasive and invasive cervical cancer are consistent with increased detection as part of the infertility evaluation.

B. Other outcomes. Based on the available literature, there are no differences in psychological outcomes, including parenting skills, when comparing singleton pregnancies resulting from ART to spontaneous conceptions. If anything, mothers of infants resulting from ART have better outcomes, although there is some evidence that fathers may do worse on some scales. Multiple gestations significantly increase stress and depressive symptoms, especially for mothers of infants with chronic disabilities; to the extent that women undergoing ART are more likely to experience multiples, especially preterm multiples, they are more likely to experience these symptoms. Clearly, further research is needed.
