NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Accuracy of Data Extraction of Non-English Language Trials with Google Translate

Methods Research Reports

Investigators: Balk EM, MD, MPH; Chung M, PhD, MPH; Hadar N, MS; Patel K, MPH, MBA; Yu WW, PhD, RD; Trikalinos TA, MD, PhD; and Chang L, BS.

Rockville (MD): Agency for Healthcare Research and Quality (US); April 2012.
Report No.: 12-EHC056-EF

Structured Abstract


Systematic review prides itself on inclusion of all relevant evidence. However, study eligibility is often restricted to English language for practical reasons. Google Translate, a free Web-based resource for translation, has recently become available. However, it is unknown whether its translation accuracy is sufficient for Evidence-based Practice Center (EPC) systematic reviews. Therefore, we formally evaluated the accuracy of Google Translate for the purpose of data extraction of non-English language articles.


We retrieved 10 randomized controlled trials (RCTs) in each of eight languages (Chinese, French, German, Italian, Japanese, Korean, Portuguese, and Spanish) and eight observational studies in Hebrew. Eligible studies were RCTs that reported per-treatment group results data (except for Hebrew language studies, where no RCTs were identified). Each article was translated into English using Google Translate. The time required to translate each study was tracked. Data from the original language versions of the articles were extracted by one of 10 fluent speakers who were current or former members of our EPC. The English translated versions of the articles were extracted by one of five current EPC researchers who did not speak the given language. These five researchers also performed double data extraction on 10 English language RCTs. Data extracted included eligibility criteria, treatment description, study descriptors, quality issues, outcome description, and results. Extractors were also asked to estimate how much extra time extraction required compared with a similar English language article. For each study, pairs of data extractions were compared for agreement on each extracted item. We analyzed the percent agreement within sets of studies in each language for each extraction item and for groups of extraction items. We defined “high agreement” as at least 80 percent agreement within an item or article. The degree of agreement for each language was compared with that of the English language study comparisons using nonparametric tests.
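As a rough illustration of the agreement measure described above, the following Python sketch computes percent agreement per extraction item within one language and flags items that reach the 80 percent "high agreement" threshold. It is not the investigators' analysis code: the function names, the data layout (one True/False match flag per article for each item), and the example values are assumptions made purely for illustration, and the example data are not results from the report.

```python
from typing import Dict, List

# Threshold taken from the definition of "high agreement" in the text.
HIGH_AGREEMENT_THRESHOLD = 80.0  # percent


def percent_agreement(matches: List[bool]) -> float:
    """Percent of paired extractions that agreed for one item."""
    if not matches:
        raise ValueError("no comparisons supplied")
    return 100.0 * sum(matches) / len(matches)


def summarize_language(item_matches: Dict[str, List[bool]]) -> Dict[str, object]:
    """Summarize agreement across extraction items for one language.

    item_matches maps an item name to one flag per article:
    True if the original-language and translated extractions agreed.
    """
    per_item = {item: percent_agreement(m) for item, m in item_matches.items()}
    high = [item for item, pct in per_item.items()
            if pct >= HIGH_AGREEMENT_THRESHOLD]
    return {
        "per_item": per_item,
        "high_agreement_items": high,
        "share_high": 100.0 * len(high) / len(per_item),
    }


# Hypothetical data for 10 articles in one language (illustrative only).
example = {
    "eligibility_criteria": [True] * 9 + [False],       # 9 of 10 agree
    "treatment_description": [True] * 8 + [False] * 2,  # 8 of 10 agree
    "results": [True] * 5 + [False] * 5,                # 5 of 10 agree
}
summary = summarize_language(example)
print(summary["per_item"])
print(summary["high_agreement_items"])
```

In this made-up example, two of the three items meet the threshold, so the share of items with high agreement is about 67 percent. The same per-language shares could then be compared with the English language baseline.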


The length of time required to translate articles ranged from seconds (51 articles, 58 percent) to about 1 hour. Assessment by the English language data extractors indicated that “a little” extra time was required for 40 articles (45 percent) and “a lot” for 42 (48 percent). When evaluating all extraction items together, Portuguese and German articles had the best agreement between original and translated extractions, with high agreement between extractors among about 60 percent of the items, compared with 80 percent in English articles. Spanish, Hebrew, and Chinese had the lowest agreement (30 percent, 24 percent, and 8 percent, respectively). The absolute agreement and the proportion of items with high agreement were statistically significantly worse for all languages, compared with English. Eight of 10 English language articles had high agreement for all items, compared with 7 of 10 Portuguese articles; 6 of 10 German articles; 4 of 10 each for French, Italian, and Korean articles; 3 of 8 Hebrew articles; 3 of 10 each for Japanese and Spanish articles; and no Chinese articles.


Translation was not always possible, but generally required few resources. Across all languages, data extraction from translated articles was less accurate than from English language articles. Accurate extraction was possible for some articles in all languages except Chinese, with Portuguese and German articles yielding the most accurate extractions. Google Translate may help reduce language bias; however, reviewers may need to be more cautious about using data from these translated articles.

Prepared for: Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services, Contract No. 290-2007-10055 I
Prepared by: Tufts Evidence-based Practice Center, Tufts Medical Center, Boston, MA

Suggested citation:

Balk EM, Chung M, Hadar N, Patel K, Yu WW, Trikalinos TA, Chang L. Accuracy of Data Extraction of Non-English Language Trials With Google Translate. Methods Research Report. (Prepared by the Tufts Evidence-based Practice Center under Contract No. 290-2007-10055 I.) AHRQ Publication No. 12-EHC056-EF. Rockville, MD: Agency for Healthcare Research and Quality. April 2012.

This report is based on research conducted by the Tufts Evidence-based Practice Center (EPC) under contract to the Agency for Healthcare Research and Quality (AHRQ), Rockville, MD (Contract No. 290-2007-10055 I). The findings and conclusions in this document are those of the author(s), who are responsible for its content, and do not necessarily represent the views of AHRQ. No statement in this report should be construed as an official position of AHRQ or of the U.S. Department of Health and Human Services.

The information in this report is intended to help health care decisionmakers—patients and clinicians, health system leaders, and policymakers, among others—make well-informed decisions and thereby improve the quality of health care services. This report is not intended to be a substitute for the application of clinical judgment. Anyone who makes decisions concerning the provision of clinical care should consider this report in the same way as any medical reference and in conjunction with all other pertinent information, i.e., in the context of available resources and circumstances presented by individual patients.

This report may be used, in whole or in part, as the basis for development of clinical practice guidelines and other quality enhancement tools, or as a basis for reimbursement and coverage policies. AHRQ or U.S. Department of Health and Human Services endorsement of such derivative products may not be stated or implied.

The investigators have no relevant financial interests in the report. The investigators have no employment, consultancies, honoraria, or stock ownership or options, or royalties from any organization or entity with a financial interest or financial conflict with the subject matter discussed in the report.


540 Gaither Road, Rockville, MD 20850; www

Bookshelf ID: NBK95238
PMID: 22624170

