Nat Biotechnol. 2018 Jul;36(6):547-551. doi: 10.1038/nbt.4108. Epub 2018 May 7.

Secure genome-wide association analysis using multiparty computation.

Author information

1. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.
2. Department of Computer Science, Stanford University, Stanford, California, USA.
3. Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.

Abstract

Most sequenced genomes are currently stored in strict access-controlled repositories. Free access to these data could improve the power of genome-wide association studies (GWAS) to identify disease-causing genetic variants and aid the discovery of new drug targets. However, concerns over genetic data privacy may deter individuals from contributing their genomes to scientific studies and could prevent researchers from sharing data with the scientific community. Although cryptographic techniques for secure data analysis exist, none scales to computationally intensive analyses, such as GWAS. Here we describe a protocol for large-scale genome-wide analysis that facilitates quality control and population stratification correction in 9K, 13K, and 23K individuals while maintaining the confidentiality of underlying genotypes and phenotypes. We show the protocol could feasibly scale to a million individuals. This approach may help to make currently restricted data available to the scientific community and could potentially enable secure genome crowdsourcing, allowing individuals to contribute their genomes to a study without compromising their privacy.
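
As a rough illustration of the general idea behind multiparty computation, the sketch below uses additive secret sharing to total minor-allele counts across individuals without any single computing party seeing a genotype. It is not the protocol described in the paper, which also handles quality control and population stratification correction; the modulus, the function names, and the three-party setup are all assumptions made for this example.

```python
import secrets

# Field modulus for the shares (an assumption for illustration only).
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split an integer into n additive shares modulo PRIME.

    Any subset of fewer than n shares is uniformly random and
    reveals nothing about the value.
    """
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recover the secret by summing all shares modulo PRIME."""
    return sum(shares) % PRIME


def secure_allele_count(genotypes, n_parties=3):
    """Sum genotypes (0, 1, or 2 minor-allele copies) over individuals.

    Each individual sends one share to each party. Parties add the
    shares they hold locally, and only these aggregated shares are
    combined, so only the total is revealed.
    """
    per_party = [0] * n_parties
    for genotype in genotypes:
        for i, s in enumerate(share(genotype, n_parties)):
            per_party[i] = (per_party[i] + s) % PRIME
    return reconstruct(per_party)


if __name__ == "__main__":
    cohort = [0, 1, 2, 1, 0, 2]  # hypothetical genotypes at one variant
    print(secure_allele_count(cohort))
```

Addition is the easy case because shares can be summed locally. Association statistics also require multiplications and divisions on shared values, which need interaction between parties, and that is where the scaling difficulty noted in the abstract arises.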

PMID: 29734293
PMCID: PMC5990440
DOI: 10.1038/nbt.4108
[Indexed for MEDLINE]
Free PMC Article
