NPJ Genom Med. 2017 Oct 27;2:33. doi: 10.1038/s41525-017-0036-1. eCollection 2017.

A community effort to protect genomic data sharing, collaboration and outsourcing.

Author information

UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, CA 92093 USA.
Computer Science and Informatics, Indiana University, Bloomington, IN 47408 USA.
GeneCloud, Intertrust, Sunnyvale, CA 94085 USA.
Centre of Genomics and Policy, Department of Human Genetics, McGill University, Montreal, QC H3A 0G4 Canada.
School of Law, University of San Diego, San Diego, CA 92110 USA.
Cryptography Group, Microsoft Research, San Diego, CA 92122 USA.
Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, TN 37203 USA.
National Human Genome Research Institute, Rockville, MD 20894 USA.
The J. Craig Venter Institute, La Jolla, CA 92093 USA.


The human genome can reveal sensitive information and is potentially re-identifiable, which raises privacy and security concerns about sharing such data on a wide scale. In 2016, we organized the third Critical Assessment of Data Privacy and Protection competition as a community effort to bring together biomedical informaticists, computer privacy and security researchers, and scholars in ethical, legal, and social implications (ELSI) to assess the latest advances in privacy-preserving techniques for protecting human genomic data. Teams were asked to develop novel protection methods for emerging genome privacy challenges in three scenarios: Track (1) data sharing through the Beacon service of the Global Alliance for Genomics and Health; Track (2) collaborative discovery of similar genomes between two institutions; and Track (3) data outsourcing to public cloud services. The latter two tracks continued themes from our 2015 competition, while the first was new and responded to a recently established vulnerability. The winning strategy for Track 1 mitigated the privacy risk by hiding approximately 11% of the variation in the database while permitting around 160,000 queries, a significant improvement over the baseline. The winning strategies in Tracks 2 and 3 showed significant progress over the previous competition, improving computational runtime and memory requirements by multiple orders of magnitude. The outcomes suggest that applying highly optimized privacy-preserving and secure computation techniques to safeguard genomic data sharing and analysis is useful. However, the results also indicate that further efforts are needed to refine these techniques into practical solutions.
