Accessing and using data in ClinVar


Disclaimer and data use policy

The information on this website is not intended for direct diagnostic use or medical decision-making without review by a genetics professional. Individuals should not change their health behavior solely on the basis of information contained on this website. NIH does not independently verify the submitted information. If you have questions about the information contained on this website, please see a health care professional. More information about NCBI's disclaimer policy is available.

If you distribute or copy data from ClinVar, we ask that you provide attribution to ClinVar as a data source in publications and websites.

  • You can cite one of the ClinVar publications (e.g. PMID: 29165669).
  • Websites should display the "Powered by NCBI" logo.
Methods to access the data

    Web access

    The data in ClinVar's web site are updated weekly, on Mondays.  Because the comprehensive data extractions occur monthly, it is possible for data to be visible on the web that are not included in the full extracts.

    There are multiple ways to identify information in ClinVar on the web.  Our help document provides detailed instructions about constructing queries and interpreting the detailed display that ClinVar provides from its site. There are other sites, however, that provide links to ClinVar based on common content.  These include:

    Table 1. Some resources linking to ClinVar

    Resource | Basis of the link | Where the link appears
    dbSNP | rs# represented in ClinVar | ClinVar link in the Allele section of the Cluster report
    dbVar | nsv represented in ClinVar | ClinVar link in the Links to Other Resources section
    GTR | Genes in which variation has been reported in ClinVar | Gene-specific link under Molecular Resources
    NCBI Gene | Human genes in which variation has been reported in ClinVar | ClinVar link under Related Information; See variants in ClinVar in the Variation section
    MedGen | Conditions or findings reported in ClinVar | Gene-specific link under Molecular Resources; ClinVar link under Related Information
    OMIM | Allelic variants accessioned in ClinVar | Allelic variant section
    PubMed | Citations provided in a ClinVar submission | ClinVar link under Related Information
    Variation Viewer | Locations on the genome for which data are available in ClinVar | ClinVar link in the Allele information display

    Data for download

    ClinVar provides files in several formats and with different degrees of coverage. Suggestions for selecting the file that may best meet your needs are summarized in this overview.  The categories of files, with their update cycle, are detailed in Table 2.

    Table 2. Files for download

    Format and scope | Path | Coverage | Update cycle
    XML: complete public data set | ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/xml/ | comprehensive | First Thursday of the month
    VCF: short variants in ClinVar and dbSNP (see our FAQ) | ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf/ | partial | First Thursday of the month
    TSV: summary data about variants or genes | ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/tab_delimited/ | comprehensive | First Thursday of the month
    TSV: disease names and gene-disease relationships | ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/ | comprehensive | Daily
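
As a rough illustration of working with the paths in Table 2, the sketch below lists the files in one of those directories using Python's standard ftplib. The dictionary keys (`xml`, `vcf`, `tsv_summary`, `tsv_disease`) and the function name are hypothetical labels chosen for this example, and anonymous FTP login is assumed to be accepted by the server. No file names are assumed; the listing returns whatever the directory contains.

```python
from ftplib import FTP

FTP_HOST = "ftp.ncbi.nlm.nih.gov"

# Directories taken from Table 2; the keys are labels made up for this sketch.
DOWNLOAD_DIRS = {
    "xml": "/pub/clinvar/xml/",
    "vcf": "/pub/clinvar/vcf/",
    "tsv_summary": "/pub/clinvar/tab_delimited/",
    "tsv_disease": "/pub/clinvar/",
}


def list_release_files(kind):
    """Return the names found in the directory for the given kind of file."""
    path = DOWNLOAD_DIRS[kind]
    with FTP(FTP_HOST) as ftp:
        ftp.login()      # anonymous login (an assumption of this sketch)
        ftp.cwd(path)
        return ftp.nlst()
```

Calling `list_release_files("vcf")` needs network access; inspect the returned names before deciding which file to download.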

    Application programming interface (API)

    Data from ClinVar can be retrieved programmatically via application programming interfaces (APIs). These include:

    E-utilities and Entrez Direct

    As part of NCBI's Entrez system, ClinVar can be accessed by E-utilities, both via web services and from a UNIX command line with Entrez Direct. The functions ClinVar currently supports are esearch, esummary, elink and efetch.

    The general approach is to use esearch to find the list of unique identifiers for the records of interest, and then pass those identifiers to either esummary (to retrieve an overview of each record) or efetch (to retrieve the complete record). esearch uses the same query language that you use interactively, so you can test your query on the web before automating it with esearch.
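
The first step of that approach can be sketched in Python using only the standard library. The function names are hypothetical, and the sample response is a hand-written fragment that mirrors the usual esearch XML layout (identifiers under `IdList`), not real output. Only `search_clinvar` touches the network.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, retmax=500):
    """Build an esearch URL for the clinvar database (query is URL-encoded)."""
    query = urllib.parse.urlencode({"db": "clinvar", "term": term, "retmax": retmax})
    return f"{EUTILS_BASE}/esearch.fcgi?{query}"


def parse_id_list(esearch_xml):
    """Return the identifiers listed under <IdList> in an esearch XML response."""
    root = ET.fromstring(esearch_xml)
    return [id_node.text for id_node in root.findall("./IdList/Id")]


def search_clinvar(term, retmax=500):
    """Run esearch over the network and return the matching identifiers."""
    with urllib.request.urlopen(build_esearch_url(term, retmax)) as response:
        return parse_id_list(response.read())


# Hand-written sample for offline experimentation; the second id is made up.
SAMPLE_ESEARCH_XML = (
    "<eSearchResult><Count>2</Count><RetMax>2</RetMax><RetStart>0</RetStart>"
    "<IdList><Id>65533</Id><Id>12345</Id></IdList></eSearchResult>"
)
```

The identifiers returned by `parse_id_list` are strings and can be passed on to esummary or efetch.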

    The default format from E-utilities is XML, but you can specify JSON output as well.

    You may note that the document summary that you retrieve by esummary contains more data than are displayed on the web. For example, you can retrieve location data on both GRCh37 and GRCh38 from assembly_set, database identifiers for disorders from trait_refs, the type of variant from variant_type and the source of any gene-variant relationship from gene/source. For a discussion of how the data returned from esummary correspond to objects available by FTP, please see this overview.
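
A minimal sketch of reading document summaries as JSON follows. It assumes the general E-utilities JSON layout, in which a `result` object holds a `uids` list plus one entry per uid. The function names are hypothetical, and the sample payload is hand-written with placeholder fields; where fields such as assembly_set or variant_type sit inside a real ClinVar summary should be checked against an actual response.

```python
import json
import urllib.parse

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esummary_url(ids):
    """Build an esummary URL for the clinvar database that requests JSON."""
    id_list = ",".join(str(i) for i in ids)
    query = urllib.parse.urlencode(
        {"db": "clinvar", "id": id_list, "retmode": "json"}, safe=","
    )
    return f"{EUTILS_BASE}/esummary.fcgi?{query}"


def docsums_by_uid(esummary_json):
    """Map each uid in the response to its document summary."""
    result = json.loads(esummary_json)["result"]
    return {uid: result[uid] for uid in result["uids"]}


def pick_fields(docsum, names):
    """Return only the requested top-level fields that are present."""
    return {name: docsum[name] for name in names if name in docsum}


# Hand-written placeholder payload, not real ClinVar output.
SAMPLE_ESUMMARY_JSON = json.dumps(
    {"result": {"uids": ["65533"], "65533": {"uid": "65533", "title": "placeholder"}}}
)
```

`pick_fields` skips missing names silently, which makes it easy to probe a summary for fields without raising errors.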

    Function Examples
    Use esearch to find unique identifiers of records of interest

    Find up to 500 records for the gene FGFR3 (and return results in the default XML format)

    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=clinvar&term=FGFR3[gene]&retmax=500

    Find up to 500 records for the gene FGFR3, excluding variations that include multiple genes, by using the property "single_gene", and return results in the default XML format

    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=clinvar&term=FGFR3[gene]+AND+single_gene[prop]&retmax=500

    Use esummary to retrieve the document summary of one of the identifiers retrieved in the previous query

    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=clinvar&id=65533

    Use esummary, and generate JSON output

    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=clinvar&id=65533&retmode=json

    Use elink to determine which databases in NCBI have information related to a specific ClinVar record

    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=clinvar&cmd=acheck&id=9

    Use elink to retrieve identifiers in a specific NCBI database related to a specific ClinVar record

    PubMed uids
    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=clinvar&db=pubmed&id=9

    or

    MedGen uids
    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=clinvar&db=medgen&id=9

    Use efetch to retrieve one or more variation records based on their id(s). Note rettype=variation.

    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=clinvar&rettype=variation&id=14206,41472

    Use efetch to retrieve the XML for the most recent version of an RCV accession. Note rettype=clinvarset.

    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=clinvar&rettype=clinvarset&id=RCV000000606

    Use efetch to retrieve the XML for a specified version of an RCV accession. Note rettype=clinvarset.

    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=clinvar&rettype=clinvarset&id=RCV000000606.3
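
The elink and efetch examples above can be assembled programmatically along the following lines. This is a sketch with hypothetical function names. The sample elink response is a hand-written fragment with made-up PubMed ids that follows the usual elink XML layout (`LinkSet/LinkSetDb/Link/Id`), so verify the parsing against a real response before relying on it.

```python
import urllib.parse
import xml.etree.ElementTree as ET

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_elink_url(clinvar_id, target_db):
    """Build an elink URL from a ClinVar record to another NCBI database."""
    query = urllib.parse.urlencode(
        {"dbfrom": "clinvar", "db": target_db, "id": clinvar_id}
    )
    return f"{EUTILS_BASE}/elink.fcgi?{query}"


def parse_linked_ids(elink_xml):
    """Return the linked identifiers found in an elink XML response."""
    root = ET.fromstring(elink_xml)
    return [node.text for node in root.findall("./LinkSet/LinkSetDb/Link/Id")]


def build_efetch_url(ids, rettype):
    """Build an efetch URL; rettype is 'variation' or 'clinvarset' in the examples."""
    id_list = ",".join(str(i) for i in ids)
    query = urllib.parse.urlencode(
        {"db": "clinvar", "rettype": rettype, "id": id_list}, safe=","
    )
    return f"{EUTILS_BASE}/efetch.fcgi?{query}"


# Hand-written sample; the linked ids are made up for illustration.
SAMPLE_ELINK_XML = (
    "<eLinkResult><LinkSet><DbFrom>clinvar</DbFrom><IdList><Id>9</Id></IdList>"
    "<LinkSetDb><DbTo>pubmed</DbTo><LinkName>clinvar_pubmed</LinkName>"
    "<Link><Id>11111111</Id></Link><Link><Id>22222222</Id></Link>"
    "</LinkSetDb></LinkSet></eLinkResult>"
)
```

Appending a version suffix to an RCV accession (for example `RCV000000606.3`) requests that specific version, as in the last efetch example.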

     

    Release cycle

    ClinVar provides monthly, archived, comprehensive data releases with release notes on the first Thursday of each month. To be notified of these releases, or any other changes, please subscribe to our RSS feed.

    The web site is updated weekly on Mondays. Weekly updates to the data release on the ftp site are provided for users who want to keep data synchronized with the ClinVar website. However, the weekly data releases are not archived; only the monthly releases are archived.
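
For scheduling downloads around the monthly release, the date of the first Thursday can be computed as in this small sketch. The function name is hypothetical, and the actual publication time on that day is not specified here.

```python
import datetime


def first_thursday(year, month):
    """Return the date of the first Thursday in the given month."""
    first = datetime.date(year, month, 1)
    # date.weekday() numbers Monday as 0, so Thursday is 3.
    offset = (3 - first.weekday()) % 7
    return first + datetime.timedelta(days=offset)
```

For example, `first_thursday(2017, 3)` returns `datetime.date(2017, 3, 2)`.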

Last updated: 2017-03-03T11:29:14-05:00