
What is NCBI Remap?


NCBI Remap is a tool that allows users to project annotation data from one coordinate system to another. This remapping (sometimes called 'liftover') uses genomic alignments to project features from one sequence to the other. For each feature on the source sequence, we perform a base-by-base analysis in order to project the feature through the alignment to the new sequence.
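
To make the idea of projecting a feature through an alignment concrete, here is a minimal sketch. It is not the Remap implementation. It assumes a hypothetical alignment representation (a list of ungapped, same-strand blocks with 1-based coordinates), and the function names are illustrative only.

```python
from typing import List, Optional, Tuple

# Hypothetical representation (an assumption, not Remap's data model):
# each ungapped alignment block is (source_start, target_start, length),
# 1-based, with source and target on the same strand.
Block = Tuple[int, int, int]


def project_base(pos: int, blocks: List[Block]) -> Optional[int]:
    """Return the target coordinate for one source base, or None if unaligned."""
    for src_start, tgt_start, length in blocks:
        if src_start <= pos < src_start + length:
            return tgt_start + (pos - src_start)
    return None


def project_interval(
    start: int, stop: int, blocks: List[Block]
) -> Optional[Tuple[int, int, float]]:
    """Project every base of [start, stop] through the blocks.

    Returns (target_start, target_stop, fraction_of_bases_projected),
    or None when no base of the interval is aligned.
    """
    projected = (project_base(p, blocks) for p in range(start, stop + 1))
    mapped = [t for t in projected if t is not None]
    if not mapped:
        return None
    return min(mapped), max(mapped), len(mapped) / (stop - start + 1)
```

With two blocks separated by a 10-base insertion in the target, an interval spanning both blocks projects to a longer target span, while an interval that only partly overlaps a block reports a fraction below 1.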

We support three variations of Remap. Assembly-Assembly allows the remapping of features from one assembly to another. RefSeqGene allows for the remapping of features from assembly sequences to RefSeqGene sequences (including transcript and protein sequences annotated on the RefSeqGene) or from RefSeqGene sequences to an assembly. Alt loci remap allows for the mapping of features between the Primary assembly unit and the Alternate Loci and Patches assembly units available for GRC assemblies.

You can view a short video describing how to use remap here:

What's new

March 2015 Update

  • Added features:
    • New web interface for Assembly-Assembly alignment provides greater assembly detail, making it easier to distinguish similarly named assemblies
    • Improved support for remapping of variations in VCF files. See the Remapping Variation Data section on this page for details.
    • Improved reporting for features that do not remap. See the Mapping Report section on this page for details.
  • Bug fixes
    • VCF formatting: All entries for a given seq-id now form a continuous block in output VCF, consistent with VCF specifications
  • Known Issues:
    • If multiple ALT alleles are specified in a single row of an input VCF, only 1 allele is reported in output VCF. To avoid this situation, only specify a single ALT allele per VCF row. This will be fixed in a future release.
    • Remap may crash if a feature meets all three of the following conditions: Input format=VCF, feature requires left-shifting, feature remaps to multiple locations on the same sequence-id with multiple second-pass alignments. This bug, which affects only a small minority of features, will be fixed in the next release. In the interim, if you encounter this issue, we suggest use of a different input file type or ensuring that variants are already left-shifted in the input VCF file. We apologize for the inconvenience.
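
As a workaround for the multiple-ALT issue listed above, an input VCF can be rewritten so that each row carries a single ALT allele. The following is a minimal sketch with a hypothetical function name. It assumes tab-delimited, sites-only data and copies every other column unchanged, so per-allele INFO values and genotype columns are not adjusted. For files that carry such data, a dedicated VCF normalization tool is likely the safer choice.

```python
def split_multi_alt(vcf_lines):
    """Yield VCF lines with at most one ALT allele per data row.

    Header lines (starting with '#') and malformed short lines pass through
    unchanged. Only the ALT column (column 5) is split; all other columns
    are copied as-is.
    """
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            yield line
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            # Not a complete VCF data row; leave it untouched.
            yield line
            continue
        for alt in fields[4].split(","):
            yield "\t".join(fields[:4] + [alt] + fields[5:])
```

The function accepts any iterable of lines, for example an open file handle, and the results can be written straight to a new file.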

April 2014 Update

  • Added features:
    • Assembly accessions now provided as tool-tips in the target and source assembly drop-down menus. These provide users with unambiguous identifiers for the assemblies used in their remapping effort.
    • Assemblies in target and source menus are sorted by assembly and release version to facilitate identification of assembly of choice.
    • Construction of pre-configured URLs: users can now construct URLs with specified remapping parameters that can be bookmarked or used as links to NCBI Remap.
  • Bug fixes:
    • Remapped locations missing from VCF, GVF or HGVS file output (but present in report files).
    • Data remapped from alt-loci or patches to chromosomes reported on scaffolds rather than chromosomes.
    • Further correction to alt-loci remap

August 2013 Update

  • Added features:
    • Limited support for LRG sequences in Clinical Remap. We currently only support the current versions of LRG and RefSeqGene sequences. Support for older sequences will be added in a future update.
  • Bug fixes:
    • Inappropriate duplication of variant lines when using a VCF with multiple alternate alleles.
    • Dropping of scores from GTF files.
    • Alt locus remap was fixed.

November 2012 Update

  • Added features:
    • Alt locus remap: remap features between the primary assembly and the alternate loci/patches in GRC assemblies.
    • Clinical Remap: When you run this, we will now make a call to the variation reporter and insert the results into Clinical Remap.
    • Added support for upload of compressed files. Currently GZip (.gz) and BZip2 (.bz) files are supported.
    • Improved HGVS nomenclature.

Specifying the data


In order to use the NCBI Remap service, you must select the organism of interest, the assembly your features are on (Source Assembly) and the assembly on which you wish to project these features (Target Assembly). If you would like to request that additional organisms or assemblies be added to the list, please use the Write to the Help Desk link to make this request.

List of supported assembly-assembly alignments in remap:

Organism Source Assembly Target Assembly Software version Last Updated
Cricetulus griseus Cgr1.0 CriGri_1.0 1.7 09/23/2014 22:15:26
Cricetulus griseus C_griseus_v1.0 CriGri_1.0 1.7 09/23/2014 22:18:29
Mus musculus MGSCv37 GRCm38 1.7 09/23/2014 14:14:17
Mus musculus MGSCv34 GRCm38 1.7 09/23/2014 14:20:37
Mus musculus MGSCv34 GRCm38.p2 1.7 09/23/2014 14:41:16
Mus musculus MGSCv3 MGSCv37 1.7 09/23/2014 14:44:47
Mus musculus MGSCv3 GRCm38 1.7 09/23/2014 14:52:36
Mus musculus Mm_Celera MGSCv37 1.7 09/23/2014 15:12:26
Mus musculus Mm_Celera MGSCv37 1.7 09/23/2014 15:26:36
Mus musculus Mm_Celera GRCm38 1.7 09/23/2014 15:35:52
Mus musculus Mm_Celera GRCm38 1.7 09/23/2014 15:47:02
Mus musculus MGSCv37 GRCm38.p2 1.7 09/23/2014 16:14:13
Mus musculus MGSCv36 GRCm38.p2 1.7 09/23/2014 16:14:13
Mus musculus MGSCv35 GRCm38.p2 1.7 09/23/2014 16:14:30
Mus musculus MGSCv3 GRCm38.p2 1.7 09/23/2014 16:20:33
Mus musculus GRCm38 GRCm38.p2 1.7 09/23/2014 16:21:35
Mus musculus GRCm38.p1 GRCm38.p2 1.7 09/23/2014 16:22:53
Mus musculus MGSCv36 GRCm38.p3 1.7 09/23/2014 16:44:44
Mus musculus mm129svJae1.0 GRCm38.p2 1.7 09/23/2014 17:01:43
Mus musculus MmusALLPATHS2 GRCm38.p2 1.7 09/23/2014 17:06:23
Mus musculus MGSCv35 GRCm38.p3 1.7 09/23/2014 18:09:04
Mus musculus GRCm38.p1 GRCm38.p3 1.7 09/23/2014 18:30:44
Mus musculus GRCm38 GRCm38.p3 1.7 09/23/2014 18:32:56
Mus musculus GRCm38.p2 GRCm38.p3 1.7 09/23/2014 18:40:12
Mus musculus mm129svJae1.0 GRCm38.p3 1.7 09/23/2014 18:54:50
Mus musculus MmusALLPATHS2 GRCm38.p3 1.7 09/23/2014 18:56:53
Mus musculus Mm_Celera GRCm38.p2 1.7 09/23/2014 18:58:20
Mus musculus MGSCv37 GRCm38.p3 1.7 09/23/2014 19:14:47
Mus musculus MGSCv34 GRCm38.p3 1.7 09/23/2014 19:17:48
Mus musculus Mm_Celera GRCm38.p2 1.7 09/23/2014 19:52:44
Mus musculus MGSCv3 GRCm38.p3 1.7 09/23/2014 20:17:22
Mus musculus Mm_Celera GRCm38.p3 1.7 09/23/2014 20:50:37
Mus musculus Mm_Celera GRCm38.p3 1.7 09/23/2014 21:08:19
Mus musculus Mm_Celera Mm_Celera 1.7 09/23/2014 22:44:22
Mus musculus MGSCv35 MGSCv37 1.7 09/23/2014 23:11:19
Mus musculus MGSCv35 GRCm38 1.7 09/23/2014 23:12:41
Mus musculus MGSCv34 MGSCv37 1.7 09/23/2014 23:18:07
Mus musculus MGSCv36 MGSCv37 1.7 09/23/2014 23:31:57
Mus musculus MGSCv36 GRCm38 1.7 09/23/2014 23:41:30
Rattus norvegicus Rn_Celera Rn_Celera 1.7 09/23/2014 18:09:44
Rattus norvegicus Rn_Celera Rnor_5.0 1.7 09/24/2014 01:27:08
Rattus norvegicus Rnor_5.0 Rnor_6.0 1.7 09/24/2014 01:42:49
Rattus norvegicus RGSC_v3.4 Rnor_5.0 1.7 09/24/2014 01:43:08
Rattus norvegicus Rn_Celera Rnor_5.0 1.7 09/24/2014 02:25:15
Rattus norvegicus Rn_Celera Rnor_6.0 1.7 09/24/2014 02:59:06
Rattus norvegicus Rn_Celera Rnor_6.0 1.7 09/24/2014 03:10:04
Rattus norvegicus RGSC_v3.4 Rnor_6.0 1.7 09/24/2014 03:50:18
Rattus norvegicus Rn_Celera RGSC_v3.4 1.7 09/24/2014 17:20:38
Rattus norvegicus Rn_Celera RGSC_v3.4 1.7 09/24/2014 17:38:38
Heterocephalus glaber HetGla_1.0 HetGla_female_1.0 1.7 09/24/2014 15:37:11
Maylandia zebra MetZeb1.1 M_zebra_UMD1 1.7 10/13/2015 13:51:45
Amborella trichopoda AMTR1.0 AMTR1.0 1.7 02/24/2015 09:54:33
Vitis vinifera 8x_WGS 12X 1.7 09/23/2014 17:54:43
Vitis vinifera 12X 12X 1.7 11/21/2014 15:46:38
Vitis vinifera 8x_WGS 12X 1.7 01/02/2015 11:06:57
Cucumis sativus CSB10A_v1 CucSat_1.0 1.7 09/23/2014 06:23:38
Cucumis sativus CucSat_1.0 ASM407v2 1.7 01/30/2015 09:44:20
Cucumis sativus CucSat_1.0 ASM407v2 1.7 03/13/2015 13:33:24
Pan troglodytes verus CCYSCv1 Pan_troglodytes-2.1.4 1.7 09/26/2014 21:14:05
Arabidopsis thaliana TAIR9 TAIR10 1.7 09/20/2014 13:44:43
Arabidopsis thaliana TAIR8 TAIR10 1.7 09/20/2014 13:45:50
Arabidopsis thaliana TAIR7 TAIR10 1.7 10/03/2014 20:51:21
Oryza sativa Japonica Group IRGSP_3.0 Build 4.0 1.7 09/23/2014 17:52:34
Solanum lycopersicum SL2.40 SL2.50 1.7 11/14/2014 16:39:52
Solanum lycopersicum SL2.40 SL2.50 1.7 11/14/2014 17:13:39
Zea mays B73 RefGen_v1 B73 RefGen_v3 1.7 10/16/2015 17:39:14
Zea mays B73 RefGen_v1 B73 RefGen_v3 1.7 10/16/2015 18:25:53
Zea mays B73 RefGen_v2.gaps B73 RefGen_v3 1.7 10/16/2015 20:34:53
Zea mays B73 RefGen_v2 B73 RefGen_v3 1.7 10/16/2015 20:36:08
Drosophila pseudoobscura pseudoobscura Dpse_2.0 Dpse_3.0 1.7 09/25/2014 08:56:36
Saccharomyces cerevisiae S288c SacCer_May2010 R64 1.7 10/10/2014 22:38:56
Saccharomyces cerevisiae W303 ASM29281v1 R64 1.7 10/11/2014 13:36:20
Hydra vulgaris h7 Hydra_RP_1.0 1.7 10/03/2014 22:23:24
Nomascus leucogenys Nleu1.0 Nleu_3.0 1.7 09/24/2014 01:03:29
Caenorhabditis elegans WS195 WBcel215 1.7 09/23/2014 05:42:56
Caenorhabditis elegans WBcel215 WBcel235 1.7 09/23/2014 05:43:05
Caenorhabditis elegans WS190 WBcel215 1.7 09/23/2014 05:43:17
Caenorhabditis elegans WS195 WBcel235 1.7 09/23/2014 05:44:04
Caenorhabditis elegans WS190 WBcel235 1.7 09/23/2014 05:44:40
Microplitis demolitor Mdem1 Mdem2 1.7 10/14/2015 13:33:12
Acyrthosiphon pisum Acyr_1.0 Acyr_2.0 1.7 09/23/2014 06:37:27
Drosophila melanogaster Release 6 plus MT Release 6 plus ISO1 MT 1.7 09/20/2014 13:46:46
Drosophila melanogaster Release 5 Release 6 plus ISO1 MT 1.7 09/20/2014 13:47:36
Drosophila melanogaster Release 5 Release 6 plus MT 1.7 09/20/2014 13:58:34
Drosophila melanogaster Release 4 Release 6 plus ISO1 MT 1.7 03/04/2015 18:44:57
Drosophila simulans dsim_caf1 ASM75419v2 1.7 09/24/2014 15:21:06
Nasonia vitripennis Nvit_1.0 Nvit_2.0 1.7 09/23/2014 06:13:26
Nasonia vitripennis Nvit_1.0 Nvit_2.1 1.7 09/23/2014 06:18:29
Nasonia vitripennis Nvit_2.0 Nvit_2.1 1.7 09/23/2014 06:18:45
Nasonia giraulti Ngir_1.0 Nvit_2.1 1.7 09/23/2014 06:36:45
Nasonia longicornis Nlon_1.0 Nvit_2.1 1.7 09/23/2014 06:32:52
Apis mellifera Amel_2.0 Amel_4.5 1.7 09/23/2014 07:51:32
Apis mellifera Amel_4.0 Amel_4.5 1.7 09/23/2014 18:39:48
Strongylocentrotus purpuratus Spur_2.6 Spur_3.1 1.7 09/23/2014 19:34:19
Strongylocentrotus purpuratus Spur_v2.1 Spur_3.1 1.7 09/23/2014 19:34:21
Strongylocentrotus purpuratus Spur_0.5 Spur_3.1 1.7 09/23/2014 19:47:18
Strongylocentrotus purpuratus Spur_3.1 Spur_4.2 1.7 03/18/2015 17:40:25
Ciona intestinalis KH KH 1.7 10/18/2014 05:40:24
Ciona intestinalis v1.0 KH 1.7 10/03/2014 21:03:35
Ciona intestinalis v1.0 KH 1.7 11/02/2014 10:24:19
Danio rerio Zv7 Zv9 1.7 09/23/2014 19:14:05
Danio rerio Zv8 Zv9 1.7 09/23/2014 19:18:31
Danio rerio Zv9 GRCz10 1.7 09/24/2014 22:24:19
Danio rerio Zv7 GRCz10 1.7 09/30/2014 15:31:13
Danio rerio Zv8 GRCz10 1.7 09/30/2014 15:50:20
Esox lucius EsoLuc1.0 ASM72191v2 1.7 07/08/2015 17:46:47
Oreochromis niloticus Orenil1.0 Orenil1.1 1.7 09/23/2014 18:51:23
Xenopus (Silurana) tropicalis v4.2 Xtropicalis_v7 1.7 09/23/2014 20:03:33
Chrysemys picta bellii Chrysemys_picta_bellii-3.0.1 Chrysemys_picta_bellii-3.0.3 1.7 09/23/2014 20:23:36
Gallus gallus Gallus_gallus-2.1 Gallus_gallus-4.0 1.7 09/23/2014 19:43:22
Meleagris gallopavo Turkey_2.01 Turkey_5.0 1.7 12/10/2014 01:13:42
Macaca fascicularis CE_1.0 Macaca_fascicularis_5.0 1.7 10/18/2014 03:27:43
Macaca fascicularis MacFas_Jun2011 Macaca_fascicularis_5.0 1.7 10/18/2014 23:24:51
Macaca mulatta CR_1.0 Mmul_051212 1.7 10/18/2014 09:48:15
Macaca mulatta MacaM_Assembly_v7 Mmul_051212 1.7 11/24/2014 19:34:53
Pan paniscus panpan1 panpan1.1 1.7 09/28/2015 17:29:17
Pan troglodytes Pan_troglodytes-2.1.3 Pan_troglodytes-2.1.4 1.7 09/23/2014 19:31:37
Pan troglodytes Pan_troglodytes-2.1.3 Pan_troglodytes-2.1.4 1.7 09/23/2014 19:43:51
Pan troglodytes Pan_troglodytes-2.1.4 Pan_troglodytes-2.1.4 1.7 09/24/2014 02:08:40
Pan troglodytes Pan_troglodytes-2.1 Pan_troglodytes-2.1.4 1.7 09/24/2014 05:07:19
Homo sapiens NCBI34 GRCh38.p1 1.7 12/15/2014 22:44:42
Homo sapiens NCBI35 GRCh38.p1 1.7 12/15/2014 22:48:04
Homo sapiens NCBI36 GRCh38.p1 1.7 12/15/2014 22:51:24
Homo sapiens GRCh37.p2 GRCh38.p1 1.7 12/15/2014 22:52:22
Homo sapiens GRCh37.p12 GRCh38.p1 1.7 12/15/2014 23:03:11
Homo sapiens GRCh37.p10 GRCh38.p1 1.7 12/15/2014 23:03:13
Homo sapiens NCBI33 GRCh38.p1 1.7 12/15/2014 23:03:36
Homo sapiens GRCh37.p9 GRCh38.p1 1.7 12/15/2014 23:04:06
Homo sapiens GRCh37.p13 GRCh38.p1 1.7 12/15/2014 23:04:22
Homo sapiens GRCh37.p11 GRCh38.p1 1.7 12/15/2014 23:05:25
Homo sapiens GRCh37 GRCh38.p1 1.7 12/15/2014 23:07:05
Homo sapiens GRCh37.p5 GRCh38.p1 1.7 12/15/2014 23:17:53
Homo sapiens CRA_TCAGchr7v2 GRCh38 1.7 09/20/2014 14:40:01
Homo sapiens NCBI33 GRCh38 1.7 09/20/2014 15:19:24
Homo sapiens NCBI34 GRCh38 1.7 09/20/2014 15:23:51
Homo sapiens NCBI35 GRCh38 1.7 09/20/2014 15:25:43
Homo sapiens GRCh37 GRCh38 1.7 09/20/2014 15:29:12
Homo sapiens GRCh37.p2 GRCh38 1.7 09/20/2014 15:33:05
Homo sapiens NCBI36 GRCh38 1.7 09/20/2014 15:33:15
Homo sapiens GRCh37.p11 GRCh38 1.7 09/20/2014 15:34:07
Homo sapiens GRCh37.p10 GRCh38 1.7 09/20/2014 15:34:42
Homo sapiens GRCh37.p13 GRCh38 1.7 09/20/2014 15:35:18
Homo sapiens GRCh37.p12 GRCh38 1.7 09/20/2014 15:35:24
Homo sapiens GRCh37.p5 GRCh38 1.7 09/20/2014 15:37:20
Homo sapiens GRCh37.p9 GRCh38 1.7 09/20/2014 15:41:29
Homo sapiens HuRef GRCh38 1.7 09/20/2014 16:15:10
Homo sapiens YH_2.0 GRCh38 1.7 09/20/2014 16:16:31
Homo sapiens CHM1_1.0 GRCh38 1.7 09/20/2014 16:17:59
Homo sapiens CHM1_1.1 GRCh38 1.7 09/20/2014 20:18:56
Homo sapiens NCBI34 NCBI36 1.7 09/23/2014 08:54:02
Homo sapiens NCBI33 GRCh37.p10 1.7 09/23/2014 09:57:13
Homo sapiens GRCh37.p9 GRCh37.p10 1.7 09/23/2014 10:06:30
Homo sapiens HuRef GRCh37.p10 1.7 09/23/2014 11:00:35
Homo sapiens CHM1_1.0 GRCh37.p10 1.7 09/23/2014 11:01:56
Homo sapiens CHM1_1.0 CHM1_1.1 1.7 09/23/2014 11:13:15
Homo sapiens HuRef GRCh37.p13 1.7 09/23/2014 11:18:32
Homo sapiens CHM1_1.0 GRCh37.p13 1.7 09/23/2014 11:20:04
Homo sapiens YH_2.0 GRCh37.p13 1.7 09/23/2014 11:36:12
Homo sapiens CHM1_1.1 GRCh37.p10 1.7 09/23/2014 14:04:46
Homo sapiens CHM1_1.1 GRCh37 1.7 09/23/2014 15:33:08
Homo sapiens CHM1_1.1 GRCh37.p13 1.7 09/23/2014 16:37:40
Homo sapiens HuRefPrime HuRef 1.7 09/23/2014 19:27:09
Homo sapiens NCBI33 NCBI35 1.7 09/23/2014 19:49:41
Homo sapiens NCBI34 NCBI35 1.7 09/23/2014 19:49:46
Homo sapiens NCBI35 NCBI36 1.7 09/23/2014 20:06:50
Homo sapiens NCBI33 NCBI36 1.7 09/23/2014 20:17:17
Homo sapiens NCBI36 GRCh37 1.7 09/23/2014 20:18:46
Homo sapiens NCBI35 GRCh37 1.7 09/23/2014 20:21:23
Homo sapiens NCBI33 GRCh37 1.7 09/23/2014 20:26:42
Homo sapiens NCBI34 GRCh37 1.7 09/23/2014 20:42:21
Homo sapiens NCBI35 GRCh37.p10 1.7 09/23/2014 20:48:16
Homo sapiens NCBI36 GRCh37.p13 1.7 09/23/2014 20:54:41
Homo sapiens NCBI34 GRCh37.p10 1.7 09/23/2014 20:56:48
Homo sapiens NCBI35 GRCh37.p13 1.7 09/23/2014 20:58:24
Homo sapiens NCBI36 GRCh37.p10 1.7 09/23/2014 20:58:57
Homo sapiens NCBI34 GRCh37.p13 1.7 09/23/2014 21:01:38
Homo sapiens NCBI33 GRCh37.p13 1.7 09/23/2014 21:07:57
Homo sapiens YH_2.0 GRCh37 1.7 09/23/2014 21:16:06
Homo sapiens GRCh37.p10 GRCh37.p13 1.7 09/23/2014 21:19:03
Homo sapiens CHM1_1.0 GRCh37 1.7 09/23/2014 21:31:22
Homo sapiens HuRef GRCh37 1.7 09/23/2014 21:40:49
Homo sapiens CHM1_1.0 HuRef 1.7 09/23/2014 22:14:46
Homo sapiens CHM1_1.1 HuRef 1.7 09/23/2014 22:21:20
Homo sapiens NCBI34 NCBI34 1.7 09/24/2014 15:16:42
Homo sapiens NCBI33 NCBI34 1.7 09/24/2014 15:28:50
Homo sapiens CRA_TCAGchr7v2 NCBI34 1.7 10/03/2014 20:03:27
Homo sapiens CRA_TCAGchr7v2 GRCh37 1.7 10/03/2014 20:43:37
Homo sapiens CRA_TCAGchr7v2 GRCh37.p10 1.7 10/03/2014 20:43:44
Homo sapiens CRA_TCAGchr7v2 GRCh37.p13 1.7 10/03/2014 20:45:43
Homo sapiens CRA_TCAGchr7v2 NCBI35 1.7 10/03/2014 20:46:11
Homo sapiens CRA_TCAGchr7v2 NCBI36 1.7 10/03/2014 21:02:31
Homo sapiens CRA_TCAGchr7v2 HuRef 1.7 10/03/2014 21:36:54
Homo sapiens CRA_TCAGchr7v2 CHM1_1.1 1.7 10/03/2014 21:43:23
Homo sapiens Hs_Celera GRCh37 1.7 10/03/2014 22:08:04
Homo sapiens Hs_Celera GRCh37.p13 1.7 10/03/2014 22:13:56
Homo sapiens Hs_Celera GRCh38 1.7 10/03/2014 23:22:06
Homo sapiens HuRef GRCh38.p1 1.7 11/25/2014 19:28:16
Homo sapiens GRCh38 GRCh38.p1 1.7 12/01/2014 12:54:56
Homo sapiens GRCh37 GRCh38.p2 1.7 12/17/2014 22:16:53
Homo sapiens NCBI34 GRCh38.p2 1.7 12/17/2014 22:17:16
Homo sapiens NCBI33 GRCh38.p2 1.7 12/17/2014 22:17:42
Homo sapiens GRCh37.p5 GRCh38.p2 1.7 12/17/2014 22:24:39
Homo sapiens GRCh37.p12 GRCh38.p2 1.7 12/17/2014 22:24:39
Homo sapiens GRCh37.p13 GRCh38.p2 1.7 12/17/2014 22:26:51
Homo sapiens GRCh37.p10 GRCh38.p2 1.7 12/17/2014 22:34:33
Homo sapiens NCBI35 GRCh38.p2 1.7 12/17/2014 22:36:17
Homo sapiens GRCh37.p11 GRCh38.p2 1.7 12/17/2014 22:38:19
Homo sapiens GRCh37.p9 GRCh38.p2 1.7 12/17/2014 22:43:58
Homo sapiens GRCh37.p2 GRCh38.p2 1.7 12/17/2014 22:44:10
Homo sapiens HuRef GRCh38.p2 1.7 12/17/2014 23:04:22
Homo sapiens YH_2.0 GRCh38.p2 1.7 12/17/2014 23:06:06
Homo sapiens NCBI36 GRCh38.p2 1.7 12/17/2014 23:24:14
Homo sapiens CHM1_1.1 GRCh38.p2 1.7 12/18/2014 03:33:06
Homo sapiens CHM1_1.1 ASM77258v3 1.7 02/02/2015 17:21:51
Homo sapiens GRCh38 GRCh38.p2 1.7 02/20/2015 19:59:00
Canis lupus familiaris CanFam2.0 CanFam3.1 1.7 09/23/2014 21:21:03
Mustela putorius furo MusPutFurMale1.0 MusPutFur1.0 1.7 09/23/2014 10:39:55
Felis catus catChrV17e Felis_catus-6.2 1.7 09/24/2014 07:07:56
Felis catus Felis_catus-6.2 Felis_catus_8.0 1.7 02/26/2015 19:36:28
Felis catus catChrV17e Felis_catus_8.0 1.7 04/04/2015 04:17:45
Sus scrofa Sscrofa10 Sscrofa10.2 1.7 09/23/2014 11:43:57
Sus scrofa Sscrofa5 Sscrofa10 1.7 09/23/2014 12:09:57
Sus scrofa Sscrofa5 Sscrofa10.2 1.7 09/23/2014 12:13:13
Sus scrofa Sscrofa9.2 Sscrofa10.2 1.7 09/23/2014 12:55:39
Sus scrofa Sscrofa9.2 Sscrofa10 1.7 09/23/2014 14:13:24
Bos taurus Btau_3.1 Btau_4.2 1.7 09/23/2014 12:34:56
Bos taurus Btau_4.2 Bos_taurus_UMD_3.1.1 1.7 09/23/2014 12:49:47
Bos taurus Btau_4.0 Bos_taurus_UMD_3.1.1 1.7 09/23/2014 12:59:33
Bos taurus Btau_4.6.1 Bos_taurus_UMD_3.1.1 1.7 09/23/2014 13:00:02
Bos taurus Btau_3.1 Bos_taurus_UMD_3.1.1 1.7 09/23/2014 13:00:45
Bos taurus Bos_taurus_UMD_3.1 Bos_taurus_UMD_3.1.1 1.7 09/23/2014 21:28:03
Bos taurus Btau_4.0 Btau_4.2 1.7 09/23/2014 21:56:15
Bos taurus Btau_4.2 Btau_4.6.1 1.7 09/23/2014 22:30:39
Bos taurus Btau_4.0 Btau_4.6.1 1.7 09/23/2014 22:33:17
Bos taurus Btau_3.1 Btau_4.6.1 1.7 09/23/2014 22:48:23
Bos taurus Btau_4.0 Bos_taurus_UMD_3.1 1.7 09/25/2014 10:11:23
Bos taurus Btau_4.2 Bos_taurus_UMD_3.1 1.7 09/25/2014 10:27:24
Bos taurus Btau_4.6.1 Bos_taurus_UMD_3.1 1.7 09/25/2014 10:28:30
Bos taurus Btau_3.1 Bos_taurus_UMD_3.1 1.7 09/25/2014 10:54:29
Bos taurus Bos_taurus_UMD_3.1 Bos_taurus_UMD_3.1.1 1.7 12/19/2014 17:27:36
Bos taurus Btau_4.6.1 Bos_taurus_UMD_3.1.1 1.7 12/19/2014 18:46:34
Bos taurus Btau_4.0 Bos_taurus_UMD_3.1.1 1.7 01/02/2015 12:06:10
Bos taurus Btau_3.1 Bos_taurus_UMD_3.1.1 1.7 01/02/2015 12:07:07
Bos taurus Btau_4.2 Bos_taurus_UMD_3.1.1 1.7 01/02/2015 13:42:01
Bos taurus UMD Bos_taurus 2.0 Bos_taurus_UMD_3.1.1 1.7 11/09/2015 12:05:13

Clinical Remap

Only human is supported for the RefSeqGene tab, so you only need to select the sequence upon which your features are annotated (either an assembly or RefSeqGenes) and the sequences to which you want the features mapped (either RefSeqGenes or an assembly). 

Alt loci remap

Alt loci remap allows you to map data between the Primary Assembly and the Alternate Loci/Patches that may be available for an assembly. Only assemblies produced by the Genome Reference Consortium are supported on this page. All you need to select on this page are the organism and the assembly; the software will figure out the direction in which you want to map. Within a given input file, however, all features to be remapped should map in the same direction (e.g. primary to alt OR alt to primary).

NOTE: For both Clinical Remap and Alt loci remap, if you map FROM an assembly to either the RefSeqGenes or the Alternate Loci/Patches, you may have many failed features, as both of these sequence sets cover only a fraction of the genome. Features on source sequences that are not part of an alignment set will be marked as "NOMAP/NOTINSET" in the alignment report. To see genome coverage for Alternate Loci/Patches see the GRC pages for human and mouse.

Remapping Options

Some configuration options are available that allow you to configure the stringency of remapping. These options are only configurable in the Assembly-Assembly tab.

  1. Minimum ratio of bases that must be remapped (default: 0.5): This option specifies the fraction of the interval that must be able to be remapped. Raising this value increases the stringency of the remapping process.
  2. Maximum ratio for difference between the source length and the target length (default: 2.0): This option allows the remapping algorithm to tolerate insertions and deletions in the alignment. The ratio is calculated by taking the interval length on the target assembly (stop-start+1) and dividing it by the interval length on the source assembly (stop-start+1). An insertion or deletion in the target assembly will affect this ratio. Lowering this value will increase the stringency of the remapping process.
  3. Allow multiple locations to be returned (default: on): We perform alignments in two phases (see 'About our alignments'). Selecting this option will allow the 'Second Pass' alignments to be used and improve coordinate projection in regions of duplication. This can also lead to multiple features being remapped to the same location.
  4. Merge Fragments (default: on): An insertion in the target assembly will split a feature on the source assembly. Selecting this option will merge the resulting locations into a single location in the annotation file. Turning this option off will increase the stringency of the remapping process, specifically in cases where there is an insertion in the target sequence, as each remapped interval will be compared to the original interval.
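
The two ratio options can be illustrated with a short sketch. It is an approximation of the checks described in options 1 and 2, not the Remap code. The function name, the `bases_remapped` argument and the "OK" return value are hypothetical, and the behavior exactly at the threshold (strict versus inclusive comparison) is an assumption.

```python
def check_ratios(source_start, source_stop, mapped_start, mapped_stop,
                 bases_remapped, min_ratio=0.5, max_ratio=2.0):
    """Apply the two ratio thresholds to one remapped interval.

    bases_remapped is the number of source bases that were projected
    (a hypothetical input for this sketch).
    """
    source_length = source_stop - source_start + 1
    mapped_length = mapped_stop - mapped_start + 1

    # Option 1: fraction of the source interval that could be remapped.
    if bases_remapped / source_length < min_ratio:
        return "LOWCOV"

    # Option 2: target interval length divided by source interval length.
    if mapped_length / source_length > max_ratio:
        return "EXPANDED"

    return "OK"
```

For a 100-base source interval with the default thresholds, a placement covering only 40 source bases would fail the first check, and a placement stretched to 250 target bases would fail the second.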

The merge function can help you remap features that cross an assembly gap, or have a large insertion that causes a gap in the alignment.

Example of a feature crossing a gap

Figure 1: A region with a feature that crosses an assembly gap. This feature was successfully remapped because the merge function was on.

However, in regions with messy alignments, the merge function can cause a feature to be remapped to the same or overlapping positions. This only happens when using the Second Pass alignments for remapping, as these alignments are not guaranteed to be unique.

Region with complicated alignments in the second pass.

Figure 2: A region with nice First Pass alignments and many Second Pass alignments. 

Using the merge function, this feature remaps to six locations in GRCh37, one using the First Pass alignments and five using the Second Pass. These are easily distinguished using the remap report as the 'recip' column specifies whether the first pass or second pass alignments were used.

remap report for feature in region with complicated second pass alignments.

Figure 3: Remap report for feature with multiple locations returned due to complicated second pass alignments.

These features are relatively easy to identify in a post-processing step, or you can turn the merge function off. This will, however, negatively affect features that cross a gap. You may need to review the alignments (which you can do using the Genome Workbench project files) to determine the best course of action.
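
One possible post-processing step is to collect, per feature, the placements that came from Second Pass alignments. The sketch below reads the downloaded tab-separated mapping report. The function name is hypothetical, and it assumes the 18-column layout described in the Output files section, with the feature name in column 1, mapped_id in column 5, mapped_start and mapped_stop in columns 13 and 14, and 'recip' in column 17 holding the literal text 'Second Pass'.

```python
import csv
from collections import defaultdict


def second_pass_placements(report_lines):
    """Map feature name -> list of (mapped_id, mapped_start, mapped_stop)
    for report rows whose 'recip' column is 'Second Pass'."""
    hits = defaultdict(list)
    for row in csv.reader(report_lines, delimiter="\t"):
        # Skip blank lines, the '#feat_name' header and incomplete rows.
        if not row or row[0].startswith("#") or len(row) < 18:
            continue
        if row[16].strip() == "Second Pass":
            hits[row[0]].append((row[4], row[12], row[13]))
    return dict(hits)
```

Features that appear in the result can then be reviewed or filtered. Note that features with several intervals (such as transcripts) share one feature name across rows, so they may need to be handled per interval.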

Note: Alignments are processed in a strand-specific manner. If a feature aligns to a region for which there are alignments on both strands, you may get a placement returned for the plus and the minus strand. Using the merge feature may increase the chances of this as merge helps to span alignment gaps. Turning merge off will cause a decrease in remapped features as gaps will not be crossed on either strand.

Configuring Remapping Parameters via URLs

There are several parameters that can be added to the NCBI Remap URL to pre-configure the mapping parameters that will be used.

Each entry below gives the parameter, its definition and, where available, an example URL fragment.

  • tab: Remapping category (allowed values: asm|rsg|alt-loci). Specifies the type of remapping to be done: assembly-assembly, clinical or alt loci.
  • src_org: Organism. Allowed values: scientific (binomial) name of organism. Example: mellifera
  • src_asm: Source assembly. Allowed value: assembly accession.version. Example: mellifera&src_asm=GCF_000002195.1
  • tgt_asm: Target assembly. Allowed value: assembly accession.version. Example: mellifera&src_asm=GCF_000002195.1&tgt_asm=GCF_000002195.4
  • min_ratio: Minimum ratio of bases that must be remapped. Allowed value: positive number. If not provided, defaults to 0.5. See "Remapping Options" for more information. Only valid for tab=asm|alt-loci. Example: sapiens&src_asm=GCF_000001405.13&tgt_asm=GCF_000001405.26&min_ratio=0.5
  • max_ratio: Maximum ratio for difference between source length and target length. Allowed value: positive number. If not provided, defaults to 2.0. See "Remapping Options" for more information. Only valid for tab=asm|alt-loci. Example: sapiens&src_asm=GCF_000001405.13&tgt_asm=GCF_000001405.26&max_ratio=2.0
  • allow_locations: Allow multiple locations to be returned. Allowed values: true|false. If not provided, defaults to true. See "Remapping Options" for more information. Only valid for tab=asm|alt-loci. Example: sapiens&src_asm=GCF_000001405.13&tgt_asm=GCF_000001405.26&allow_locations=true
  • merge_fragments: Merge split features into a single location in the annotation file. Allowed values: true|false. If not provided, defaults to true. See "Remapping Options" for more information. Only valid for tab=asm|alt-loci. Example: sapiens&src_asm=GCF_000001405.13&tgt_asm=GCF_000001405.26&merge_fragments=true
  • data_from: Identifies source of data to be remapped. Allowed values: assembly accession.version|LRG|RefSeqGene. Only valid for tab=rsg.
  • data_to: Identifies target of data to be remapped. Allowed values: assembly accession.version|LRG|RefSeqGene. Only valid for tab=rsg.
  • any_refseq: Set of RefSeqs or LRGs to which data should be remapped (any or user-specified). Allowed values: true|false. Only valid if tab=rsg and data_from=acc.ver.
  • without_refseq: Provide remapped locations on NMs/NPs even if there is no RefSeqGene or LRG. Allowed values: true|false. Only valid if tab=rsg and data_from=acc.ver.
  • with_refseq: Provide remapped locations on NMs/NPs associated with RefSeqGenes or LRGs. Allowed values: true|false. Only valid if tab=rsg.
  • in_fmt: Format of input file. Allowed values: guess|hgvs|bed|gff|gff3|gvf|gtf|vcf|asnt|asnb|region. If not provided, defaults to guess. Example: sapiens&src_asm=GCF_000001405.13&tgt_asm=GCF_000001405.26&min_ratio=0.5&max_ratio=2.0&allow_locations=true&merge_fragments=true&in_fmt=bed
  • out_fmt: Format of output annotation file. Allowed values: guess|hgvs|bed|gff|gff3|gvf|gtf|vcf|asnt|asnb|region. If not provided, defaults to guess. Example: sapiens&src_asm=GCF_000001405.13&tgt_asm=GCF_000001405.26&min_ratio=0.5&max_ratio=2.0&allow_locations=true&merge_fragments=true&in_fmt=region&out_fmt=region
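
A pre-configured URL can also be assembled programmatically. The sketch below uses Python's standard `urllib.parse.urlencode` with the parameter names listed above. The base address in `REMAP_URL` and the helper name `build_remap_url` are assumptions for illustration. Whether the service expects spaces in the organism name encoded as `+` (as `urlencode` produces) is also an assumption, so check the resulting link in a browser.

```python
from urllib.parse import urlencode

# Assumed base address of the Remap page; verify it against the page you use.
REMAP_URL = "https://www.ncbi.nlm.nih.gov/genome/tools/remap"


def build_remap_url(base=REMAP_URL, **params):
    """Append the given Remap parameters to the base URL as a query string."""
    return base + "?" + urlencode(params)


# Assembly-assembly remap using the accessions from the examples above.
url = build_remap_url(
    tab="asm",
    src_org="Homo sapiens",
    src_asm="GCF_000001405.13",
    tgt_asm="GCF_000001405.26",
    min_ratio=0.5,
    max_ratio=2.0,
    allow_locations="true",
    merge_fragments="true",
    in_fmt="bed",
)
print(url)
```

Keyword arguments keep their order, so the query string lists the parameters in the order they were passed.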

Providing Data

We accept file formats that are commonly used in the bioinformatics community. We currently accept:

The default behavior is to provide the remapped annotation file in the same format as the input file, but you can specify a different format for the output.
If you have a small amount of data, you can just copy and paste the data in the large text box labeled 'Paste data here'. For example:


Otherwise, you can upload the data file.
Please note: the larger your file is, the longer it will take to perform the remapping process. If you find that the process is taking a very long time or failing, you may want to split your files into smaller ones, perhaps based on chromosome assignment. There is also an absolute limit on the amount of RAM available to the system. If this is exceeded, Remap will fail. If this happens, try again with a smaller file. 
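
Splitting a large input by chromosome, as suggested above, can be done with a few lines of code. This is a minimal sketch with a hypothetical function name. It assumes a line-oriented format such as BED in which the first whitespace-delimited column is the sequence identifier, and it copies comment, `track` and `browser` lines into every group.

```python
from collections import defaultdict


def split_by_sequence(lines):
    """Group feature lines by the sequence identifier in their first column.

    Returns a dict mapping each identifier to its lines, with any
    comment/track/browser lines repeated at the top of every group.
    """
    groups = defaultdict(list)
    passthrough = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith(("#", "track", "browser")):
            passthrough.append(line)
            continue
        groups[line.split()[0]].append(line)
    return {seq: passthrough + rows for seq, rows in groups.items()}
```

Each group can then be written to its own file and submitted separately.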

Data options specific to the Clinical Remap tab

Mapping from a RefSeqGene(s) to an assembly: In this case, an additional option is provided and checked by default. This allows the remap service to return features on genomic sequences as well as any transcripts (NMs) or proteins (NPs) available at that locus.

Mapping from an assembly to RefSeqGenes: In this case, you have the option to map to any available RefSeqGene (default) or you can specify a list of RefSeqGenes as targets. If you choose to map to any available RefSeqGene, there are two additional options for providing locations on transcripts (NMs) or proteins. One is to provide the transcript (NM) and protein (NP) locations for features that map to RefSeqGenes and the other is to provide transcript (NM) and protein (NP) locations even if there isn't a RefSeqGene where your feature maps. Not all genes in the human genome have a RefSeqGene. There is a link on this page that allows you to request the construction of a RefSeqGene if one is not available for your gene of interest.

Output files

Summary Data: This is a global report that provides an overview of remapping results. The format of the report is (by column):

  • ID: The sequence ID in the source assembly (often something like 'chr1' or NC_000001.9).
  • Source Features: The number of features on the ID in the source file.
  • Remapped Features: The number of features that could be projected onto the Target assembly.
  • Source Intervals: The number of intervals on the ID in the source file. This can exceed the number of source features because some features have more than one sequence interval; for example, mRNA features will often have multiple intervals (corresponding to exons).
  • Remapped Intervals: The number of intervals that could be projected onto the Target assembly.

The summary data appears on the web page and is available for download.

Mapping Report: This is a report that provides a feature-by-feature breakdown of the remapping status. The format of this report on the web page is (by column):

  • Feature: The name or ID of the feature (the source of this will depend on the format submitted, but it should be possible to robustly associate the information in this column with the data in the input file).
  • Src. Intervals: Number of intervals the feature has in the source file.
  • Remap Intervals: Number of intervals that were projected to the target assembly.
  • Src location: The feature location in the input file.
  • Src length: The length of the feature in the input file.
  • Map Location: Projected location (or reason that the remap failed) on the target assembly.
  • Map length: Length of the feature on the target assembly.
  • Coverage: Coverage of feature on the target assembly.

Only a few lines of this report are displayed on the web page, but the entire report is available for download in a tab-separated file (tsv) that can be easily parsed or loaded into a spreadsheet program. The downloaded report has 18 columns as follows. Note: If the merge option is selected, all of these fields contain post-merge values:

  1. #feat_name: user-supplied feature name. If no feature name is supplied, a name is calculated using the line number in the file or the location. For features with multiple intervals (e.g. transcripts), this field will be common to each interval.
  2. source_int: The number of intervals in the source feature (useful for tracking features with multiple intervals, like transcripts). For single-interval features, the value is always 1.
  3. mapped_int: the number of mapped intervals in the remapped file from the source interval. Values >1 indicate a fragmented mapping.
  4. source_id: sequence identifier the feature maps to in the source file.
  5. mapped_id: sequence identifier the feature maps to on the target assembly.
  6. source_length: length of the interval on the source assembly.
  7. mapped_length: length of the interval on the target assembly.
  8. source_start: first base of the interval on the source assembly.
  9. source_stop: last base of the interval on the source assembly.
  10. source_strand: strand the interval is annotated on in the source assembly.
  11. source_sub_start: first base of source sub-interval that was mapped (used only if entire source interval does not remap and the front edge of the source interval does not map).
  12. source_sub_stop: last base of source sub-interval that was mapped (used only if entire source interval does not remap and the back edge of the source interval does not map).
  13. mapped_start: first base of remapped interval.
  14. mapped_stop: last base of remapped interval.
  15. mapped_strand: strand of remapped base.
  16. coverage: This is calculated by taking the ratio of the mapped_length to the source_length. If coverage =1 the remapped and source interval are identical. A coverage score of less than 1 indicates a deletion in the target assembly and a score of greater than 1 indicates an insertion in the target assembly.
  17. recip: This column has two possible values. 'First Pass' means the remapping is based on the reciprocal-best-hit alignments. 'Second Pass' means the remapping is based on the non-reciprocal-best-hit alignments.
  18. asm_unit: The assembly unit to which the mapped_id belongs. For more information on assembly units, see:

Features that don't remap will have the word 'NOMAP' in column 15 and the reason for not mapping in column 16. The reasons are:

  • NOALIGN: There was no alignment for this region.
  • LOWCOV: The percent of the interval covered in the alignment was below the coverage threshold specified in the 'Remapping Options' (Minimum ratio of bases that must be remapped).
  • EXPANDED: The ratio of the length on the target sequence versus the length on the source sequence is greater than specified in the remap options (default is 2).
  • ALIGNGAP: The source interval falls entirely within an indel in an alignment between the source and target sequence.
  • NOTINSET: The source interval is not part of the alignment set.
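As an illustration of how the downloaded report might be processed, here is a minimal Python sketch. It assumes the file is tab separated with a header row beginning with '#feat_name', that the columns appear in the order listed above (with column 12 taken to be source_sub_stop), and that failed features carry 'NOMAP' in column 15 and the reason in column 16 as described. The function names and the two sample rows are made up for this example and are not part of NCBI Remap.

```python
import csv
import io
from collections import Counter

# Column names in the order listed above (column 12 assumed to be source_sub_stop).
COLUMNS = [
    "feat_name", "source_int", "mapped_int", "source_id", "mapped_id",
    "source_length", "mapped_length", "source_start", "source_stop",
    "source_strand", "source_sub_start", "source_sub_stop",
    "mapped_start", "mapped_stop", "mapped_strand", "coverage",
    "recip", "asm_unit",
]


def read_report(handle):
    """Yield one dict per data row of a tab-separated Remap report."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and the '#feat_name' header row
        yield dict(zip(COLUMNS, row))


def summarize(rows):
    """Return (number of remapped intervals, Counter of failure reasons)."""
    mapped, reasons = 0, Counter()
    for row in rows:
        if row.get("mapped_strand") == "NOMAP":   # column 15
            reasons[row.get("coverage", "")] += 1  # column 16 holds the reason
        else:
            mapped += 1
    return mapped, reasons


# Made-up rows for illustration; the layout of the NOMAP row is an assumption.
_ok = ["feat1", "1", "1", "seqA", "seqB", "100", "100", "1", "100", "+",
       "", "", "11", "110", "+", "1", "First Pass", "Primary Assembly"]
_bad = ["feat2", "1", "0", "seqA", "", "50", "", "200", "249", "+",
        "", "", "", "", "NOMAP", "NOALIGN", "", ""]
SAMPLE = "\n".join(["#" + "\t".join(COLUMNS), "\t".join(_ok), "\t".join(_bad)])

mapped_count, failure_reasons = summarize(read_report(io.StringIO(SAMPLE)))
print(mapped_count, dict(failure_reasons))
```

For a real report, you would pass an open file handle to read_report in place of the in-memory sample.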

Annotation Data: This file contains only the remapped features, in the format specified on the input page. No sample data is shown on the web page, but the file is available for download and display in your favorite viewer.

Genome Workbench Files: These are files that can be loaded directly into our client side viewer called Genome Workbench. They contain the sequence information for both the source and target assemblies, the assembly-assembly alignments used in the remapping and feature annotations (both the source features and the remap features). These files are available for download and are very useful for understanding how the alignments influenced the feature remapping (see Figure 4).

Example of a GBench file produced by Remap

Figure 4: View of remapping in Genome Workbench. The sequence being shown in this view is the Target assembly. The tracks are (in order from the top):

  • Ruler: showing basepair coordinates.
  • Sequence: for some organisms this will be colored and for others it will be grey. This track will show you the actual base pairs if you zoom in enough.
  • Tiling Path: Shows the INSDC sequences used to construct the sequence.
  • Genes Track: Gene annotation from NCBI annotation process.
  • Alignments: Alignment to the Source assembly. This will have the 'First Pass' alignments and the 'Second Pass' alignments if the 'Allow duplications' option was checked. The alignments are zoomed to the base pair level. Mismatches are colored in red. Insertions are shown using a blue triangle (none in this view).
  • SNP features: Variation features defined by dbSNP.
  • Remapped features: Only the remapped features are shown on the target assembly. In this example, features from dbVar were mapped from NCBI36 to GRCh37.p9. If you open a sequence that is part of the Source assembly, you can see the original features.

Remapping Variation Data

Edits to VCF Files

If you are using a Variant Call Format (VCF) file as your input, you may find edits to the REF and ALT bases in your remapped output under two circumstances. The first is a sequence difference between the source and target assemblies. REF bases in the input file refer to the source assembly, whereas REF bases in the output annotation files refer to the target assembly. This means that if a REF base differs between the source and target assemblies, the output VCF will report the target assembly base in the REF field. The corresponding ALT field in the output VCF will be updated, with the source assembly REF base replacing or being appended to the ALT base that was provided in the input VCF, as appropriate. The second circumstance is an error in the input VCF. If the base specified in the REF column of an input VCF is incorrect, the correct base will be reported in the output VCF and the input base will be added to the ALT column.
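The first circumstance can be illustrated with a small Python sketch. This is only one reading of "replacing or being appended to ... as appropriate", not NCBI Remap's actual code: it assumes the source REF base replaces an input ALT allele that equals the target base, and is otherwise appended to the ALT list. The function name update_ref_alt is hypothetical.

```python
def update_ref_alt(source_ref, alts, target_ref):
    """Return (ref, alts, updated) for an output row.

    Assumed rule (an interpretation, not NCBI's implementation):
    - if the assemblies agree, nothing changes;
    - otherwise REF becomes the target base, an input ALT equal to the
      target base is replaced by the source REF, and if no ALT was
      replaced the source REF is appended to the ALT list.
    """
    if source_ref == target_ref:
        return source_ref, list(alts), False
    new_alts = [source_ref if alt == target_ref else alt for alt in alts]
    if source_ref not in new_alts:
        new_alts.append(source_ref)
    return target_ref, new_alts, True


# Input row REF=A, ALT=G; the target assembly has G at this position.
print(update_ref_alt("A", ["G"], "G"))
# Input row REF=A, ALT=T; the target assembly has G at this position.
print(update_ref_alt("A", ["T"], "G"))
```

In the first call the ALT base is replaced by the source REF base; in the second it is kept and the source REF base is appended.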

Additionally, if you are using a Variant Call Format (VCF) file as your input, NCBI Remap will left-shift variants prior to remapping them. Upon remapping, it will left-shift again with respect to the target assembly. Therefore, when using VCF as your input, all output files will contain left-shifted coordinates. This ensures output VCF meets file specifications. If you provide VCF as your input and specify HGVS as your output, please note that the HGVS will also contain left-shifted coordinates. At this time, NCBI does not provide an equivalent right-shifting function for input HGVS files. This is planned for a future release.
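To show what left-shifting means in practice, here is a minimal Python sketch of the commonly used left-alignment procedure for a single REF/ALT pair. It is a generic illustration, not NCBI Remap's implementation; the function name left_shift and the short sequence used in the example are made up. Positions are 1-based, as in VCF.

```python
def left_shift(seq, pos, ref, alt):
    """Left-align one REF/ALT pair against the sequence string `seq`.

    `pos` is the 1-based position of the first REF base. Both alleles are
    assumed to be non-empty on input.
    """
    changed = True
    while changed:
        changed = False
        # Drop a shared trailing base, unless that would leave an empty
        # allele at position 1 (which could not be padded on the left).
        if ref[-1] == alt[-1] and (pos > 1 or min(len(ref), len(alt)) > 1):
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, pad both with the preceding base.
        if not ref or not alt:
            pos -= 1
            base = seq[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True
    # Drop shared leading bases while both alleles keep at least one base.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# A 'CA' deletion inside a CACACA repeat, reported at its right end.
sequence = "TGCACACAG"
print(left_shift(sequence, 6, "ACA", "A"))
```

In this example the deletion is moved to the leftmost equivalent position in the repeat, which is the representation you should expect in the output when VCF is the input format.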

If you have selected VCF as your output file type, all NCBI Remap edits to the REF and ALT fields are reported using INFO tags.

NCBI Remap VCF Meta-Information: NCBI Remap appends the following remapping-related information to the meta-information lines in the output VCF.

Meta-information              Description
NCBI_remap_source_assm        Assembly acc.ver of source assembly
NCBI_remap_target_assm        Assembly acc.ver of target assembly
NCBI_remap_align_date         Date on which alignments used for NCBI Remap were generated
NCBI_remap_run_date           Date on which NCBI Remap was run by user
NCBI_remap_batch_id           Alignment batch id (NCBI identifier)
NCBI_remap_align_parameters   NCBI Remap alignment parameters
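If you want to record this provenance alongside your results, the meta-information can be collected with a few lines of Python. The sketch below assumes the lines follow the standard VCF '##key=value' form; the function name remap_meta and the placeholder values in the example are made up.

```python
def remap_meta(vcf_lines):
    """Collect NCBI_remap_* meta-information from the top of a VCF file."""
    meta = {}
    for line in vcf_lines:
        if not line.startswith("##"):
            break  # meta-information lines end at the '#CHROM' header
        key, sep, value = line[2:].rstrip("\r\n").partition("=")
        if sep and key.startswith("NCBI_remap_"):
            meta[key] = value
    return meta


# Placeholder header lines for illustration only.
example = [
    "##fileformat=VCFv4.1\n",
    "##NCBI_remap_source_assm=SOURCE_ACC.1\n",
    "##NCBI_remap_target_assm=TARGET_ACC.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
]
print(remap_meta(example))
```

The function accepts any iterable of lines, so an open file handle can be passed directly.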

NCBI Remap VCF INFO fields: NCBI Remap also uses several INFO fields in the output VCF to describe feature updates that may have occurred during the remapping process. In addition, there is an INFO field indicating whether the feature was remapped with first- (reciprocal-best-hit) or second-pass (non-reciprocal-best-hit) alignments.

ID               Number   Type     Description
REMAP_ALIGN      1        String   Alignment type used for remapping (FP=first pass, SP=second pass)
REF_UPDATE       0        Flag     REF and ALT bases modified due to difference in REF base in source and target assemblies
REF_ERROR        0        Flag     REF base does not match source assembly
REF_LEFT_SHIFT   0        Flag     Position of REF base left-shifted
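These tags can be read from the INFO column of each output row. The Python sketch below assumes standard VCF INFO syntax (entries separated by ';', with flags appearing as a bare key); the function names parse_info and remap_notes are made up for this example.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flags map to True."""
    fields = {}
    if info in ("", "."):
        return fields  # '.' marks an empty INFO column
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def remap_notes(info):
    """Summarize the Remap-related INFO tags for one VCF row."""
    fields = parse_info(info)
    passes = {"FP": "first pass", "SP": "second pass"}
    return {
        "alignment": passes.get(fields.get("REMAP_ALIGN")),
        "ref_update": "REF_UPDATE" in fields,
        "ref_error": "REF_ERROR" in fields,
        "left_shifted": "REF_LEFT_SHIFT" in fields,
    }


print(remap_notes("REMAP_ALIGN=FP;REF_LEFT_SHIFT"))
```

A row without any of these tags yields None for the alignment type and False for each flag.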