What is NCBI Remap?

NCBI Remap is a tool that allows users to project annotation data from one coordinate system to another. This remapping (sometimes called 'liftover') uses genomic alignments to project features from one sequence to the other. For each feature on the source sequence, we perform a base-by-base analysis in order to project the feature through the alignment to the new sequence.

We support three variations of Remap. Assembly-Assembly allows the remapping of features from one assembly to another. RefSeqGene allows for the remapping of features from assembly sequences to RefSeqGene sequences (including transcript and protein sequences annotated on the RefSeqGene) or from RefSeqGene sequences to an assembly. Alt loci remap allows for the mapping of features between the Primary assembly and the alternate loci and patches available for GRC assemblies.

You can view a short video describing how to use remap here: http://www.youtube.com/watch?v=0lhcMGGReVQ

What's new

April 2014 Update

  • Added features:
    • Assembly accessions now provided as tool-tips in the target and source assembly drop-down menus. These provide users with unambiguous identifiers for the assemblies used in their remapping effort.
    • Assemblies in target and source menus are sorted by assembly and release version to facilitate identification of assembly of choice.
    • Construction of pre-configured URLs: users can now construct URLs with specified remapping parameters that can be bookmarked or used as links to NCBI Remap.
  • Bug fixes:
    • Remapped locations missing from VCF, GVF or HGVS file output (but present in report files).
    • Data remapped from alt-loci or patches to chromosomes reported on scaffolds rather than chromosomes.
    • Further correction to alt-loci remap

August 2013 Update

  • Added features:
    • Limited support for LRG sequences in Clinical Remap. We currently only support the current versions of LRG and RefSeqGene sequences. Support for older sequences will be added in a future update.
  • Bug fixes:
    • Inappropriate duplication of variant lines when using a VCF with multiple alternate alleles.
    • Dropping of scores from GTF files.
    • Alt locus remap was fixed.

November 2012 Update

  • Added features:
    • Alt locus remap: remap features between the primary assembly and the alternate loci/patches in GRC assemblies.
    • Clinical Remap: When you run this, we now make a call to the Variation Reporter and insert the results into the Clinical Remap output.
    • Added support for upload of compressed files. Currently GZip (.gz) and BZip2 (.bz) files are supported.
    • Improved HGVS nomenclature.

Specifying the data

Assembly-Assembly 

In order to use the NCBI Remap service, you must select the organism of interest, the assembly your features are on (Source Assembly) and the assembly onto which you wish to project these features (Target Assembly). If you would like to request that additional organisms or assemblies be added to the list, please use the 'Write to the Help Desk' link to make this request.

List of supported assembly-assembly alignments in remap:

Organism Source Assembly Target Assembly Software version Last Updated
Cricetulus griseus Cgr1.0 CriGri_1.0 1.7 09/23/2014 22:15:26
Cricetulus griseus C_griseus_v1.0 CriGri_1.0 1.7 09/23/2014 22:18:29
Mus musculus MGSCv37 GRCm38 1.7 09/23/2014 14:14:17
Mus musculus MGSCv34 GRCm38 1.7 09/23/2014 14:20:37
Mus musculus MGSCv34 GRCm38.p2 1.7 09/23/2014 14:41:16
Mus musculus MGSCv3 MGSCv37 1.7 09/23/2014 14:44:47
Mus musculus MGSCv3 GRCm38 1.7 09/23/2014 14:52:36
Mus musculus Mm_Celera MGSCv37 1.7 09/23/2014 15:12:26
Mus musculus Mm_Celera MGSCv37 1.7 09/23/2014 15:26:36
Mus musculus Mm_Celera GRCm38 1.7 09/23/2014 15:35:52
Mus musculus Mm_Celera GRCm38 1.7 09/23/2014 15:47:02
Mus musculus MGSCv37 GRCm38.p2 1.7 09/23/2014 16:14:13
Mus musculus MGSCv36 GRCm38.p2 1.7 09/23/2014 16:14:13
Mus musculus MGSCv35 GRCm38.p2 1.7 09/23/2014 16:14:30
Mus musculus MGSCv3 GRCm38.p2 1.7 09/23/2014 16:20:33
Mus musculus GRCm38 GRCm38.p2 1.7 09/23/2014 16:21:35
Mus musculus GRCm38.p1 GRCm38.p2 1.7 09/23/2014 16:22:53
Mus musculus MGSCv36 GRCm38.p3 1.7 09/23/2014 16:44:44
Mus musculus mm129svJae1.0 GRCm38.p2 1.7 09/23/2014 17:01:43
Mus musculus MmusALLPATHS2 GRCm38.p2 1.7 09/23/2014 17:06:23
Mus musculus MGSCv35 GRCm38.p3 1.7 09/23/2014 18:09:04
Mus musculus GRCm38.p1 GRCm38.p3 1.7 09/23/2014 18:30:44
Mus musculus GRCm38 GRCm38.p3 1.7 09/23/2014 18:32:56
Mus musculus GRCm38.p2 GRCm38.p3 1.7 09/23/2014 18:40:12
Mus musculus mm129svJae1.0 GRCm38.p3 1.7 09/23/2014 18:54:50
Mus musculus MmusALLPATHS2 GRCm38.p3 1.7 09/23/2014 18:56:53
Mus musculus Mm_Celera GRCm38.p2 1.7 09/23/2014 18:58:20
Mus musculus MGSCv37 GRCm38.p3 1.7 09/23/2014 19:14:47
Mus musculus MGSCv34 GRCm38.p3 1.7 09/23/2014 19:17:48
Mus musculus Mm_Celera GRCm38.p2 1.7 09/23/2014 19:52:44
Mus musculus MGSCv3 GRCm38.p3 1.7 09/23/2014 20:17:22
Mus musculus Mm_Celera GRCm38.p3 1.7 09/23/2014 20:50:37
Mus musculus Mm_Celera GRCm38.p3 1.7 09/23/2014 21:08:19
Mus musculus Mm_Celera Mm_Celera 1.7 09/23/2014 22:44:22
Mus musculus MGSCv35 MGSCv37 1.7 09/23/2014 23:11:19
Mus musculus MGSCv35 GRCm38 1.7 09/23/2014 23:12:41
Mus musculus MGSCv34 MGSCv37 1.7 09/23/2014 23:18:07
Mus musculus MGSCv36 MGSCv37 1.7 09/23/2014 23:31:57
Mus musculus MGSCv36 GRCm38 1.7 09/23/2014 23:41:30
Rattus norvegicus Rn_Celera Rn_Celera 1.7 09/23/2014 18:09:44
Rattus norvegicus Rn_Celera Rnor_5.0 1.7 09/24/2014 01:27:08
Rattus norvegicus Rnor_5.0 Rnor_6.0 1.7 09/24/2014 01:42:49
Rattus norvegicus RGSC_v3.4 Rnor_5.0 1.7 09/24/2014 01:43:08
Rattus norvegicus Rn_Celera Rnor_5.0 1.7 09/24/2014 02:25:15
Rattus norvegicus Rn_Celera Rnor_6.0 1.7 09/24/2014 02:59:06
Rattus norvegicus Rn_Celera Rnor_6.0 1.7 09/24/2014 03:10:04
Rattus norvegicus RGSC_v3.4 Rnor_6.0 1.7 09/24/2014 03:50:18
Rattus norvegicus Rn_Celera RGSC_v3.4 1.7 09/24/2014 17:20:38
Rattus norvegicus Rn_Celera RGSC_v3.4 1.7 09/24/2014 17:38:38
Heterocephalus glaber HetGla_1.0 HetGla_female_1.0 1.7 09/24/2014 15:37:11
Vitis vinifera 8x_WGS 12X 1.7 09/23/2014 17:54:43
Cucumis sativus CSB10A_v1 CucSat_1.0 1.7 09/23/2014 06:23:38
Pan troglodytes verus CCYSCv1 Pan_troglodytes-2.1.4 1.7 09/26/2014 21:14:05
Arabidopsis thaliana TAIR9 TAIR10 1.7 09/20/2014 13:44:43
Arabidopsis thaliana TAIR8 TAIR10 1.7 09/20/2014 13:45:50
Arabidopsis thaliana TAIR7 TAIR10 1.7 10/03/2014 20:51:21
Oryza sativa Japonica Group IRGSP_3.0 Build 4.0 1.7 09/23/2014 17:52:34
Solanum lycopersicum SL2.40 SL2.50 1.7 11/14/2014 16:39:52
Solanum lycopersicum SL2.40 SL2.50 1.7 11/14/2014 17:13:39
Drosophila pseudoobscura pseudoobscura Dpse_2.0 Dpse_3.0 1.7 09/25/2014 08:56:36
Saccharomyces cerevisiae S288c SacCer_May2010 R64-1-1 1.7 10/10/2014 22:38:56
Saccharomyces cerevisiae W303 ASM29281v1 R64-1-1 1.7 10/11/2014 13:36:20
Hydra vulgaris h7 Hydra_RP_1.0 1.7 10/03/2014 22:23:24
Nomascus leucogenys Nleu1.0 Nleu_3.0 1.7 09/24/2014 01:03:29
Caenorhabditis elegans WS195 WBcel215 1.7 09/23/2014 05:42:56
Caenorhabditis elegans WBcel215 WBcel235 1.7 09/23/2014 05:43:05
Caenorhabditis elegans WS190 WBcel215 1.7 09/23/2014 05:43:17
Caenorhabditis elegans WS195 WBcel235 1.7 09/23/2014 05:44:04
Caenorhabditis elegans WS190 WBcel235 1.7 09/23/2014 05:44:40
Acyrthosiphon pisum Acyr_1.0 Acyr_2.0 1.7 09/23/2014 06:37:27
Drosophila melanogaster Release 6 plus MT Release 6 plus ISO1 MT 1.7 09/20/2014 13:46:46
Drosophila melanogaster Release 5 Release 6 plus ISO1 MT 1.7 09/20/2014 13:47:36
Drosophila melanogaster Release 5 Release 6 plus MT 1.7 09/20/2014 13:58:34
Drosophila simulans dsim_caf1 ASM75419v2 1.7 09/24/2014 15:21:06
Nasonia vitripennis Nvit_1.0 Nvit_2.0 1.7 09/23/2014 06:13:26
Nasonia vitripennis Nvit_1.0 Nvit_2.1 1.7 09/23/2014 06:18:29
Nasonia vitripennis Nvit_2.0 Nvit_2.1 1.7 09/23/2014 06:18:45
Nasonia giraulti Ngir_1.0 Nvit_2.1 1.7 09/23/2014 06:36:45
Nasonia longicornis Nlon_1.0 Nvit_2.1 1.7 09/23/2014 06:32:52
Apis mellifera Amel_2.0 Amel_4.5 1.7 09/23/2014 07:51:32
Apis mellifera Amel_4.0 Amel_4.5 1.7 09/23/2014 18:39:48
Strongylocentrotus purpuratus Spur_2.6 Spur_3.1 1.7 09/23/2014 19:34:19
Strongylocentrotus purpuratus Spur_v2.1 Spur_3.1 1.7 09/23/2014 19:34:21
Strongylocentrotus purpuratus Spur_0.5 Spur_3.1 1.7 09/23/2014 19:47:18
Ciona intestinalis KH KH 1.7 10/18/2014 05:40:24
Ciona intestinalis v1.0 KH 1.7 10/03/2014 21:03:35
Ciona intestinalis v1.0 KH 1.7 11/02/2014 10:24:19
Danio rerio Zv7 Zv9 1.7 09/23/2014 19:14:05
Danio rerio Zv8 Zv9 1.7 09/23/2014 19:18:31
Danio rerio Zv9 GRCz10 1.7 09/24/2014 22:24:19
Danio rerio Zv7 GRCz10 1.7 09/30/2014 15:31:13
Danio rerio Zv8 GRCz10 1.7 09/30/2014 15:50:20
Oreochromis niloticus Orenil1.0 Orenil1.1 1.7 09/23/2014 18:51:23
Xenopus (Silurana) tropicalis v4.2 Xtropicalis_v7 1.7 09/23/2014 20:03:33
Chrysemys picta bellii Chrysemys_picta_bellii-3.0.1 Chrysemys_picta_bellii-3.0.3 1.7 09/23/2014 20:23:36
Gallus gallus Gallus_gallus-2.1 Gallus_gallus-4.0 1.7 09/23/2014 19:43:22
Macaca fascicularis CE_1.0 Macaca_fascicularis_5.0 1.7 10/18/2014 03:27:43
Macaca fascicularis MacFas_Jun2011 Macaca_fascicularis_5.0 1.7 10/18/2014 23:24:51
Macaca mulatta CR_1.0 Mmul_051212 1.7 10/18/2014 09:48:15
Pan troglodytes Pan_troglodytes-2.1.3 Pan_troglodytes-2.1.4 1.7 09/23/2014 19:31:37
Pan troglodytes Pan_troglodytes-2.1.3 Pan_troglodytes-2.1.4 1.7 09/23/2014 19:43:51
Pan troglodytes Pan_troglodytes-2.1.4 Pan_troglodytes-2.1.4 1.7 09/24/2014 02:08:40
Pan troglodytes Pan_troglodytes-2.1 Pan_troglodytes-2.1.4 1.7 09/24/2014 05:07:19
Homo sapiens NCBI34 GRCh38.p1 1.7 12/15/2014 22:44:42
Homo sapiens NCBI35 GRCh38.p1 1.7 12/15/2014 22:48:04
Homo sapiens NCBI36 GRCh38.p1 1.7 12/15/2014 22:51:24
Homo sapiens GRCh37.p2 GRCh38.p1 1.7 12/15/2014 22:52:22
Homo sapiens GRCh37.p12 GRCh38.p1 1.7 12/15/2014 23:03:11
Homo sapiens GRCh37.p10 GRCh38.p1 1.7 12/15/2014 23:03:13
Homo sapiens NCBI33 GRCh38.p1 1.7 12/15/2014 23:03:36
Homo sapiens GRCh37.p9 GRCh38.p1 1.7 12/15/2014 23:04:06
Homo sapiens GRCh37.p13 GRCh38.p1 1.7 12/15/2014 23:04:22
Homo sapiens GRCh37.p11 GRCh38.p1 1.7 12/15/2014 23:05:25
Homo sapiens GRCh37 GRCh38.p1 1.7 12/15/2014 23:07:05
Homo sapiens GRCh37.p5 GRCh38.p1 1.7 12/15/2014 23:17:53
Homo sapiens CRA_TCAGchr7v2 GRCh38 1.7 09/20/2014 14:40:01
Homo sapiens NCBI33 GRCh38 1.7 09/20/2014 15:19:24
Homo sapiens NCBI34 GRCh38 1.7 09/20/2014 15:23:51
Homo sapiens NCBI35 GRCh38 1.7 09/20/2014 15:25:43
Homo sapiens GRCh37 GRCh38 1.7 09/20/2014 15:29:12
Homo sapiens GRCh37.p2 GRCh38 1.7 09/20/2014 15:33:05
Homo sapiens NCBI36 GRCh38 1.7 09/20/2014 15:33:15
Homo sapiens GRCh37.p11 GRCh38 1.7 09/20/2014 15:34:07
Homo sapiens GRCh37.p10 GRCh38 1.7 09/20/2014 15:34:42
Homo sapiens GRCh37.p13 GRCh38 1.7 09/20/2014 15:35:18
Homo sapiens GRCh37.p12 GRCh38 1.7 09/20/2014 15:35:24
Homo sapiens GRCh37.p5 GRCh38 1.7 09/20/2014 15:37:20
Homo sapiens GRCh37.p9 GRCh38 1.7 09/20/2014 15:41:29
Homo sapiens HuRef GRCh38 1.7 09/20/2014 16:15:10
Homo sapiens YH_2.0 GRCh38 1.7 09/20/2014 16:16:31
Homo sapiens CHM1_1.0 GRCh38 1.7 09/20/2014 16:17:59
Homo sapiens CHM1_1.1 GRCh38 1.7 09/20/2014 20:18:56
Homo sapiens NCBI34 NCBI36 1.7 09/23/2014 08:54:02
Homo sapiens NCBI33 GRCh37.p10 1.7 09/23/2014 09:57:13
Homo sapiens GRCh37.p9 GRCh37.p10 1.7 09/23/2014 10:06:30
Homo sapiens HuRef GRCh37.p10 1.7 09/23/2014 11:00:35
Homo sapiens CHM1_1.0 GRCh37.p10 1.7 09/23/2014 11:01:56
Homo sapiens CHM1_1.0 CHM1_1.1 1.7 09/23/2014 11:13:15
Homo sapiens HuRef GRCh37.p13 1.7 09/23/2014 11:18:32
Homo sapiens CHM1_1.0 GRCh37.p13 1.7 09/23/2014 11:20:04
Homo sapiens YH_2.0 GRCh37.p13 1.7 09/23/2014 11:36:12
Homo sapiens CHM1_1.1 GRCh37.p10 1.7 09/23/2014 14:04:46
Homo sapiens CHM1_1.1 GRCh37 1.7 09/23/2014 15:33:08
Homo sapiens CHM1_1.1 GRCh37.p13 1.7 09/23/2014 16:37:40
Homo sapiens HuRefPrime HuRef 1.7 09/23/2014 19:27:09
Homo sapiens NCBI33 NCBI35 1.7 09/23/2014 19:49:41
Homo sapiens NCBI34 NCBI35 1.7 09/23/2014 19:49:46
Homo sapiens NCBI35 NCBI36 1.7 09/23/2014 20:06:50
Homo sapiens NCBI33 NCBI36 1.7 09/23/2014 20:17:17
Homo sapiens NCBI36 GRCh37 1.7 09/23/2014 20:18:46
Homo sapiens NCBI35 GRCh37 1.7 09/23/2014 20:21:23
Homo sapiens NCBI33 GRCh37 1.7 09/23/2014 20:26:42
Homo sapiens NCBI34 GRCh37 1.7 09/23/2014 20:42:21
Homo sapiens NCBI35 GRCh37.p10 1.7 09/23/2014 20:48:16
Homo sapiens NCBI36 GRCh37.p13 1.7 09/23/2014 20:54:41
Homo sapiens NCBI34 GRCh37.p10 1.7 09/23/2014 20:56:48
Homo sapiens NCBI35 GRCh37.p13 1.7 09/23/2014 20:58:24
Homo sapiens NCBI36 GRCh37.p10 1.7 09/23/2014 20:58:57
Homo sapiens NCBI34 GRCh37.p13 1.7 09/23/2014 21:01:38
Homo sapiens NCBI33 GRCh37.p13 1.7 09/23/2014 21:07:57
Homo sapiens YH_2.0 GRCh37 1.7 09/23/2014 21:16:06
Homo sapiens GRCh37.p10 GRCh37.p13 1.7 09/23/2014 21:19:03
Homo sapiens CHM1_1.0 GRCh37 1.7 09/23/2014 21:31:22
Homo sapiens HuRef GRCh37 1.7 09/23/2014 21:40:49
Homo sapiens CHM1_1.0 HuRef 1.7 09/23/2014 22:14:46
Homo sapiens CHM1_1.1 HuRef 1.7 09/23/2014 22:21:20
Homo sapiens NCBI34 NCBI34 1.7 09/24/2014 15:16:42
Homo sapiens NCBI33 NCBI34 1.7 09/24/2014 15:28:50
Homo sapiens CRA_TCAGchr7v2 NCBI34 1.7 10/03/2014 20:03:27
Homo sapiens CRA_TCAGchr7v2 GRCh37 1.7 10/03/2014 20:43:37
Homo sapiens CRA_TCAGchr7v2 GRCh37.p10 1.7 10/03/2014 20:43:44
Homo sapiens CRA_TCAGchr7v2 GRCh37.p13 1.7 10/03/2014 20:45:43
Homo sapiens CRA_TCAGchr7v2 NCBI35 1.7 10/03/2014 20:46:11
Homo sapiens CRA_TCAGchr7v2 NCBI36 1.7 10/03/2014 21:02:31
Homo sapiens CRA_TCAGchr7v2 HuRef 1.7 10/03/2014 21:36:54
Homo sapiens CRA_TCAGchr7v2 CHM1_1.1 1.7 10/03/2014 21:43:23
Homo sapiens Hs_Celera GRCh37 1.7 10/03/2014 22:08:04
Homo sapiens Hs_Celera GRCh37.p13 1.7 10/03/2014 22:13:56
Homo sapiens Hs_Celera GRCh38 1.7 10/03/2014 23:22:06
Homo sapiens HuRef GRCh38.p1 1.7 11/25/2014 19:28:16
Homo sapiens GRCh38 GRCh38.p1 1.7 12/01/2014 12:54:56
Homo sapiens GRCh37 GRCh38.p2 1.7 12/17/2014 22:16:53
Homo sapiens NCBI34 GRCh38.p2 1.7 12/17/2014 22:17:16
Homo sapiens NCBI33 GRCh38.p2 1.7 12/17/2014 22:17:42
Homo sapiens GRCh37.p5 GRCh38.p2 1.7 12/17/2014 22:24:39
Homo sapiens GRCh37.p12 GRCh38.p2 1.7 12/17/2014 22:24:39
Homo sapiens GRCh37.p13 GRCh38.p2 1.7 12/17/2014 22:26:51
Homo sapiens GRCh37.p10 GRCh38.p2 1.7 12/17/2014 22:34:33
Homo sapiens NCBI35 GRCh38.p2 1.7 12/17/2014 22:36:17
Homo sapiens GRCh37.p11 GRCh38.p2 1.7 12/17/2014 22:38:19
Homo sapiens GRCh37.p9 GRCh38.p2 1.7 12/17/2014 22:43:58
Homo sapiens GRCh37.p2 GRCh38.p2 1.7 12/17/2014 22:44:10
Homo sapiens HuRef GRCh38.p2 1.7 12/17/2014 23:04:22
Homo sapiens NCBI36 GRCh38.p2 1.7 12/17/2014 23:24:14
Homo sapiens CHM1_1.1 GRCh38.p2 1.7 12/18/2014 03:33:06
Canis lupus familiaris CanFam2.0 CanFam3.1 1.7 09/23/2014 21:21:03
Mustela putorius furo MusPutFurMale1.0 MusPutFur1.0 1.7 09/23/2014 10:39:55
Felis catus catChrV17e Felis_catus-6.2 1.7 09/24/2014 07:07:56
Sus scrofa Sscrofa10 Sscrofa10.2 1.7 09/23/2014 11:43:57
Sus scrofa Sscrofa5 Sscrofa10 1.7 09/23/2014 12:09:57
Sus scrofa Sscrofa5 Sscrofa10.2 1.7 09/23/2014 12:13:13
Sus scrofa Sscrofa9.2 Sscrofa10.2 1.7 09/23/2014 12:55:39
Sus scrofa Sscrofa9.2 Sscrofa10 1.7 09/23/2014 14:13:24
Bos taurus Btau_3.1 Btau_4.2 1.7 09/23/2014 12:34:56
Bos taurus Btau_4.2 Bos_taurus_UMD_3.1.1 1.7 09/23/2014 12:49:47
Bos taurus Btau_4.0 Bos_taurus_UMD_3.1.1 1.7 09/23/2014 12:59:33
Bos taurus Btau_4.6.1 Bos_taurus_UMD_3.1.1 1.7 09/23/2014 13:00:02
Bos taurus Btau_3.1 Bos_taurus_UMD_3.1.1 1.7 09/23/2014 13:00:45
Bos taurus Bos_taurus_UMD_3.1 Bos_taurus_UMD_3.1.1 1.7 09/23/2014 21:28:03
Bos taurus Btau_4.0 Btau_4.2 1.7 09/23/2014 21:56:15
Bos taurus Btau_4.2 Btau_4.6.1 1.7 09/23/2014 22:30:39
Bos taurus Btau_4.0 Btau_4.6.1 1.7 09/23/2014 22:33:17
Bos taurus Btau_3.1 Btau_4.6.1 1.7 09/23/2014 22:48:23
Bos taurus Btau_4.0 Bos_taurus_UMD_3.1 1.7 09/25/2014 10:11:23
Bos taurus Btau_4.2 Bos_taurus_UMD_3.1 1.7 09/25/2014 10:27:24
Bos taurus Btau_4.6.1 Bos_taurus_UMD_3.1 1.7 09/25/2014 10:28:30
Bos taurus Btau_3.1 Bos_taurus_UMD_3.1 1.7 09/25/2014 10:54:29

Clinical Remap

Only human is supported for the RefSeqGene tab, so all that is needed is for you to select the sequence upon which your features are annotated (either an assembly or RefSeqGenes) and the sequences to which you want the features mapped (either RefSeqGenes or an assembly). 

Alt loci remap

Alt loci remap allows you to map data between the Primary Assembly and the Alternate Loci/Patches that may be available for an assembly. Only assemblies produced by the Genome Reference Consortium are supported on this page. All you need to select on this page is the organism and the assembly; the software will determine the direction in which you want to map.

NOTE: For both Clinical Remap and Alt loci remap, if you map FROM an assembly to either the RefSeqGenes or the Alternate Loci/Patches, you may have a lot of failed features, as both of these sequence sets cover only a fraction of the genome. To see genome coverage for Alternate Loci/Patches, see the GRC pages for human and mouse.

Remapping Options

Some options are available that allow you to configure the stringency of remapping. These options are only configurable in the Assembly-Assembly tab.

  1. Minimum ratio of bases that must be remapped (default: 0.5): This option specifies the fraction of the interval that must be remapped. Raising this value increases the stringency of the remapping process.
  2. Maximum ratio for difference between the source length and the target length (default 2.0): This feature allows the remapping algorithm to tolerate insertions and deletions in the alignment. This is calculated by taking the interval length on the target assembly (stop-start+1) and dividing it by the interval length on the source assembly (stop-start+1). An insertion or deletion in the target assembly will affect this ratio. Lowering this value will increase the stringency of the remapping process.
  3. Allow multiple locations to be returned (default: on): We perform alignments in two phases (see 'About our alignments'). Selecting this option will allow the 'Second Pass' alignments to be used and improve coordinate projection in regions of duplication. This can also lead to multiple features being remapped to the same location.
  4. Merge Fragments (default: on): An insertion in the target assembly will split a feature on the source assembly into two locations. Selecting this option will merge these two locations into a single location in the annotation file. Turning this feature off will increase the stringency of the remapping process, specifically in cases where there is an insertion in the target sequence, as each remapped interval will be compared to the original interval.
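
The way the first two thresholds interact can be sketched in a few lines of Python. This is an illustration of the ratios described above, not Remap's actual implementation: the function name, the `remapped_bases` argument and the order of the two checks are assumptions made for the example.

```python
def check_interval(source_start, source_stop, target_start, target_stop,
                   remapped_bases, min_ratio=0.5, max_ratio=2.0):
    """Classify one remapped interval against the two stringency thresholds.

    Hypothetical helper: `remapped_bases` is the number of source bases that
    could be projected through the alignment (an assumed input).
    """
    # Interval lengths use inclusive coordinates: stop - start + 1.
    source_len = source_stop - source_start + 1
    target_len = target_stop - target_start + 1

    # Option 1: too small a fraction of the source interval was remapped.
    if remapped_bases / source_len < min_ratio:
        return "LOWCOV"

    # Option 2: the target interval is too long relative to the source.
    if target_len / source_len > max_ratio:
        return "EXPANDED"

    return "OK"


# A 100 bp feature that lands on a 250 bp target interval has a length
# ratio of 2.5, which exceeds the default maximum of 2.0.
print(check_interval(1, 100, 1, 250, remapped_bases=100))
```

Raising `min_ratio` or lowering `max_ratio` in this sketch rejects more intervals, which mirrors the stringency behavior described for options 1 and 2.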

The merge function can help you remap features that cross an assembly gap, or have a large insertion that causes a gap in the alignment.

Example of a feature crossing a gap

Figure 1: A region with a feature that crosses an assembly gap. This feature was successfully remapped because the merge function was on.

However, in regions with messy alignments, the merge function can cause a feature to be remapped to the same or overlapping positions. This only happens when using the Second Pass alignments for remapping, as these alignments are not guaranteed to be unique.

Region with complicated alignments in the second pass.

Figure 2: A region with nice First Pass alignments and many Second Pass alignments. 

Using the merge function, this feature remaps to six locations in GRCh37, one using the First Pass alignments and five using the Second Pass. These are easily distinguished using the remap report as the 'recip' column specifies whether the first pass or second pass alignments were used.

remap report for feature in region with complicated second pass alignments.

Figure 3: Remap report for feature with multiple locations returned due to complicated second pass alignments.

These features are relatively easy to identify in a post processing step, or you can turn the merge function off. This will, however, negatively affect features that cross a gap. You may need to review the alignments (which you can do using the Genome Workbench project files) to determine the best course of action.
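
One possible post-processing step is sketched below. It assumes the downloaded mapping report (described under 'Output files') has already been split into rows of tab-separated fields, that the feature name is in column 1 and 'recip' is in column 17, and that the 'recip' values are exactly 'First Pass' and 'Second Pass'. The function name is hypothetical.

```python
from collections import defaultdict


def multi_pass_features(rows):
    """Return names of features placed by both First and Second Pass alignments.

    `rows` is an iterable of lists of strings, one list per report line.
    Column positions are assumptions taken from the report description.
    """
    passes_seen = defaultdict(set)
    for row in rows:
        # Skip the header line and rows too short to carry a 'recip' value.
        if len(row) < 17 or row[0].startswith("#"):
            continue
        passes_seen[row[0]].add(row[16])

    return sorted(
        name for name, seen in passes_seen.items()
        if {"First Pass", "Second Pass"} <= seen
    )
```

Features returned by this function are candidates for manual review; whether the Second Pass placements should be kept depends on the alignments in that region.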

Note: Alignments are processed in a strand specific manner. If a feature aligns to a region for which there are alignments on both strands, you may get a placement returned for the plus and the minus strand. Using the merge feature may increase the chances of this as merge helps to span alignment gaps. Turning merge off will cause a decrease in remapped features as gaps will not be crossed on either strand.

Configuring Remapping Parameters via URLs

There are several parameters that can be added to the NCBI Remap URL to pre-configure the mapping parameters that will be used.

Parameter Definition Example
tab remapping category (allowed values: asm|rsg|alt-loci). Specifies the type of remapping to be done: assembly-assembly, clinical or alt loci. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=rsg&in_fmt=bed&out_fmt=bed
src_org Organism. Allowed values: scientific (binomial) name of organism. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=asm&src_org=Apis mellifera
src_asm Source assembly. Allowed value: assembly accession.version. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=asm&src_org=Apis mellifera&src_asm=GCF_000002195.1
tgt_asm Target assembly. Allowed value: assembly accession.version. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=asm&src_org=Apis mellifera&src_asm=GCF_000002195.1&tgt_asm=GCF_000002195.4
min_ratio Minimum ratio of bases that must be remapped. Allowed value: positive number. If not provided, defaults to 0.5. See "Remapping Options" for more information. Only valid for tab=asm|alt-loci. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=asm&src_org=Homo sapiens&src_asm=GCF_000001405.13&tgt_asm=GCF_000001405.26&min_ratio=0.5
max_ratio Maximum ratio for difference between source length and target length. Allowed value: positive number. If not provided, defaults to 2.0. See "Remapping Options" for more information. Only valid for tab=asm|alt-loci. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=asm&src_org=Homo sapiens&src_asm=GCF_000001405.13&tgt_asm=GCF_000001405.26&max_ratio=2.0
allow_locations Allow multiple locations to be returned. Allowed values: true|false. If not provided, defaults to true. See "Remapping Options" for more information. Only valid for tab=asm|alt-loci. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=asm&src_org=Homo sapiens&src_asm=GCF_000001405.13&tgt_asm=GCF_000001405.26&allow_locations=true
merge_fragments Merge split features into a single location in the annotation file. Allowed values: true|false. If not provided, defaults to true. See "Remapping Options" for more information. Only valid for tab=asm|alt-loci. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=asm&src_org=Homo sapiens&src_asm=GCF_000001405.13&tgt_asm=GCF_000001405.26&merge_fragments=true
data_from Identifies source of data to be remapped. Allowed values: assembly accession.version|LRG|RefSeqGene. Only valid for tab=rsg. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=rsg&data_from=RefSeqGene
data_to Identifies target of data to be remapped. Allowed values: assembly accession.version|LRG|RefSeqGene. Only valid for tab=rsg. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=rsg&data_to=GCF_000001405.26
any_refseq Set of RefSeqs or LRGs to which data should be remapped (any or user-specified). Allowed values: true|false. Only valid if tab=rsg and data_from=acc.ver. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=rsg&data_from=GCF_000001405.25&data_to=RefSeqGene&any_refseq=true
without_refseq Provide remapped locations on NMs/NPs even if there is no RefSeqGene or LRG. Allowed values: true|false. Only valid if tab=rsg and data_from=acc.ver. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=rsg&data_from=GCF_000001405.26&data_to=RefSeqGene&without_refseq=true
with_refseq Provide remapped locations on NMs/NPs associated with RefSeqGenes or LRGs. Allowed values: true|false. Only valid if tab=rsg. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=rsg&data_from=GCF_000001405.26&data_to=RefSeqGene&with_refseq=true
in_fmt Format of input file. Allowed values: guess|hgvs|bed|gff|gff3|gvf|gtf|vcf|asnt|asnb|region. If not provided, defaults to guess. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=asm&src_org=Homo sapiens&src_asm=GCF_000001405.13&tgt_asm=GCF_000001405.26&min_ratio=0.5&max_ratio=2.0&allow_locations=true&merge_fragments=true&in_fmt=bed
out_fmt Format of output annotation file. Allowed values: guess|hgvs|bed|gff|gff3|gvf|gtf|vcf|asnt|asnb|region. If not provided, defaults to guess. http://www.ncbi.nlm.nih.gov/genome/tools/remap/#tab=asm&src_org=Homo sapiens&src_asm=GCF_000001405.13&tgt_asm=GCF_000001405.26&min_ratio=0.5&max_ratio=2.0&allow_locations=true&merge_fragments=true&in_fmt=region&out_fmt=region
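
A pre-configured URL can also be assembled programmatically. The sketch below joins parameters from the table into the fragment after `#`. The helper name is hypothetical, and percent-encoding the space in organism names is an assumption: the examples above show a literal space, and this page does not state how the fragment is decoded.

```python
from urllib.parse import quote

BASE_URL = "http://www.ncbi.nlm.nih.gov/genome/tools/remap/"


def build_remap_url(params):
    """Build a Remap URL from a dict of parameter names and values.

    Values are percent-encoded (an assumption, see the note above).
    Pass booleans as the strings "true" or "false", as in the table.
    """
    fragment = "&".join(
        f"{key}={quote(str(value))}" for key, value in params.items()
    )
    return f"{BASE_URL}#{fragment}"


url = build_remap_url({
    "tab": "asm",
    "src_org": "Apis mellifera",
    "src_asm": "GCF_000002195.1",
    "tgt_asm": "GCF_000002195.4",
})
print(url)
```

Parameters are emitted in dictionary insertion order, so the resulting URL follows the same order as the examples in the table.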

Providing Data

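We accept file formats that are commonly used in the bioinformatics community. The file formats recognized by the in_fmt URL parameter (see 'Configuring Remapping Parameters via URLs') are: hgvs, bed, gff, gff3, gvf, gtf, vcf, asnt and asnb.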

The default behavior is to provide the remapped annotation file in the same format as the input file, but you can specify a different format for the output.
If you have a small amount of data, you can just copy and paste the data in the large text box labeled 'Paste data here'. Otherwise, you can just upload the data file.
Please note: the larger your file is, the longer the remapping process will take. If you find that the process is taking a very long time, or failing, you may want to split your file into smaller ones, perhaps based on chromosome assignment. There is also an absolute limit on the amount of RAM available to the system. If this is exceeded, Remap will fail. If this happens, try again with a smaller file.
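
Splitting by chromosome can be done with a short script. The sketch below assumes a tab-delimited format such as BED, where the first column of each data line is the sequence ID. The function name is hypothetical, and header lines (those starting with '#', 'track' or 'browser') are simply skipped here; for formats such as VCF, the header would need to be copied into each piece.

```python
from collections import defaultdict


def split_by_sequence(lines):
    """Group tab-delimited annotation lines by the sequence ID in column 1.

    Returns a dict mapping each sequence ID to its lines, in input order.
    """
    groups = defaultdict(list)
    for line in lines:
        # Skip blank lines and header lines (assumed prefixes).
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        seq_id = line.split("\t", 1)[0]
        groups[seq_id].append(line)
    return dict(groups)


example = ["chr1\t100\t200\n", "chr2\t50\t75\n", "chr1\t300\t400\n"]
for seq_id, seq_lines in split_by_sequence(example).items():
    print(seq_id, len(seq_lines))
```

Each group can then be written to its own file and submitted separately.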

You may also provide data in the text box provided. In addition to the formats described above, you can put a region into the text box. For example:

chr1:10349-25000
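
If you generate region strings from a script, it can help to validate them first. The sketch below assumes the syntax is exactly `sequence:start-stop` as in the example above; Remap may accept other variants that this page does not describe. The function name is hypothetical.

```python
import re

# Assumed syntax: sequence ID, a colon, then start-stop as integers.
REGION_RE = re.compile(r"^(?P<seq>[^:\s]+):(?P<start>\d+)-(?P<stop>\d+)$")


def parse_region(text):
    """Parse 'seq:start-stop' into (seq, start, stop) with integer coordinates."""
    match = REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a region string: {text!r}")
    start = int(match["start"])
    stop = int(match["stop"])
    if start > stop:
        raise ValueError(f"start is greater than stop: {text!r}")
    return match["seq"], start, stop


seq, start, stop = parse_region("chr1:10349-25000")
# Inclusive interval length, stop - start + 1, as used elsewhere on this page.
print(seq, start, stop, stop - start + 1)
```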

Data options (Clinical Remap tab only)

Mapping from a RefSeqGene(s) to an assembly: In this case, an additional option is provided (checked by default). This will allow the service to return features on both the genomic sequences as well as any transcripts (NMs) or proteins (NPs) available at that locus.

Mapping from an assembly to RefSeqGenes: In this case, you have the ability to map to any available RefSeqGene (default) or you can specify a list of RefSeqGenes as targets. If you select to map to any available RefSeqGene there are two additional options for providing locations on transcripts (NMs) or proteins. One is to provide the transcript (NM) and protein (NP) locations for features that map to RefSeqGenes and the other is to provide transcript (NM) and protein (NP) locations even if there isn't a RefSeqGene where your feature maps. Not all genes in the genome have a RefSeqGene. There is a link on the page that allows you to request the construction of a RefSeqGene if one is not available for your gene of interest.

Output files

Summary Data: This is a global report to provide an overview of remapping results. The format of the report is (by column):

  • ID: The sequence ID in the source assembly (often something like 'chr1' or NC_000001.9).
  • Source Features: The number of features on the ID in the source file.
  • Remapped Features: The number of features that could be projected onto the Target assembly.
  • Source Intervals: The number of intervals on the ID in the source file. This can be larger than the number of features because some features have more than one sequence interval; for example, mRNA features will often have multiple intervals (corresponding to exons).
  • Remapped Intervals: The number of intervals that could be projected onto the Target assembly.

The summary data appears on the web page and is available for download.

Mapping Report: This is a report that provides a feature by feature breakdown of the remapping status. The format of this report on the web page is (by column):

  • Feature: The name or ID of the feature (the source of this will depend on the format submitted, but it should be possible to robustly associate the information in this column with the data in the input file).
  • Src. Intervals: Number of intervals the feature has in the source file.
  • Remap Intervals: Number of intervals that were projected to the target assembly.
  • Src location: The feature location in the input file.
  • Src length: The length of the feature in the input file.
  • Map Location: Projected location (or reason that the remap failed) on the target assembly.
  • Map length: Length of the feature on the target assembly.
  • Coverage: Coverage of feature on the target assembly.

Only a few lines of this report are displayed on the web page, but the entire report is available for download as a tab-separated file (tsv) that can be easily parsed or loaded into a spreadsheet program. The downloaded report has 18 columns as follows:

  1. #feat_name: user supplied feature name. If no feature name is supplied, a name is calculated using the line number in the file or the location.
  2. source_int: The number of intervals in the source file (useful for tracking features with multiple intervals, like genes).
  3. mapped_int: the number of intervals in the remapped file.
  4. source_id: sequence identifier the feature maps to in the source file.
  5. mapped_id: sequence identifier the feature maps to on the target assembly.
  6. source_length: length of the feature on the source assembly.
  7. mapped_length: length of the feature on the target assembly.
  8. source_start: first base of the feature on the source assembly.
  9. source_stop: last base of the feature on the source assembly.
  10. source_strand: strand the feature is annotated on in the source assembly.
  11. source_sub_start: first base of sub interval on the source assembly (e.g. an exon feature).
  12. source_sub_stop: last base of sub interval on the source assembly (e.g. an exon feature).
  13. mapped_start: first base of remapped interval.
  14. mapped_stop: last base of remapped interval.
  15. mapped_strand: strand of remapped base.
  16. coverage: This is calculated by taking the ratio of the mapped_length to the source_length. If coverage =1 the remapped and source interval are identical. A coverage score of less than 1 indicates a deletion in the target assembly and a score of greater than 1 indicates an insertion in the target assembly.
  17. recip: Two values are possible in this column. 'First Pass' means the remapping is based on the 'First Pass' or reciprocal best hit alignments. 'Second Pass' means the remapping is based on the non-reciprocal best hit alignments.
  18. asm_unit: The assembly unit to which the mapped_id belongs. For more information on assembly units, see: http://www.ncbi.nlm.nih.gov/projects/genome/assembly/model.shtml

Features that don't remap will have the word 'NOMAP' in column 15 and the reason for not mapping in column 16. The reasons are:

  • NOALIGN: There was no alignment for this region.
  • LOWCOV: The percent of the interval covered in the alignment was below the coverage threshold specified in the 'Remapping Options' (Minimum ratio of bases that must be remapped).
  • EXPANDED: The ratio of the length on the target sequence versus the length on the source sequence is greater than specified in the remap options (default is 2).
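
The failure reasons can be tallied from the downloaded report with a few lines of Python. This is a minimal sketch that takes the column positions from the sentence above (counted from 1, so 'NOMAP' is at index 14 and the reason at index 15) and assumes the header line starts with '#'. The function name is hypothetical.

```python
import csv
import io
from collections import Counter


def count_failures(report_text):
    """Count NOMAP reasons in the text of a tab-separated mapping report."""
    reasons = Counter()
    reader = csv.reader(io.StringIO(report_text), delimiter="\t")
    for row in reader:
        # Skip empty lines and the '#feat_name' header line.
        if not row or row[0].startswith("#"):
            continue
        # Column 15 holds 'NOMAP' and column 16 the reason (1-based, assumed).
        if len(row) >= 16 and row[14] == "NOMAP":
            reasons[row[15]] += 1
    return reasons
```

To use it on a downloaded file, read the file contents into a string and pass that string to `count_failures`; a high LOWCOV count may suggest lowering the minimum ratio, while NOALIGN features have no alignment to remap through.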

Clinical Remap Only Output:

When you run Clinical Remap, we make a call to the Variation Reporter to provide an analysis of your variant data. We then insert the report produced by the Variation Reporter into the Remap output. For more information on the Variation Reporter report, see the help page.

Annotation Data: This file contains only the remapped features, in the format specified on the input page. No sample data is shown on the web page, but the file is available for download and display in your favorite viewer.

Genome Workbench Files: These are files that can be loaded directly into our client side viewer called Genome Workbench. They contain the sequence information for both the source and target assemblies, the assembly-assembly alignments used in the remapping and feature annotations (both the source features and the remap features). These files are available for download and are very useful for understanding how the alignments influenced the feature remapping (see Figure 4).

Example of a GBench file produced by Remap

Figure 4: View of remapping in genome workbench. The sequence being shown in this view is the Target assembly. The tracks are (in order from the top):

  • Ruler: showing basepair coordinates.
  • Sequence: for some organisms this will be colored and for others it will be grey. This track will show you the actual base pairs if you zoom in enough.
  • Tiling Path: Shows the INSDC sequences used to construct the sequence.
  • Genes Track: Gene annotation from NCBI annotation process.
  • Alignments: Alignment to the Source assembly. This will have the 'First Pass' alignments and the 'Second Pass' alignments if the 'Allow duplications' option was checked. The alignments are zoomed to the base pair level. Mismatches are colored in red. Insertions are shown using a blue triangle (none in this view).
  • SNP features: Variation features defined by dbSNP.
  • Remapped features: In this example, features from dbVar were mapped from NCBI36 to GRCh37.p9. Only remapped features are shown on the Target assembly. If you open a sequence that is part of the Source assembly, you can see the original features.
Write to the Help Desk