Organism name:
Taeniopygia guttata (zebra finch)
Submitter:
Washington University Genome Sequencing Center
Assembly level:
Genome representation:
RefSeq category:
representative genome
GenBank assembly accession:
GCA_000151805.2 (latest)
RefSeq assembly accession:
GCF_000151805.1 (latest)
RefSeq assembly and GenBank assembly identical:
no
  • Only in RefSeq: chromosome MT (in non-nuclear assembly-unit)
  • Data displayed for RefSeq version
WGS Project:
Assembly method:
PCAP v. 2008
Genome coverage:
Sequencing technology:

The zebra finch DNA for shotgun sequencing, and for BAC and cosmid libraries, was derived from a single male (Black 17) domesticated zebra finch from the laboratory of Arthur P. Arnold in the Department of Physiological Science at UCLA, ... Los Angeles, CA, USA. The parents of this male hatched in the same clutch in an aviary of group-housed zebra finches, and therefore may have been brother and sister. A male BAC library was constructed from the same bird by Barbara Blackmon at the Clemson University Genomics Institute (this library is NOT the same as the BAC library available from the Arizona Genome Institute, which was made from several individual females). The initial assembly was generated using PCAP (Huang et al., 2006) from about 6X coverage in whole-genome shotgun reads, a combination of plasmid, fosmid and bacterial artificial chromosome (BAC)-end read pairs. The sequences of 35 finished BAC clones were incorporated into the final assembly. The T. guttata physical map contains 108,725 clones, giving about 10X depth of coverage, and is organized into 2,724 contigs.

Of the 1.2 Gb genome, 1.0 Gb was ordered and oriented along 33 zebra finch chromosomes and 1 linkage group. An additional 36 Mb was localized to specific chromosomes or linkage groups, but was not ordered and oriented. For the initial PCAP assembly (prior to removal of contaminants and of contigs/supercontigs shorter than 2 kb), there were 92,299 major contigs (126,053 total contigs) with an N50 contig length of 39 kb (n=8,037). There were 37,252 major supercontigs (37,698 total supercontigs) with an N50 supercontig length of 10.4 Mb (n=29).
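
The N50 figures quoted above can be illustrated with a minimal sketch. It assumes the usual definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length) and reads the "n=" values as the size of that set, which is an interpretation rather than something the record states. The function name `n50_l50` and the toy lengths are hypothetical and are not taken from the assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths.

    N50: length of the sequence at which the running total, taken from
         longest to shortest, first reaches half of the summed length.
    L50: how many sequences were needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy contig lengths (hypothetical values for illustration only).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))  # (70, 2): 80 + 70 = 150, half of the total 300
```

Applied to real contig or supercontig lengths, the same function would return the pair reported in the form "N50 (n=L50)".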

The zebra finch chromosomes were named based on their homologous chromosomes in Gallus gallus. For those chromosomes where multiple zebra finch chromosomes correspond to a single chicken chromosome, a letter was appended to the chromosome name. The lookup table can be found below for cross-referencing the Gallus gallus homologous names with the current naming convention for the zebra finch.

AGP Generation Details
To create chromosomal sequences, data from the Sheffield Linkage Map and the physical map were integrated with the WGS assembly data. Using sequence comparison, T. guttata SNP marker sequences were assigned to contigs (contiguous stretches of DNA) in the WGS assembly. Based on these marker assignments, the supercontigs (sets of ordered/oriented contigs linked by virtue of read pairing data) were assigned to a chromosome based on a majority rule (>50% of markers assigned to the same chromosome). The supercontigs were initially positioned along chromosomes based on their median marker position, and initially oriented based on relative marker order along the supercontig. The physical map was also linked to the sequence assembly by using BAC end sequence links and in silico digests of the assembly to create "ultracontigs", ordered/oriented lists of "supercontigs". Following these initial placements, the WGS assembly read pairing data were used, where possible, to aid in orientation and confirm order. For the Z chromosome, marker order was also determined by FISH (Art Arnold, personal communication) and integrated again with the linkage map, physical map and assembly. All discrepancies between the various maps were manually reviewed and a combined super/ultracontig order was established by reconciling the data from the Sheffield, assembly and physical maps. Available EST data were also used in reviewing the assembly. Alignments with the chicken genome were also examined and used as an aid in orientation, particularly when other available zebra finch-specific data were inconclusive. The location of the centromere is known only for the Z chromosome. Thus, no other centromeres were placed in the current chromosomal assemblies.
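
The majority-rule assignment and median-based positioning described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline actually used: the function name `place_supercontig`, the input layout (one `(chromosome, map_position)` tuple per marker) and the choice to take the median over only the markers on the winning chromosome are all hypothetical. Orientation, ultracontig construction and manual review are not modeled.

```python
from collections import Counter
from statistics import median


def place_supercontig(markers):
    """Assign one supercontig to a chromosome by majority rule.

    markers: list of (chromosome, map_position) tuples, one per SNP
             marker located on the supercontig (hypothetical layout).
    Returns (chromosome, median_position), or None when no chromosome
    holds more than 50% of the markers.
    """
    if not markers:
        return None
    votes = Counter(chrom for chrom, _ in markers)
    chrom, count = votes.most_common(1)[0]
    # Strict majority: more than half of all markers must agree.
    if count * 2 <= len(markers):
        return None
    # Assumption: only markers on the winning chromosome set the position.
    positions = [pos for c, pos in markers if c == chrom]
    return chrom, median(positions)


# Hypothetical markers: three on Chr2 and one stray hit on Chr5.
example = [("Chr2", 10.0), ("Chr2", 14.0), ("Chr5", 3.0), ("Chr2", 12.0)]
print(place_supercontig(example))  # ('Chr2', 12.0)
```

A supercontig whose markers split evenly between two chromosomes returns `None` here, which corresponds to the cases the text says were resolved by manual review.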

Cross-reference of the zebra finch chromosome names used for this release (TGU), the homologous chicken chromosome names (GGA), and the finch chromosome names suggested by Itoh et al., 2005*.

TGU       GGA                   Itoh et al., 2005*
Chr1      1                     3
Chr1A     1                     4
Chr1B     1                     NA
Chr2      2                     1
Chr3      3                     2
Chr4      4                     5
Chr4A     4                     microchromosome
Chr5      5                     6
Chr6      6                     7
Chr7      7                     8
Chr8      8                     9
Chr9      9                     10
Chr10     10                    NA
Chr11     11                    NA
Chr12     12                    NA
Chr13     13                    NA
Chr14     14                    NA
Chr15     15                    NA
Chr16     16                    NA
Chr17     17                    NA
Chr18     18                    NA
Chr19     19                    NA
Chr20     20                    NA
Chr21     21                    NA
Chr22     22                    NA
Chr23     23                    NA
Chr24     24                    NA
Chr25     25                    NA
Chr26     26                    NA
Chr27     27                    NA
Chr28     28                    NA
LGE22     LGE22C19W28_E50C23    NA
LGE22A    LGE22C19W28_E50C23    NA
ChrZ      Z                     Z

* Itoh et al., 2005. Chromosome Res. 2005;13(1):47-56.

DNA source - Art Arnold, Department of Physiological Science, UCLA 
Genome Sequence - The Genome Center, Washington University School of Medicine 
Sequence Assembly and Chromosomal Sequence Construction - The Genome Center, Washington University School of Medicine 
Zebra finch linkage map - Jessica Stapley, Tim Birkhead, Terry Burke and Jon Slate, Department of Animal & Plant Sciences, University of Sheffield, Sheffield, UK 
Z Map/FISH Mapping - Itoh Yuichiro and Art Arnold, Department of Physiological Science, UCLA 
Global statistics

Total sequence length                         1,232,135,591
Total assembly gap length                     9,270,900
Gaps between scaffolds                        326
Number of scaffolds                           37,422
Scaffold N50                                  8,236,790
Scaffold L50                                  37
Number of contigs                             124,806
Contig N50                                    38,639
Contig L50                                    8,016
Total number of chromosomes and plasmids      36
Number of component sequences (WGS or clone)  124,806
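
As a quick arithmetic check on the figures above, the ungapped sequence length and the fraction of the assembly made up of gaps follow directly from the first two values. The variable names below are illustrative only, and the snippet uses no numbers beyond those listed in the global statistics.

```python
# Values copied from the global statistics above.
total_sequence_length = 1_232_135_591
total_gap_length = 9_270_900

# Ungapped length is the total length minus all gap bases.
ungapped_length = total_sequence_length - total_gap_length
gap_fraction = total_gap_length / total_sequence_length

print(ungapped_length)  # 1222864691
print(f"{gap_fraction:.2%} of the assembly is gap sequence")
```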

Global assembly definition

Assembly Unit Name
Primary Assembly
Assembly Unit: Primary Assembly (GCF_000151815.1)
Molecule name      GenBank sequence       RefSeq sequence   Unlocalized sequences count
Chromosome 1       CM000515.1        =    NC_011462.1       169
Chromosome 1A      CM000516.1        =    NC_011463.1       63
Chromosome 1B      CM000517.1        =    NC_011464.1       6
Chromosome 2       CM000518.1        =    NC_011465.1       215
Chromosome 3       CM000519.1        =    NC_011466.1       74
Chromosome 4       CM000520.1        =    NC_011467.1       109
Chromosome 4A      CM000521.1        =    NC_011468.1       32
Chromosome 5       CM000522.1        =    NC_011469.1       73
Chromosome 6       CM000523.1        =    NC_011470.1       67
Chromosome 7       CM000524.1        =    NC_011471.1       51
Chromosome 8       CM000525.1        =    NC_011472.1       47
Chromosome 9       CM000526.1        =    NC_011473.1       46
Chromosome 10      CM000527.1        =    NC_011474.1       71
Chromosome 11      CM000528.1        =    NC_011475.1       33
Chromosome 12      CM000529.1        =    NC_011476.1       39
Chromosome 13      CM000530.1        =    NC_011477.1       63
Chromosome 14      CM000531.1        =    NC_011478.1       29
Chromosome 15      CM000532.1        =    NC_011479.1       46
Chromosome 16      CM000533.1        =    NC_011480.1       8
Chromosome 17      CM000534.1        =    NC_011481.1       30
Chromosome 18      CM000535.1        =    NC_011482.1       31
Chromosome 19      CM000536.1        =    NC_011483.1       25
Chromosome 20      CM000537.1        =    NC_011484.1       39
Chromosome 21      CM000538.1        =    NC_011485.1       51
Chromosome 22      CM000539.1        =    NC_011486.1       30
Chromosome 23      CM000540.1        =    NC_011487.1       31
Chromosome 24      CM000541.1        =    NC_011488.1       23
Chromosome 25      CM000542.1        =    NC_011489.1       24
Chromosome 26      CM000543.1        =    NC_011490.1       78
Chromosome 27      CM000544.1        =    NC_011491.1       15
Chromosome 28      CM000545.1        =    NC_011492.1       20
Chromosome LGE22   CM000549.1        =    NC_011496.1       11
Chromosome LG2     CM000547.1        =    NC_011494.1       0
Chromosome LG5     CM000548.1        =    NC_011495.1       0
Chromosome Z       CM000546.1        =    NC_011493.1       52

Assembly statistics

Molecule           Sequence role          Total length    Scaffold count  Ungapped length  Scaffold N50  Spanned gaps  Unspanned gaps
All                Assembled molecule     1,232,118,738   37,421          1,222,847,838    8,236,790     87,384        326
Chromosome 1       All                    119,725,188     191             119,122,688      23,827,572    6,004         21
Chromosome 1       Assembled molecule     118,548,696     22              117,962,396      23,827,572    5,842         21
Chromosome 1       Unlocalized scaffolds  1,176,492       169             1,160,292        8,950         162           0
Chromosome 1A      All                    74,340,593      79              73,989,993       11,932,866    3,491         15
Chromosome 1A      Assembled molecule     73,657,157      16              73,320,157       11,932,866    3,355         15
Chromosome 1A      Unlocalized scaffolds  683,436         63              669,836          16,305        136           0
Chromosome 1B      All                    1,225,777       11              1,203,677        183,949       217           4
Chromosome 1B      Assembled molecule     1,083,483       5               1,064,283        605,109       188           4
Chromosome 1B      Unlocalized scaffolds  142,294         6               139,394          78,481        29            0
Chromosome 2       All                    158,185,007     256             157,431,607      21,982,164    7,494         40
Chromosome 2       Assembled molecule     156,412,533     41              155,678,333      21,982,164    7,302         40
Chromosome 2       Unlocalized scaffolds  1,772,474       215             1,753,274        11,239        192           0
Chromosome 3       All                    113,988,967     88              113,571,667      17,903,995    4,160         13
Chromosome 3       Assembled molecule     112,617,285     14              112,226,385      17,903,995    3,896         13
Chromosome 3       Unlocalized scaffolds  1,371,682       74              1,345,282        145,786       264           0
Chromosome 4       All                    74,918,084      121             74,531,484       9,773,880     3,855         11
Chromosome 4       Assembled molecule     69,780,378      12              69,451,878       17,812,258    3,274         11
Chromosome 4       Unlocalized scaffolds  5,137,706       109             5,079,606        567,234       581           0
Chromosome 4A      All                    20,959,685      37              20,798,885       13,709,138    1,604         4
Chromosome 4A      Assembled molecule     20,704,505      5               20,547,105       13,709,138    1,570         4
Chromosome 4A      Unlocalized scaffolds  255,180         32              251,780          12,627        34            0
Chromosome 5       All                    64,885,757      83              64,517,157       56,928,224    3,677         9
Chromosome 5       Assembled molecule     62,374,962      10              62,026,862       56,928,224    3,472         9
Chromosome 5       Unlocalized scaffolds  2,510,795       73              2,490,295        823,309       205           0
Chromosome 6       All                    38,395,926      80              38,144,526       10,521,940    2,502         12
Chromosome 6       Assembled molecule     36,305,782      13              36,074,382       10,521,940    2,302         12
Chromosome 6       Unlocalized scaffolds  2,090,144       67              2,070,144        1,234,374     200           0
Chromosome 7       All                    40,443,615      67              40,207,015       13,892,040    2,351         15
Chromosome 7       Assembled molecule     39,844,632      16              39,617,032       13,892,040    2,261         15
Chromosome 7       Unlocalized scaffolds  598,983         51              589,983          20,417        90            0
Chromosome 8       All                    33,102,450      55              32,899,150       6,216,848     2,026         7
Chromosome 8       Assembled molecule     27,993,427      8               27,836,727       6,594,397     1,560         7
Chromosome 8       Unlocalized scaffolds  5,109,023       47              5,062,423        3,401,156     466           0
Chromosome 9       All                    27,606,416      53              27,417,816       9,760,426     1,880         6
Chromosome 9       Assembled molecule     27,241,186      7               27,058,286       9,760,426     1,823         6
Chromosome 9       Unlocalized scaffolds  365,230         46              359,530          10,529        57            0
Chromosome 10      All                    21,358,800      91              21,211,100       17,364,595    1,458         19
Chromosome 10      Assembled molecule     20,806,668      20              20,668,268       17,364,595    1,365         19
Chromosome 10      Unlocalized scaffolds  552,132         71              542,832          9,356         93            0
Chromosome 11      All                    21,695,725      43              21,564,225       5,375,236     1,306         9
Chromosome 11      Assembled molecule     21,403,021      10              21,274,721       5,375,236     1,274         9
Chromosome 11      Unlocalized scaffolds  292,704         33              289,504          10,632        32            0
Chromosome 12      All                    21,918,122      46              21,777,222       5,706,571     1,403         6
Chromosome 12      Assembled molecule     21,576,510      7               21,439,410       5,706,571     1,365         6
Chromosome 12      Unlocalized scaffolds  341,612         39              337,812          15,078        38            0
Chromosome 13      All                    19,609,859      66              19,435,859       7,825,618     1,738         2
Chromosome 13      Assembled molecule     16,962,381      3               16,828,981       8,523,521     1,332         2
Chromosome 13      Unlocalized scaffolds  2,647,478       63              2,606,878        624,384       406           0
Chromosome 14      All                    16,668,376      34              16,542,776       5,353,331     1,252         4
Chromosome 14      Assembled molecule     16,419,078      5               16,297,478       5,353,331     1,212         4
Chromosome 14      Unlocalized scaffolds  249,298         29              245,298          16,232        40            0
Chromosome 15      All                    14,783,235      50              14,634,635       8,717,526     1,483         3
Chromosome 15      Assembled molecule     14,428,146      4               14,286,746       8,717,526     1,411         3
Chromosome 15      Unlocalized scaffolds  355,089         46              347,889          9,886         72            0
Chromosome 16      All                    197,162         9               191,562          51,763        56            0
Chromosome 16      Assembled molecule     9,909           1               9,509            9,909         4             0
Chromosome 16      Unlocalized scaffolds  187,253         8               182,053          51,763        52            0
Chromosome 17      All                    11,856,417      33              11,733,617       6,344,184     1,226         2
Chromosome 17      Assembled molecule     11,648,728      3               11,529,228       6,344,184     1,193         2
Chromosome 17      Unlocalized scaffolds  207,689         30              204,389          11,284        33            0
Chromosome 18      All                    11,672,955      35              11,545,455       3,129,235     1,272         3
Chromosome 18      Assembled molecule     11,201,131      4               11,079,731       3,129,235     1,211         3
Chromosome 18      Unlocalized scaffolds  471,824         31              465,724          287,477       61            0
Chromosome 19      All                    11,785,677      27              11,671,877       8,620,634     1,137         1
Chromosome 19      Assembled molecule     11,587,733      2               11,476,933       8,620,634     1,107         1
Chromosome 19      Unlocalized scaffolds  197,944         25              194,944          9,819         30            0
Chromosome 20      All                    15,948,766      47              15,788,366       10,962,618    1,597         7
Chromosome 20      Assembled molecule     15,652,063      8               15,496,863       10,962,618    1,545         7
Chromosome 20      Unlocalized scaffolds  296,703         39              291,503          13,262        52            0
Chromosome 21      All                    7,836,880       56              7,742,480        1,450,219     940           4
Chromosome 21      Assembled molecule     5,979,137       5               5,913,537        1,595,399     652           4
Chromosome 21      Unlocalized scaffolds  1,857,743       51              1,828,943        1,450,219     288           0
Chromosome 22      All                    4,171,001       36              4,101,001        776,153       695           5
Chromosome 22      Assembled molecule     3,370,227       6               3,319,027        776,153       507           5
Chromosome 22      Unlocalized scaffolds  800,774         30              781,974          53,084        188           0
Chromosome 23      All                    6,742,412       38              6,630,812        1,956,248     1,110         6
Chromosome 23      Assembled molecule     6,196,912       7               6,098,312        1,956,248     980           6
Chromosome 23      Unlocalized scaffolds  545,500         31              532,500          82,706        130           0
Chromosome 24      All                    8,205,341       26              8,085,741        3,771,036     1,194         2
Chromosome 24      Assembled molecule     8,021,379       3               7,905,579        3,771,036     1,156         2
Chromosome 24      Unlocalized scaffolds  183,962         23              180,162          11,348        38            0
Chromosome 25      All                    1,745,884       31              1,706,884        407,827       384           6
Chromosome 25      Assembled molecule     1,275,379       7               1,247,679        407,827       271           6
Chromosome 25      Unlocalized scaffolds  470,505         24              459,205          101,413       113           0
Chromosome 26      All                    6,527,381       83              6,415,781        1,382,794     1,112         4
Chromosome 26      Assembled molecule     4,907,541       5               4,829,341        1,442,663     778           4
Chromosome 26      Unlocalized scaffolds  1,619,840       78              1,586,440        102,789       334           0
Chromosome 27      All                    4,826,244       58              4,738,444        666,370       836           42
Chromosome 27      Assembled molecule     4,618,897       43              4,535,397        666,370       793           42
Chromosome 27      Unlocalized scaffolds  207,347         15              203,047          19,819        43            0
Chromosome 28      All                    5,161,015       27              5,068,915        2,676,374     915           6
Chromosome 28      Assembled molecule     4,963,201       7               4,876,701        2,676,374     859           6
Chromosome 28      Unlocalized scaffolds  197,814         20              192,214          12,252        56            0
Chromosome LGE22   All                    1,336,381       18              1,305,181        254,912       306           6
Chromosome LGE22   Assembled molecule     883,365         7               866,065          254,912       167           6
Chromosome LGE22   Unlocalized scaffolds  453,016         11              439,116          273,445       139           0
Chromosome LG2     Assembled molecule     109,741         3               106,141          99,049        34            2
Chromosome LG5     Assembled molecule     16,416          2               15,716           8,815         6             1
Chromosome Z       All                    75,826,118      82              74,946,718       5,398,607     3,766         29
Chromosome Z       Assembled molecule     72,861,351      30              72,005,451       5,398,607     3,531         29
Chromosome Z       Unlocalized scaffolds  2,964,767       52              2,941,267        1,610,454     235           0
unplaced           Assembled molecule     174,341,365     35,359          172,051,665      5,941         22,897        0
Mitochondrion MT                          16,853
