| column | content | description |
| 1 | object | This is the identifier for the object being assembled. This can be a chromosome, contig or supercontig (scaffold). If the object is a chromosome, the naming convention is the chromosome number preceded by the letters chr (e.g. chr1). If the object is a contig or supercontig, the identifier needs to be unique within the assembly. |
| 2 | object_beg | The starting coordinates on the object in column 1. |
| 3 | object_end | The ending coordinates on the object in column 1. |
| 4 | part_number | The line count for the components that make up the object described in column 1. All components (sequence and gaps) are counted. |
| 5 | component_type | The sequence status of the component. Current values: A = Active Finishing; D = Draft HTG; F = Finished HTG; G = Whole Genome Finishing; P = Pre Draft; N = gap; O = Other sequence; W = WGS contig. The value in this field will determine the values of items in some of the remaining fields. Gap lines (N) have a different structure than sequence component lines. |
| 6a | component_id | If column 5 not equal to N: This is a unique identifier for the sequence component contributing to the object described in column 1. If the components have been submitted to a public repository (GenBank/EMBL/DDBJ) this value should be the accession.version of the component. Otherwise it should be an identifier that is unique within the assembly. |
| 6b | gap_length | If column 5 equal to N: This column represents the length of the gap. |
| 7a | component_start | If column 5 not equal to N: This column specifies the beginning of the component sequence that contributes to the object in column 1. (in component coordinates) |
| 7b | gap_type | If column 5 equal to N: This column specifies the gap type. Fundamentally, there are two types of gaps: captured and uncaptured. In some cases, uncaptured gaps are assigned biological value (e.g. centromere). |
| 8a | component_end | If column 5 not equal to N: This column specifies the end of the part of the component that contributes to the object in column 1. (in component coordinates) |
| 8b | linkage | If column 5 equal to N: This column indicates if there is evidence of linkage between the adjacent lines. Values: |
| 9a | orientation | If column 5 not equal to N: This column specifies the orientation of the component relative to the object in column 1. Values: |
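
The column layout above can be sketched as a small parser. This is a minimal illustration, not a reference implementation: the names `ComponentLine`, `GapLine` and `parse_agp_line` are hypothetical, and the linkage and orientation fields are kept as raw strings because their allowed values are not listed in the table. The sample identifiers and the values `fragment`, `yes` and `+` in the usage lines are placeholders chosen for illustration only.

```python
from typing import NamedTuple, Union


class ComponentLine(NamedTuple):
    """A sequence component line (column 5 is anything other than N)."""
    obj: str
    object_beg: int
    object_end: int
    part_number: int
    component_type: str
    component_id: str
    component_start: int
    component_end: int
    orientation: str


class GapLine(NamedTuple):
    """A gap line (column 5 is N); columns 6-8 carry gap information."""
    obj: str
    object_beg: int
    object_end: int
    part_number: int
    gap_length: int
    gap_type: str
    linkage: str


def parse_agp_line(line: str) -> Union[ComponentLine, GapLine]:
    """Split one tab-delimited line and interpret columns 6-9 by column 5."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 columns, got {len(fields)}")
    obj, beg, end, part = fields[0], int(fields[1]), int(fields[2]), int(fields[3])
    component_type = fields[4]
    if component_type == "N":
        # Gap line: 6b gap_length, 7b gap_type, 8b linkage.
        return GapLine(obj, beg, end, part, int(fields[5]), fields[6], fields[7])
    if len(fields) < 9:
        raise ValueError("component lines need 9 columns")
    # Component line: 6a component_id, 7a/8a component coordinates, 9a orientation.
    return ComponentLine(obj, beg, end, part, component_type,
                         fields[5], int(fields[6]), int(fields[7]), fields[8])


# Usage with placeholder values (identifiers and field values are illustrative).
comp = parse_agp_line("chr1\t1\t1000\t1\tF\tAC000001.1\t1\t1000\t+\n")
gap = parse_agp_line("chr1\t1001\t1100\t2\tN\t100\tfragment\tyes\n")
```

Dispatching on column 5 first mirrors the table: that one field decides whether columns 6 to 8 are read as component or gap information.
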
Extended comments:
1. Columns should be tab delimited. Lines end with a new line (\n). There should be no
extra space around the individual tokens.
2. All coordinates given in the file are 1-based and inclusive (not 0-based), i.e. the first
base of an object is 1 (not 0).
3. Evidence of linkage. In general, evidence of linkage is provided by end pairs
(sometimes referred to as mate pairs), although other evidence could be used (such as
transcript alignments). In some cases, evidence of linkage may be indirect. For example,
given the following supercontig:
A---B---C----D
where A, B, C, and D are components, there could be end pairs linking A and B
and end pairs linking A and C. There might be no pairs linking B and C, but
their linkage is implied.
4. If the object is a contig or supercontig, the object should not end with a gap line.
5. Coordinates are all with respect to the plus strand, no matter the orientation of the
component.
6. object_beg should always be less than or equal to object_end.
7. component_start should always be less than or equal to component_end.
8. Any text after a # symbol is assumed to be a comment.
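
Several of the comments above (2, 4, 6, 7 and 8) translate directly into checks. The sketch below is one possible way to express them, assuming the lines of a single object are passed in order. The function names are hypothetical. The check that a line's object span has the same length as its component span or gap_length is an assumption added here to illustrate 1-based inclusive arithmetic; it is not stated in the comments.

```python
def strip_comment(line):
    """Drop everything from the first '#' onward (comment 8)."""
    return line.split("#", 1)[0].rstrip()


def span_length(beg, end):
    """Length of a 1-based inclusive interval (comment 2): 1..100 covers 100 bases."""
    return end - beg + 1


def check_agp_lines(lines, object_is_chromosome=False):
    """Return a list of problems found in the lines of one object."""
    rows = []
    for raw in lines:
        text = strip_comment(raw)
        if text:  # skip blank and comment-only lines
            rows.append(text.split("\t"))

    problems = []
    for n, f in enumerate(rows, start=1):
        obj_beg, obj_end = int(f[1]), int(f[2])
        if obj_beg > obj_end:  # comment 6
            problems.append(f"row {n}: object_beg > object_end")
        if f[4] == "N":
            # Assumption: the gap's object span should match gap_length.
            if span_length(obj_beg, obj_end) != int(f[5]):
                problems.append(f"row {n}: object span differs from gap_length")
        else:
            c_start, c_end = int(f[6]), int(f[7])
            if c_start > c_end:  # comment 7
                problems.append(f"row {n}: component_start > component_end")
            # Assumption: object span and component span should be equal in length.
            elif span_length(c_start, c_end) != span_length(obj_beg, obj_end):
                problems.append(f"row {n}: object and component spans differ")

    # Comment 4: a contig or supercontig should not end with a gap line.
    if rows and rows[-1][4] == "N" and not object_is_chromosome:
        problems.append("object ends with a gap line")
    return problems


# Usage with placeholder data (identifiers and field values are illustrative).
example = [
    "# comment-only line\n",
    "ctg1\t1\t100\t1\tW\tc1\t1\t100\t+\n",
    "ctg1\t101\t150\t2\tN\t50\tfragment\tyes\n",
    "ctg1\t151\t250\t3\tW\tc2\t1\t100\t-\n",
]
problems = check_agp_lines(example)
```

Because coordinates are inclusive, the length of a span is `end - beg + 1`, and the third line's component coordinates stay ascending even though its orientation differs (comment 5).
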