Assembly strategy. Effects of separate assembly of diverged homologs by a single-copy assembler such as phrap. (a) Hypothetical configuration of genomic sequence. Two diverged homologous regions are shown in pink and brown, flanked by nearly homozygous sequence shown in blue. Reads containing pink sequence look different from brown reads and must not assemble into the same contig. In the blue regions, reads from either homolog look alike and are assembled together. (b and c) The two possible ways in which these conditions can be met by the assembler. In both cases, two contigs are produced, one containing the pink reads and the other the brown reads. In b, the two blue flanking regions assemble into different contigs. The first contig contains a small amount of blue sequence on the right because of reads that are mostly pink but extend into the blue region. The second similarly contains a small amount of blue sequence on the left. In c, both blue flanking regions are assembled into the contig containing the pink homolog. The second contig consists only of the brown homolog plus a small amount of blue sequence at each end, for the same reason as in b. In both cases, the phrap contig numbers x, y, z, and w are arbitrary, and the separated homologs must be located by sequence alignment. In b, the alignment is predicted to extend to the right end of contig x and the left end of contig y. In c, the alignment is predicted to include both ends of contig w, running the entire length of that contig. We call such alignments terminal.
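
The notion of a terminal alignment can be illustrated with a minimal sketch. It is not the procedure used to produce the figure: the `Alignment` record, the `classify_terminal` function, the `slack` tolerance, and the labels `"end-overlap"` and `"contained"` are all hypothetical names introduced here, and strand orientation is ignored for simplicity. Under those assumptions, an alignment is treated as terminal if it reaches one end of each contig (as in b) or spans one contig from end to end (as in c).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alignment:
    """Hypothetical record of a pairwise alignment between contigs A and B.

    Coordinates are 0-based, with exclusive end positions.
    """
    a_start: int
    a_end: int
    a_len: int
    b_start: int
    b_end: int
    b_len: int


def classify_terminal(aln: Alignment, slack: int = 0) -> Optional[str]:
    """Return "contained", "end-overlap", or None for a non-terminal alignment.

    `slack` is an assumed tolerance: the number of unaligned bases allowed
    between the alignment boundary and the contig end.
    """
    a_left = aln.a_start <= slack
    a_right = aln.a_len - aln.a_end <= slack
    b_left = aln.b_start <= slack
    b_right = aln.b_len - aln.b_end <= slack

    # Panel c: the alignment covers one contig from end to end.
    if (a_left and a_right) or (b_left and b_right):
        return "contained"
    # Panel b: the alignment reaches one end of each contig.
    # Orientation is ignored here, so any end of each contig counts.
    if (a_left or a_right) and (b_left or b_right):
        return "end-overlap"
    return None
```

For example, an alignment covering positions 800 to 1000 of a 1000-base contig and positions 0 to 200 of a 1200-base contig would be classified as `"end-overlap"`, matching the configuration in b.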