The ancestral recombination graph (ARG).
A, Example ARG for four chromosome sequences. The sequences label the leaves of the ARG and are written as strings of 0s and 1s (coding SNP alleles). Moving backward in time (up the ARG), one first encounters a mutation. A mutation is denoted by a black dot and a number specifying its marker position. The second event is a recombination between markers 2 and 3. As one works backward in time, this corresponds to splitting a lineage into two, with the alleles at positions 1 and 2 following the left lineage and the allele at position 3 following the right lineage. After this is a coalescence, merging two lineages into one, and so on, to the grand common ancestor.
B, Marginal tree for the SNPs at positions 1 and 2.
C, Marginal tree for the SNP at position 3. To test a marginal tree for disease association, mutations are dropped onto each of the branches in turn, defining hypothetical allelic states of the leaves, which can then be tested for statistical association with the phenotype. The black dot labeled “2” best segregates the cases (D) from the controls (U) and would be identified as the most likely causative mutation event.
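The branch-by-branch test described for the marginal tree can be sketched as follows. This is a minimal illustration, not the authors' implementation: the nested-tuple tree encoding, the function names (`clades`, `chi_square`, `best_branch`), the leaf labels, and the choice of a 2x2 chi-square statistic are all assumptions made for the example. A hypothetical mutation on a branch gives the derived allele to every leaf below that branch; the branch whose carriers best separate cases (D) from controls (U) is reported.

```python
def clades(tree):
    """Return the set of leaves below every branch of a nested-tuple tree."""
    found = []

    def walk(node):
        if isinstance(node, tuple):
            leaves = frozenset().union(*(walk(child) for child in node))
        else:
            leaves = frozenset([node])
        found.append(leaves)
        return leaves

    walk(tree)
    return found[:-1]  # the root is visited last and has no branch above it


def chi_square(carriers, status):
    """2x2 chi-square statistic: carrier/non-carrier versus case (D)/control (U)."""
    a = sum(1 for leaf, s in status.items() if leaf in carriers and s == "D")
    b = sum(1 for leaf, s in status.items() if leaf in carriers and s == "U")
    c = sum(1 for leaf, s in status.items() if leaf not in carriers and s == "D")
    d = sum(1 for leaf, s in status.items() if leaf not in carriers and s == "U")
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def best_branch(tree, status):
    """Drop a mutation on each branch in turn and keep the best-scoring one."""
    return max(
        ((chi_square(clade, status), clade) for clade in clades(tree)),
        key=lambda pair: pair[0],
    )


# Hypothetical marginal tree and phenotypes (not taken from the figure).
tree = ((("a", "b"), "c"), (("d", "e"), "f"))
status = {"a": "D", "b": "D", "c": "D", "d": "U", "e": "U", "f": "U"}
score, carriers = best_branch(tree, status)
print(score, sorted(carriers))
```

In practice the significance of the best score would need to account for testing many branches, for example by permuting the phenotype labels; that step is omitted here.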
D–G, Logic behind the ARG inference algorithm.
D, The two sequences have a shared tract over the indicated region.
E, To coalesce over the tract region, we must add a recombination breakpoint to the right of it, that is, between positions 2 and 3. This results in two parent sequences.
F, We let undefined material (denoted by ·) coalesce with anything. We can now coalesce the left recombination parent and the other sequence.
G, We can add a mutation.
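The steps in panels D through G can be traced with a small sketch. It assumes sequences are strings over 0, 1, and "·" (undefined material), that 0 is the ancestral allele, and that positions are 0-based in code; the function names and the example sequences "110" and "111" are hypothetical and are not read off the figure. It illustrates the logic only, not the full inference algorithm.

```python
UNDEF = "\u00b7"  # the "·" symbol for undefined material


def compatible(x, y):
    """Two alleles can coalesce if they match or either is undefined."""
    return x == y or UNDEF in (x, y)


def shared_tract(a, b):
    """Longest run of compatible positions, as a half-open (start, end) pair."""
    best, start = (0, 0), 0
    for i in range(len(a) + 1):
        if i == len(a) or not compatible(a[i], b[i]):
            if i - start > best[1] - best[0]:
                best = (start, i)
            start = i + 1
    return best


def recombine(seq, breakpoint):
    """Split a lineage into left and right parents at the breakpoint."""
    left = seq[:breakpoint] + UNDEF * (len(seq) - breakpoint)
    right = UNDEF * breakpoint + seq[breakpoint:]
    return left, right


def coalesce(a, b):
    """Merge two lineages; undefined material coalesces with anything."""
    if not all(compatible(x, y) for x, y in zip(a, b)):
        raise ValueError("sequences conflict at a defined position")
    return "".join(y if x == UNDEF else x for x, y in zip(a, b))


def revert_mutation(seq, pos):
    """Going backward in time, replace the derived allele 1 with ancestral 0."""
    if seq[pos] != "1":
        raise ValueError("no derived allele at this position")
    return seq[:pos] + "0" + seq[pos + 1:]


a, b = "110", "111"
start, end = shared_tract(a, b)        # D: tract covers the first two markers
left, right = recombine(a, end)        # E: breakpoint to the right of the tract
merged = coalesce(left, b)             # F: left parent coalesces with b
ancestral = revert_mutation(merged, 2) # G: a mutation at the third marker
print(left, right, merged, coalesce(ancestral, right))
```

Treating undefined material as a wildcard is what lets the left recombination parent coalesce with the other sequence even though the original pair differs at the third marker.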