Multiple alignments of conserved proteins that define the cytoplasmic DNA virus clade. (A) D5R-like helicases. With the PBCV ATPase as the seed, the ESV ortholog and many phage primases were recovered with highly significant Expectation (E) values in the first iteration. Proteins from the other NCLDV and the distantly related papillomavirus, parvovirus, and positive-strand RNA viruses were recovered in the second and third iterations with E-values of <10⁻³. For example, ASFV C962R was recovered with an E-value of 10⁻⁸ in the third iteration. Further transitive searches identified all members of the superfamily III helicases. (B) A32L-like ATPases. With the PBCV ATPase as the seed, iridoviral orthologs were recovered in the first iteration with an E-value of <10⁻⁵. Orthologs from all other NCLDV were recovered by the third iteration with significant E-values, such as 3 × 10⁻¹⁹ for the MCV ortholog and 2 × 10⁻⁴ for the ASFV ortholog. (C) A1L-like transcription factors. A profile made with previously detected FCS domains from the polyhomeotic and FIM families of proteins, when run against the NCLDV protein sets with an inclusion cutoff of 0.01, recovered all members of this family; VV A1L, for example, was recovered with an E-value of 10⁻⁴. (D) D13L-like capsid proteins. With p50 of the Spodoptera exigua ascovirus as the seed, the PBCV and other iridoviral capsid proteins were recovered with E-values of <2 × 10⁻⁸. The ASFV ortholog was detected in the third iteration with an E-value of 3 × 10⁻³, and the poxviral D13L-like proteins were recovered at borderline E-values (0.14) in the fourth iteration. When a profile made from the alignment of the PBCV, iridovirus, and ASFV sequences was run against a database of all NCLDV proteins, the poxviral orthologs were detected as top hits, with E-values of <10⁻⁵. The probability that the conserved motifs shown here occur in these proteins by chance was <10⁻¹⁵, as computed by using the MACAW program (49). 
(E) L1R/F9L-like virion membrane proteins. With CIV 048L as the seed, the ASFV and PBCV orthologs were recovered in the second iteration, with E-values of 8 × 10⁻⁴ and 10⁻³, respectively. The entomopoxviral orthologs were detected in the third iteration with an E-value of 2 × 10⁻⁴. A transitive search with the entomopoxviral proteins recovered the other poxviral proteins with E-values of <10⁻³. Each protein is denoted by the corresponding gene name followed by the species abbreviation and the GenBank Identifier (GI) number. The numbers preceding and following the alignments indicate the positions of the first and last residues of the aligned regions in the corresponding protein sequences. The numbers between aligned blocks indicate the number of inserted residues that were omitted from the figure. The coloring reflects the conservation of amino acid residues at 85% consensus. The coloring scheme and the consensus abbreviations are as follows: hydrophobic residues (LIYFMWACV) are designated “h” in the consensus line; aliphatic (LIAV) residues are also shaded yellow and designated “l”; alcohol (ST) residues are blue and designated “o”; charged (KERDH) residues are purple and designated “c”; polar (STEDRKHNQ) residues are purple and designated “p”; small (SACGDNPVT) residues are green and designated “s”; big (LIFMWYERKQ) residues are shaded gray and designated “b.” Conserved cysteines predicted to form a Zn-finger structure (C) or a disulfide bond (E) are indicated by white letters against a red background. Secondary structure elements predicted by using the PHD program are indicated in panels C and D, where “E” indicates the extended conformation (β-strand) and “H” indicates the α-helix. The abbreviations for the NCLDV are defined in Materials and Methods. 
Additional abbreviations: AAV, adeno-associated virus 5; AcNPV, Autographa californica nucleopolyhedrovirus; Bf, Bacteroides fragilis; Ce, Caenorhabditis elegans; Cglu, Corynebacterium glutamicum; Cpf, Clostridium perfringens; Dm, Drosophila melanogaster; DpAV4, Diadromus pulchellus ascovirus; Ec, Escherichia coli; HPV08, human papillomavirus type 8; Hs, Homo sapiens; LcbA2, Lactobacillus casei bacteriophage A2; Mace, Methanosarcina acetivorans; MStV, maize streak virus; phi-105, Bacteriophage phi-105; phiC31, Bacteriophage phiC31; Polio, human poliovirus 1; SacV, Spodoptera exigua ascovirus; Si, Sulfolobus islandicus; SV40, Simian virus 40; Xf, Xylella fastidiosa.
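
The 85% consensus line described in this legend can be illustrated with a minimal sketch. The residue classes and their one-letter symbols are taken directly from the legend; everything else is an assumption made for illustration and is not the procedure used to produce the figure. In particular, the function names, the rule that a single conserved residue takes precedence over a class, the choice of the smallest qualifying class first, the treatment of gaps as non-matching, and the use of "." for unconserved columns are all hypothetical.

```python
# Residue classes exactly as listed in the legend (consensus symbol -> members).
RESIDUE_CLASSES = {
    "h": "LIYFMWACV",   # hydrophobic
    "l": "LIAV",        # aliphatic
    "o": "ST",          # alcohol
    "c": "KERDH",       # charged
    "p": "STEDRKHNQ",   # polar
    "s": "SACGDNPVT",   # small
    "b": "LIFMWYERKQ",  # big
}


def column_consensus(column, percent=85):
    """Return the consensus symbol for one alignment column.

    Assumed rules (not from the legend): a single residue reaching the
    threshold is reported as itself; otherwise the smallest class reaching
    the threshold is reported; gaps never count as matches; "." means that
    nothing reaches the threshold.
    """
    residues = [r.upper() for r in column]
    n = len(residues)
    if n == 0:
        return "."

    def conserved(members):
        hits = sum(1 for r in residues if r in members)
        # Integer arithmetic avoids floating-point edge cases at exactly 85%.
        return hits * 100 >= percent * n

    # An identical residue in at least `percent` of the sequences wins outright.
    for aa in sorted(set(residues)):
        if aa.isalpha() and conserved(aa):
            return aa

    # Otherwise try the classes from most specific (fewest members) to least.
    by_size = sorted(RESIDUE_CLASSES.items(), key=lambda kv: len(kv[1]))
    for symbol, members in by_size:
        if conserved(members):
            return symbol
    return "."


def consensus_line(alignment, percent=85):
    """Build the consensus line for equal-length aligned sequences."""
    return "".join(column_consensus(col, percent) for col in zip(*alignment))


if __name__ == "__main__":
    # Toy alignment, invented purely to exercise the function.
    toy = ["LSKC", "ITRC", "VSEC", "ATDC"]
    print(consensus_line(toy))
```

For the toy alignment, the four columns are aliphatic, alcohol, charged, and an invariant cysteine, so the sketch prints `locC`. Because the classes overlap (for example, every aliphatic residue is also hydrophobic), some precedence rule is needed; trying the smallest class first is one simple choice among several.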