Cluster tools. (A) Detailed multiple alignment view for cluster PRK05506 (bifunctional sulfate adenylyltransferase subunit 1/adenylylsulfate kinase protein; Figure 1B). The detailed alignment view can display the alignment color-coded by conserved amino acid property, which highlights residues conserved at 80% or greater in the following redundant groups: aromatic (FHWY); aliphatic (ILVA); hydrophobic (ACFILMVWY); alcohol-containing (STC); charged (DEHKR); positive (HKR); negative (DE); polar (CDEHKNQRST); tiny (AGS); small (ACDGNPSTV); and bulky (EFIKLMQRWY). Alternatively, the alignment can be displayed in consensus mode, as shown in the next panel. (B) The top panel includes information and controls for the alignment as well as a download button (FASTA + gap). Domains and features aligned against each protein (drawn as colored bars under the protein sequence) are from CDD. In this example, two domains are drawn as colored boxes below the sequence for the two highlighted proteins from Frankia: cd04095, domain II of ATP sulfurylase, in brown on the left, and cd0207, adenosine 5′-phosphosulfate kinase, in blue on the right, with a ligand-binding site in the feature row above the protein sequences. (C) Phylogenetic tree for PRK12351 (methylcitrate synthase). At the top is the toolbar with information and controls for the distance method, the tree construction method and the collapse level (by taxonomic rank). Below is the tree, which in this image has been rerooted, with archaeal proteins highlighted in red (in this case selected via checkboxes in the protein table for this cluster), and expanded to show every leaf. The tree can be transformed by clicking on the tree itself (reroot, squeeze, collapse and expand). (D) Cluster pattern view for PRK05325 (hypothetical protein). The pattern tool allows for exploration of conserved gene neighborhoods. 
Whereas the protein table and ProtMap show the complete genomic region around each gene encoding a protein in a cluster, the pattern tool collects conserved patterns that occur in three or more genomes, in a maximum window of 40 genes upstream or downstream. The most conserved pattern is shown at the top (and on the overview page; Figure 1A8). The table to the left of the patterns shows the number of conserved proteins (the number of sequences contributing to the same pattern, which may come from the same nucleotide sequence if present as paralogs in the same cluster), the number of clusters in the conserved pattern and the common taxonomic node. The pattern itself shows all clusters in each pattern and is pseudo-aligned, with the same cluster in each row aligned. Clusters are color-coded according to COG functional categories and the accession is linked to the cluster, the cluster pattern or the ProtMap for that particular cluster. Gray boxes indicate an insertion into the pseudo-alignment for alignment purposes and do not reflect a cluster (gene/protein) at that position in the genome. Unlike the arrows in ProtMap, the boxes are not drawn in proportion to the size of the gene. The gene neighborhood around the genes encoding the hypothetical proteins of PRK05325 (no function yet determined) shows a conserved association with putative serine protein kinases (the yellow category, apparently involved in signal transduction; a set of uncurated clusters encoded by genes 5′ of the genes encoding proteins in PRK05325). (E) Limited ProtMap view for PRK08568 (preprotein translocase subunit SecY). The ProtMap view shows the full gene neighborhood in a limited horizontal window, unlike the cluster pattern tool, which shows a more condensed and taxonomically conserved view of the same information but with a potentially wider window. Note that the genes are drawn to scale in this view. In this example, the Methanococcus spp. 
RefSeq Nucleotide Accession Numbers are highlighted in yellow on the left to show that the secY gene (cluster PRK08568) is found upstream of a glycosyl transferase-encoding gene (CLS1191473, color-coded yellow for cell wall biogenesis), whereas in most other organisms secY is upstream of adenylate kinase (PRK04040, colored blue for nucleotide transport and metabolism). Note that PRK04040 contains a large set of contributing sequences that are not shown in the image for brevity. The pattern tool can be used to control the display of the ProtMap, directing the display to show only the ProtMap for a particular pattern.
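
The property-based coloring described for panel A can be illustrated with a minimal sketch. The group memberships and the 80% threshold are taken from the caption; everything else is an assumption made for illustration, not the viewer's actual implementation: the function names `conserved_properties` and `annotate_alignment` are hypothetical, gap characters are counted in the column total, and all qualifying groups are reported because the caption does not say how the viewer chooses among overlapping groups.

```python
# Residue property groups as listed in the caption (the groups overlap by design).
PROPERTY_GROUPS = {
    "aromatic": "FHWY",
    "aliphatic": "ILVA",
    "hydrophobic": "ACFILMVWY",
    "alcohol-containing": "STC",
    "charged": "DEHKR",
    "positive": "HKR",
    "negative": "DE",
    "polar": "CDEHKNQRST",
    "tiny": "AGS",
    "small": "ACDGNPSTV",
    "bulky": "EFIKLMQRWY",
}


def conserved_properties(column, threshold=0.8):
    """Return the groups whose members make up >= threshold of one column.

    Assumption: gaps and unknown characters count toward the column total,
    so a gappy column is less likely to qualify.
    """
    total = len(column)
    if total == 0:
        return []
    hits = []
    for name, members in PROPERTY_GROUPS.items():
        count = sum(1 for residue in column.upper() if residue in members)
        if count / total >= threshold:
            hits.append(name)
    return hits


def annotate_alignment(rows, threshold=0.8):
    """Apply conserved_properties to every column of an aligned set of rows."""
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all aligned rows must have the same length")
    return [conserved_properties("".join(col), threshold) for col in zip(*rows)]


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    toy = ["DKF", "EKY", "DRW", "ERA", "DK-"]
    for index, groups in enumerate(annotate_alignment(toy), start=1):
        print(index, groups)
```

Because the groups are redundant, a single column often satisfies several of them at once (an all-D/E column is negative, charged and polar simultaneously), so a real viewer needs some precedence rule to pick one color; that rule is not stated in the caption and is left out here.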