Ecology of inorganic sulfur auxiliary metabolism in widespread bacteriophages

Microbial sulfur metabolism contributes to biogeochemical cycling on global scales. Sulfur-metabolizing microbes are infected by phages that can encode auxiliary metabolic genes (AMGs) to alter sulfur metabolism within host cells, but these phages remain poorly characterized. Here we identified 191 phages derived from twelve environments that encoded 227 AMGs for oxidation of sulfur and thiosulfate (dsrA, dsrC/tusE, soxC, soxD and soxYZ). Retention of AMGs during niche differentiation of diverse phage populations provided evidence that auxiliary metabolism imparts measurable fitness benefits to phages, with ramifications for ecosystem biogeochemistry. Gene abundance and expression profiles of AMGs suggested significant contributions by phages to sulfur and thiosulfate oxidation in freshwater lakes and oceans, and a sensitive response to changing sulfur concentrations in hydrothermal environments. Overall, our study provides fundamental insights into the distribution, diversity, and ecology of phage auxiliary metabolism associated with sulfur and reinforces the necessity of incorporating viral contributions into biogeochemical configurations.


Supplementary Figure 2. DsrC protein alignment and conserved residues in microbial and phage sequences.
Highlighted amino acids indicate pairwise identity of ≥95% and colored boxes indicate strictly conserved residues (blue) or lack of conserved residues (gray). An identity graph (top) was fitted to the alignments to visualize pairwise identity at the following thresholds: 100% (green), 99-30% (yellow, scaled) and 29-0% (red, scaled).

Supplementary Figure 3. SoxYZ protein alignment and conserved residues in microbial and phage sequences.
Highlighted amino acids indicate pairwise identity of ≥90% and colored boxes indicate substrate binding cysteine (blue) and cysteine motif (pink). An identity graph (top) was fitted to the alignments to visualize pairwise identity at the following thresholds: 100% (green), 99-30% (yellow, scaled) and 29-0% (red, scaled).

Supplementary Figure 4. SoxC protein alignment and conserved residues in microbial and phage sequences.
Highlighted amino acids indicate pairwise identity of ≥90% and colored boxes indicate cofactor coordination / active site (blue). An identity graph (top) was fitted to the alignments to visualize pairwise identity at the following thresholds: 100% (green), 99-30% (yellow, scaled) and 29-0% (red, scaled).

Supplementary Figure 5. SoxD protein alignment and conserved residues in microbial and phage sequences.
Highlighted amino acids indicate pairwise identity of ≥95% and colored boxes indicate cytochrome c motif (blue). An identity graph (top) was fitted to the alignments to visualize pairwise identity at the following thresholds: 100% (green), 99-30% (yellow, scaled) and 29-0% (red, scaled).
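
The identity-graph thresholds used in the alignment captions above can be illustrated with a minimal Python sketch. It is not the procedure used to draw the figures: the function names are hypothetical, and the gap handling (gap-gap pairs skipped, gap-residue pairs counted as mismatches) and the treatment of fractional values between 29% and 30% as red are assumptions made only for illustration.

```python
from itertools import combinations


def column_pairwise_identity(column: str) -> float:
    """Percent of sequence pairs in one alignment column that share a residue.

    Assumption: pairs where both sequences have a gap ('-') are skipped,
    and a gap paired with a residue counts as a mismatch.
    """
    pairs = [
        (a, b)
        for a, b in combinations(column, 2)
        if not (a == "-" and b == "-")
    ]
    if not pairs:
        return 0.0
    identical = sum(1 for a, b in pairs if a == b)
    return 100.0 * identical / len(pairs)


def identity_color(percent: float) -> str:
    """Map a pairwise identity to the color bins named in the captions."""
    if percent >= 100.0:
        return "green"   # 100%
    if percent >= 30.0:
        return "yellow"  # 99-30%
    return "red"         # 29-0%


def identity_track(aligned_seqs: list[str]) -> list[str]:
    """Color for every column of an alignment of equal-length sequences."""
    return [
        identity_color(column_pairwise_identity("".join(col)))
        for col in zip(*aligned_seqs)
    ]


if __name__ == "__main__":
    # Toy alignment (not real DsrC/Sox sequences).
    toy = ["CGC", "CGA", "CGT", "CAG"]
    print(identity_track(toy))
```

In the toy alignment, the first column is fully conserved, the second has three of six identical pairs (50%), and the third has none, so the three columns fall into the green, yellow and red bins respectively.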

Supplementary Figure 7. Protein grouping of mVC AMG neighbor proteins.
mVC hierarchical protein grouping where each row represents a single protein group (116 total) and each column represents a single mVC (70 total). For each mVC, 11 proteins were grouped, representing the encoded AMG plus five proteins before and five proteins after the AMG. Metadata for encoded AMGs and estimated taxonomy per mVC (Myoviridae, Podoviridae or Siphoviridae) are shown. mVC names are appended with "A" or "B" to distinguish those that encode multiple AMGs.

Supplementary Figure 8. Mapping quality checks for phage and bacterial sulfur AMGs.
a Results for phage and bacterial dsrA genes in the metagenome IMG: 3300001676. The phage-host pair contains one phage dsrA (TuiMalila_10011672) and two bacterial dsrA (TuiMalila_10106401, TuiMalila_10061351). The original mapping results were compared with the mapping results that included reads with one mismatch, and the normalized phage/bacteria gene coverage ratios were calculated for both settings. The normalized phage/bacteria gene coverage ratios based on the original mapping results are shown in Fig. 7a. b Results for phage and bacterial soxYZ genes in the metagenome IMG: 3300009154. The phage-host pair contains one phage soxYZ (Ga0114963_1000012431) and one bacterial soxYZ (Ga0114963_108352751). The original mapping results were compared with the mapping results that included reads with one mismatch, and the normalized phage/bacteria gene coverage ratios were calculated for both settings. The normalized phage/bacteria gene coverage ratios based on the original mapping results are shown in Fig. 7b. Filtering to retain only reads with one mismatch was performed with mapped.py (https://github.com/christophertbrown/bioscripts/blob/master/ctbBio) with the settings "-m 1 -p both". Mapping results were visualized with Geneious Prime v2020.1.2.
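
The comparison of phage/bacteria coverage ratios under the two mapping settings can be sketched as follows. This is a hedged illustration only: the caption does not specify the normalization, so the sketch assumes that gene coverage is the number of mapped bases divided by gene length, and that the ratio is phage coverage over the summed coverage of the paired bacterial genes. The class, function names and all numbers are hypothetical and are not taken from the metagenomes above.

```python
from dataclasses import dataclass


@dataclass
class GeneMapping:
    """Mapping summary for one gene (hypothetical structure)."""
    name: str
    length_bp: int
    mapped_bases: int  # total read bases aligned to the gene


def mean_depth(gene: GeneMapping) -> float:
    """Length-normalized coverage: mapped bases per gene position."""
    if gene.length_bp <= 0:
        raise ValueError(f"gene {gene.name} has non-positive length")
    return gene.mapped_bases / gene.length_bp


def phage_to_bacteria_ratio(phage: GeneMapping, bacteria: list[GeneMapping]) -> float:
    """Phage coverage divided by summed bacterial coverage (assumed definition)."""
    bacterial_depth = sum(mean_depth(g) for g in bacteria)
    if bacterial_depth == 0:
        raise ValueError("no bacterial coverage; ratio is undefined")
    return mean_depth(phage) / bacterial_depth


if __name__ == "__main__":
    # Made-up values for one phage gene and two bacterial genes,
    # under the original mapping and a mapping that admits one mismatch.
    original = phage_to_bacteria_ratio(
        GeneMapping("phage_gene", 1000, 30_000),
        [GeneMapping("bacterial_gene_1", 1200, 12_000),
         GeneMapping("bacterial_gene_2", 1000, 5_000)],
    )
    one_mismatch = phage_to_bacteria_ratio(
        GeneMapping("phage_gene", 1000, 36_000),
        [GeneMapping("bacterial_gene_1", 1200, 14_400),
         GeneMapping("bacterial_gene_2", 1000, 6_000)],
    )
    print(f"original: {original:.2f}, one mismatch: {one_mismatch:.2f}")
```

If the ratio stays similar between the two settings, as in this toy case, the conclusion drawn from the original mapping does not depend strongly on the mismatch filter, which is the purpose of the quality check.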