Diversity of Sea Star-Associated Densoviruses and Transcribed Endogenous Viral Elements of Densovirus Origin

The primary interest in sea star densoviruses, specifically SSaDV, has been their association with sea star wasting syndrome (SSWS), a disease that has decimated sea star populations across the West Coast of the United States since 2013. The association of SSaDV with SSWS was originally drawn from metagenomic analysis, which was further studied through field surveys using quantitative PCR (qPCR), with the conclusion that it was the most likely viral candidate in the metagenomic data based on its representation in symptomatic sea stars compared to asymptomatic sea stars. We reexamined the original metagenomic data with additional genomic data sets and found that SSaDV was 1 of 10 densoviruses present in the original data set and was no more represented in symptomatic sea stars than in asymptomatic sea stars. Instead, SSaDV appears to be a widespread, generalist virus that exists among a large diversity of densoviruses present in sea star populations.

We report the discovery of Ͼ30 novel sea star densoviruses associated with sea stars from the Southern Ocean around Antarctica (11 genomes) and from the temperate eastern Pacific (24 genomes), as well as the observation of numerous endogenized viral elements (EVEs) from sea star transcriptomes and genomes. We found that SSaDV putatively has a wide tissue tropism and is associated with sea stars in the eastern Pacific across a broad latitudinal range from southern California to Alaska, corroborating previous findings by Hewson et al. (11). The identification of SSaDV as one among many densoviruses infecting sea stars on the West Coast of the United States suggests that densoviruses may comprise a normal component of the sea star microbiome, bringing into question the association of densoviruses with SSWS.

Reanalysis of metagenomes published in the work of Hewson et al.
The reanalysis of the viral metagenomic data presented by Hewson et al. (11) led to the discovery of 9 additional densovirus genomes ( Fig. 1 and Table 1). The densovirus contigs ranged in length from 3,391 to 6,053 nucleotides (nt) (5,002 Ϯ 921 [mean Ϯ standard deviation]). SSaDV was the only densovirus assembled into a complete or nearly complete genome across multiple metagenomes and had the highest read recruitment among all libraries (Table 1; see also Table S5 in the supplemental material). The previously published partial SSaDV genome (5,050 nt) lacked the NS3 open reading frame (ORF) and inverted terminal repeats (ITRs) (11). In this study, we recovered three SSaDV genomes of various sizes from 3 of the 32 metagenomes (Table 1). The largest of these genomes (6,053 nt) contained the expected ORFs (NS1, NS2, NS3, and VP), ITRs, and hairpins within the ITRs and therefore likely represents a complete genome. It is possible that the ITR region of the genome is not complete, due to challenges posed by assembling regions with high frequency of repeats using short-read technology. Members of the genus Ambidensovirus typically have ITRs of Ͼ500 nt, which is considerably longer than the ITR regions we observed (9). The ITRs in SSaDV were 260 nt long on both sides of the genome and contained canonical hairpin structures that were 223 nt and thermodynamically favorable (ΔG ϭ Ϫ106. 40).  (11). SSaDV is 1 of 10 densoviruses present in the data set and based on read mapping analysis (Ն95% read identity) and is not more abundant in symptomatic than in asymptomatic individuals. (A) Relative abundance of all reads recruited to densovirus genomes. n.s., no significance, based on Welch two-sample t test (P ϭ 0.7697, df ϭ 25.137, t ϭ Ϫ0.29592). (B) Read recruitment separated by densovirus genotype.  SSaDV biogeography and tissue tropism. A total of 148 of 660 animals were positive for SSaDV based on the PCR assay, equating to a global prevalence of 22.4% ( Fig. 2 and Table S1). No samples were PCR positive from tissues collected in 2005. A total of 126 of 148 PCR amplicons were successfully Sanger sequenced, all confirming the specificity of the PCR assay (Table S1). Of the 148 SSaDV-positive sea stars, 22 were symptomatic (i.e., SSWS affected) and 126 were asymptomatic sea stars (Table S1). SSaDV was detected in 25 of the 42 locations that spanned a broad latitudinal range in the eastern Pacific from southern California to southeastern Alaska ( Fig. 2 and Table S2). Nine of 12 sea star species tested positive, which include the following species (virus positive/sample total): Pisaster ochraceus (87/287), Pisaster brevispinus (10/10), Pisaster giganteus (3/9), Pycnopodia helianthoides (4/72), Evasterias troschelii (26/100), Dermasterias imbricata (8/34), a Henricia sp. (3/18), a Leptasterias sp. (6/42), and Patiria miniata (1/85). Only one individual was tested for each of the three species in which SSaDV was not detected. These included Orthasterias koehleri, Pteraster tesselatus, and Solaster stimpsoni.
Fine dissections of Pisaster ochraceus (n ϭ 26 individuals), Evasterias troschelii (n ϭ 11), and Pisaster brevispinus (n ϭ 10) collected from Langley Harbor, WA, were used to assess putative tissue tropism. The viral prevalence among tissues was calculated by the number of tissues positive divided by the total number of tissues collected between these three species. SSaDV was detected most frequently in the pyloric caeca (89% [40/45]  Genome discovery, genome comparison, motif annotation, and phylogeny. An additional 29 densovirus genomes were recovered from newly prepared and reanalyzed sea star metagenomes (Table 1). The densovirus contigs ranged in size from 3,061 to 5,963 nt (5,179.0 Ϯ 684.4 [mean Ϯ standard deviation]). The average sizes of the ORFs found in sea star densovirus were as follows, with standard deviations in parentheses: NS1, 1,694.8 (Ϯ33.3) nt; NS2, 883.1 (Ϯ28.5) nt; NS3, 807.8 (Ϯ109.2) nt; and VP, 2,723.0 (Ϯ98.2) nt. The mean pairwise nucleotide identity was greater than amino acid identity among sea star densovirus genomes for NS1, NS3, and VP ORFs ( Fig. 3 and Fig. S1). The NS1 ORF had higher sequence conservation (55.7% average nucleotide and 43.2% average amino acid pairwise identity) than the ORFs of NS3 (32.6% average nucleotide and 18.6% amino acid pairwise identity) and VP (43.8% average nucleotide and 34.2% average amino acid pairwise identity). The current delineation for a new parvovirus species is based on the pairwise amino acid sequence identity of NS1. Parvoviruses encoding NS1 proteins with an Ͼ85% pairwise amino acid sequence identity are considered the same viral species (18). Using this species definition, 29 new sea star densovirus species were defined from the 39 genotypes discovered. There were 8 viral species that contained 2 or 3 genotypes and 21 species that contained a single genotype (Table 1). All sea star densoviruses discovered thus far have ambisense genomes that fall into subgroups A and B, which differ only by the VP ORF organization (19) (Fig. 4). The NS1 and VP ORFs identified in this study contain all the expected motifs that are characteristic of densoviruses (20). These motifs include RCR I and RCR II of the replication initiation motifs, Walker A, B, and C of the NTP-binding and helicase motifs, and the viral phospholipase A 2 motif.
Identification of EVEs of densovirus origin. A total of 8 of the 179 transcriptomic libraries contained contigs with densovirus-like sequences. Ten densovirus-like sequences were found among the 8 libraries based on homology searches against the sea star-associated densovirus database. These densovirus-like sequences encoded only part of NS1, lacked RCR motifs, and were not the typical coding length found in sea star-associated densovirus genomes. These sequences are likely endogenized viral elements of densovirus origin due to the fragmentation of the viral genome and the missing enzymatic motifs. The endogenized densovirus elements primarily contained Walker box ATPase motifs with homology to those of parvoviruses and densoviruses from a broad diversity of hosts ( Fig. 5 and Table S7). The transcriptomes containing densoviruslike sequences came from the following species: Acanthaster planci (SRA run identifier DRR072325), Patiria pectinifera (SRR5229427), Echinaster spinulosus (SRR1139455 and SRR2844624), Acanthaster brevispinus (SRR9276461), Linckia laevigata (SRR5438553), and Asterias rubens (SRR1139190 and SRR3087891) (Table S6).

DISCUSSION
The initial investigation for a viral agent associated with SSWS involved viral metagenomic surveys (DNA and RNA) to compare the viral consortia between species and between asymptomatic and symptomatic individuals to determine the most likely viral candidate for further investigation (11). The conclusions from these molecular surveys were that SSaDV was more represented in metagenomic libraries of symptomatic individuals and was present in symptomatic metagenomes prepared from multiple sea star species (11). However, these data show that SSaDV was 1 of 10 densoviruses present and was neither more abundant by number of reads per library comparing asymptomatic to symptomatic individuals (based on read mapping analysis) nor more prevalent between libraries comparing symptomatic to asymptomatic individuals ( Fig.  1 and Table S5). This result contradicts the original conclusion from the metagenomic data that SSaDV was associated with SSWS. The difference in results is directly attributable to the difference in the bioinformatic assembly approach. The original analysis took an overlap-layout-consensus global assembly approach using the 28 DNA viral metagenomes not including the 4 RNA viral metagenomes (11). In this study, we chose SPAdes, a more sensitive de Bruijn graph assembler, and included the RNA viral metagenomes in the analysis which contained six of the nine novel densoviruses found in this data set (Table 1). We did find SSaDV to be, on average, the most abundant densovirus by read mapping analysis and, thus, the most consistently assembled, likely biasing its assembly and discovery in the original analysis. The higher representation of SSaDV among metagenomic libraries may be the result of higher viral enrichment in those samples prior to sequencing rather than a result of greater viral loads prior to metagenomic preparation. While a higher abundance of SSaDV could reflect important host-virus biology, it remains to be determined whether greater viral loads, measured by quantitative PCR (qPCR), of SSaDV has any biological significance. A pitfall of viral metagenomics from animal tissue using a viral enrichment method is the significant variability in nonviral genetic material between viromes within a study, making quantitative comparisons difficult (21). According to our metagenomic survey, SSaDV is one species within a diverse extant population of densoviruses present in sea star populations on the West Coast of the United States.
The discovery of densoviruses associated with sea stars collected from China, Antarctica, and the Pacific and Atlantic coasts of the United States indicates their ubiquitous distribution and substantial extant diversity (Table 1, Fig. 3, and Fig. S1). The diversity observed in this study is likely a small fraction of the total diversity among H X H X X H D C Y X X X X X X S P -X S X G K N X F F D W N E X N Y R T P X I X X X N P X X X X X G P G N X X -X X X X X X X X X D X I X X X H D X X Y X X A X X X X D X X X A D Diversity of Sea Star-Associated Densoviruses Journal of Virology echinoderms, considering that these viruses have also been found in sea urchins (14). Sea star-associated densoviruses also seem to be pervasive in wild populations. The two densovirus genotypes with the best-described ecological characteristics, SSaDV and AfaDV, share striking similarities. Both viral genotypes are not species specific, are found across a large geographic range, are commonly found in asymptomatic individuals, and have a wide tissue tropism, with pyloric caeca being the primary tissue of detection ( Fig. 2) (13). This set of characteristics suggests that both viruses form persistent infections in sea stars. The genus Ambidensovirus, to which both previously described sea star-associated densoviruses belonged to based on genome organization, was recently divided into seven newly proposed genera to resolve paraphyly within the genus (8). In this new arrangement, SSaDV and Cherax quadricarinatus (shrimp) densovirus (CqDV; the densovirus most genetically similar to SSaDV prior to the discovery of AfaDV) were assigned to the genus Aquambidensovirus, putatively uniting all aquatic densoviruses (8,13,22). Our phylogenetic analysis did not support the monophyly of sea star-associated densoviruses within the newly proposed Aquambidensovirus genus, nor did all aquatic densoviruses cluster into a single well-supported clade ( Fig. 3 and Fig. S1). Newly proposed classification schemes within the Ambidensovirus genus would greatly benefit from the inclusion of broader taxonomic sampling before proposing new systematic arrangements of this highly diverse genus.
Given the lack of immortal cell cultures, the discovery of echinoderm densoviruses has been primarily through metagenomics, and that constraint was the motivation for our analysis of transcriptomes as an additional and alternative option for densovirus discovery. Host transcriptomes have been a rich source of viral discovery from eukaryotes and have expanded our knowledge of host associations for many viral groups, including parvoviruses (12,23,24). However, we did not find transcriptomes to be an effective method for the purpose of densovirus discovery compared to viral metagenomes, particularly RNA viral metagenomes. This could be due to various methodological reasons. First, the viral metagenomes prepared for this study were enriched for encapsulated nucleic acids with a cDNA enrichment step; in contrast, transcriptomes target mRNA through rRNA depletion and/or through selection for poly(A) tails. Second, to detect DNA viruses from a host transcriptome requires tissue containing an active infection, which may not be detectable without very high sequence depth. The transcribed EVEs found in this study were detected only in transcriptomes larger than 2.4 Gb. Third, the genomes discovered in the RNA viral metagenomes are likely ssDNA that was carried through the RNA extraction process. ssDNA is an uncommon nucleic acid template for nonviral material and a difficult template to remove during RNA extraction. Most commercial kits use DNases that preferentially target double-stranded DNA (dsDNA) and inefficiently cleave ssDNA. Without preferentially targeting mRNA prior to cDNA synthesis in addition to enriching for encapsulated nucleic acid, the chances of picking up ssDNA in a pool of RNA is much higher.
None of the transcriptome-derived densovirus-like sequences appeared to be extant densoviruses based on ORF architecture and motif repertoire. We conclude that these densovirus-like sequences are likely transcribed EVEs present in host cells. EVEs from Asterias rubens and Acanthaster planci could be traced to their genomes, while our inability to trace others reflects the lack of publicly available host genomes. The putative EVEs present in Asterias rubens were nearly identical to those previously reported for the same host (25), though most that we observed had low sequence identity to previously identified EVEs from other invertebrates. It is likely that these EVEs have been established in the germ line of Asterias rubens, and our findings corroborate previous work proposing that sea star densoviruses can infect germ line cells (13). The expression of these EVEs in Asterias rubens was found to trigger the RNA interference (RNAi) response, specifically the Piwi-dependent pathway (piRNA), signifying that these EVEs are still recognized as foreign and are regulated through the immune system (25). This RNAi response has been widely observed in terrestrial invertebrate genomes containing EVEs descending from densoviruses (26). We expect that the expansion of echinoderm genomes, and corresponding small RNA libraries, will further support this conclusion.
Essentially all observed EVEs retained the Walker box ATPase motifs, which collectively function as a helicase (Fig. 5) (27). This helicase domain belongs to the superfamily III helicases (SF3), which are more broadly grouped as AAAϩ ATPases (28). SF3 helicases are encoded only by DNA and RNA viruses, so their presence in cellular genomes must be the result of endogenization (29,30). The retention of the Walker box ATPase motifs among endogenized densovirus elements has been observed across a diverse range of invertebrate hosts, suggesting a beneficial function for co-opting and possibly maintaining the function of the SF3 helicase (10,12,26,31,32). The adaptive benefit of a Walker box ATPase-containing EVE has been demonstrated in the pea aphid (Acyrthosiphon pisum), in which wing development was regulated by two modified densovirus NS1 EVEs, which only retained the Walker box ATPase motifs (32). The expression of these two EVEs under crowded conditions initiated wing development, which could be suppressed by knocking down their expression. These results demonstrate that this viral gene can be co-opted by the host to modulate the response of a phenotypically plastic trait to environmental cues. Another plausible hypothesis for EVE function is the ability to enhance or prime the immune system against new infections (26). However, we observed little sequence identity between extant sea star-associated densoviruses and the EVEs in their transcriptomes. These sequence differences suggest that these EVES are unlikely to have a role in priming the piRNA response against new infections.
We employed metagenomic and transcriptomic approaches to explore the diversity of sea star-associated densoviruses while advancing our understanding of the biogeography of SSaDV, the first densovirus found in sea stars. Empirically, we found that viral metagenomes provided a more effective resource for densovirus discovery than host transcriptomes. We discovered 37 new densovirus genomes from sea stars and identified EVEs expressed in host transcriptomes that are of densovirus origin based on detection of the tripartite SF3 helicase domain in these EVEs. Using PCR, we found SSaDV to have a putatively wide tissue tropism, with the pyloric caeca being the most consistent tissue for viral detection. SSaDV was detected across a broad latitudinal range in the northeastern Pacific from southern California to Alaska and found in tissues in nearly all sea star species tested. These results corroborate the hypothesis that these viruses are common among populations and suggest that they form persistent infections in sea stars. Given the diversity of densoviruses and their broad distribution among tissues, populations, and species of both asymptomatic and symptomatic sea stars, we propose that the association of SSaDV with sea star wasting syndrome should be critically reassessed relative to the mounting evidence that this virus may not be the pathogen that causes this disease and instead a common constituent of these animals' microbiomes.

MATERIALS AND METHODS
Tissue collection and DNA extractions. We collected 887 tissue samples from 660 individual sea stars spanning 12 species from 42 locations from the temperate eastern Pacific coast of the United States from 2005 and 2014 to 2019 (Tables S1, S2, and S3). Thirty samples were collected during the peak of SSWS epidemic observed from mid-2013 to 2015 in the Northeast Pacific (33). The majority of samples collected were from asymptomatic individuals (85% of sea stars sampled). Tissues were collected from sea star specimens by vivisection immediately upon collection followed by flash-freezing in liquid nitrogen and storage at -80°C or -20°C until dissection or were sampled from individuals in the field nonlethally and preserved in RNAlater (Sigma Aldrich) or ethanol (EtOH) ( Table S1). Coelomic fluid samples were collected only from vivisected specimens using a 25-gauge 1.5-in. (0.5-mm by 25-mm) needle attached to a 3-ml syringe inserted through the body wall into the coelomic cavity. DNA was extracted from tissues and coelomic fluid using the Zymo Research Quick-DNA Miniprep Plus kit or the Zymo Research Duet DNA/RNA Miniprep kit following the manufacturer's protocol. DNA was quantified using a NanoDrop or Quant-iT PicoGreen dsDNA assay kit (Invitrogen) (Tables S1 and S3).
PCR and Sanger sequencing of SSaDV. Primers were designed targeting the structural gene of SSaDV using Primer3web (version 4.1.0). The primers were as follows: VP1 forward primer (5=-TGGCCA CTCATCATGTCTCT-3=) and VP1 reverse primer (5=-CTTGGGGTCCTTCATGAGC-3=). New England BioLabs (NEB) Q5 high-fidelity DNA polymerase was used following the manufacturer's protocol for a 50-l reaction volume. Thermal cycling was performed in a Bio-Rad C1000 thermal cycler using the following conditions: initial denaturing at 98°C for 30 s followed by 35 cycles of denaturing (98°C for 10 s), annealing (67°C for 20 s), and extension (72°C for 20 s) and a final extension at 72°C for 2 min. Annealing temperature was based on NEB Tm Calculator recommendation using default primer concentration of 500 nM. The resulting amplicon of the PCR was 534 nucleotides (nt). All PCRs included a positive control, a kit negative control, and PCR reagent negative control to account for false positives and false negatives. A total of 10 to 15 l of a PCR product was used for gel visualization. The remaining PCR product was processed using a ZR-96 DNA Clean & Concentrator-5 kit (Zymo Research) and submitted to the Cornell Core Biotechnology Resource Center Genomics Facility for DNA sequencing. DNA sequencing was performed on Applied Biosystems automated 3730xl DNA analyzers using BigDye Terminator chemistry and AmpliTaq-FS DNA polymerase. Viral metagenome preparation. Five RNA viral metagenomes were prepared from five different species of sea stars: Pisaster ochraceus (n ϭ 1), Labidiaster annulatus (n ϭ 1), Leptasterias sp. (n ϭ 1), Mediaster aequalis (n ϭ 1), and Neosmilaster georgianus (n ϭ 1) ( Table S4). The preparation of viral metagenomes followed the method of Hewson et al., modified from that of Thurber et al. (17,34). Pyloric caeca from sea stars were homogenized in a 10% bleach-cleaned NutriBullet with 0.02-m-filtered 1ϫ phosphate-buffered saline (PBS). Tissue homogenates were pelleted by centrifugation at 3,000 ϫ g for 5 min, and the supernatant was syringe filtered through Millipore Sterivex-GP 0.22-m polyethersulfone filters into 10% bleach-treated and autoclaved Nalgene Oak Ridge high-speed centrifugation tubes. Filtered homogenates were added to 10% (wt/vol) polyethylene glycol 8000 (PEG 8000) in 0.02-mfiltered 1ϫ PBS with a final volume of 35 ml and precipitated for 20 h at 4°C. Precipitated nucleic acids, and other cell material, were pelleted by centrifugation at 15,000 ϫ g for 30 min. The supernatant was decanted and pellets were resuspended in 2 ml of 0.02-m-filtered 1ϫ PBS. Half of the sample (1 ml) was treated with 0.2 volume (200 l) of CHCl 3 , inverted three times, and incubated at room temperature for 10 min. After a brief centrifugation, 800 l of supernatant was transferred into a 1.5-ml microcentrifuge tube. Samples were treated with 1.5 l of Turbo DNase (2U/l; Invitrogen), 1 l of RNase One (10 U/l; Thermo Scientific), and 1 l of Benzonase nuclease (250 U/l; Millipore Sigma) and incubated at 37°C for 3 h. A total of 0.2 volume (160 l) of 100 mM EDTA was added to the sample after incubation. Viral RNA was extracted using the ZR viral RNA kit (Zymo Research). RNA was converted into cDNA and amplified using the WTA2 kit (Sigma-Aldrich). Prior to sequencing, samples were processed using a ZR-DNA Clean & Concentrator-5 kit, and DNA was quantified with a Quant-iT PicoGreen dsDNA assay kit. Samples were prepared for Illumina sequencing using the Nextera XT DNA library preparation kit prior to 2 ϫ 250-bp paired-end Illumina MiSeq sequencing at the Cornell Core Biotechnology Resource Center Genomics Facility.
Metagenome-derived viral genome discovery. Raw paired-end reads were quality trimmed to remove Illumina adapters and phiX contamination. Reads were merged and normalized to a target depth of 100 and a minimum depth of 1 with an error correction parameter. Read quality filtering, trimming, contamination removal, merging, normalization, and read mapping were performed using the BBtools suite (35). Both merged and unmerged reads were used for de novo assembly using the default parameters excluding the read error correction option in SPAdes v3.11.1 (36). Contigs shorter than 3000 nt were discarded after assembly, and the remaining contigs were subjected to tBLASTx against a curated in-house database containing 453 genomes from all nine families of eukaryotic ssDNA viruses. Contigs with significant sequence similarity at an E value of Ͻ1 ϫ 10 Ϫ8 to a densovirus genome were reviewed in Geneious version 9.1.5 (37). Open reading frames (ORFs) were called in Geneious using a minimum size of 550 nt with a standard genetic code and a start codon of ATG. Hairpin structures were identified using Mfold (38). After verification of the contigs as densovirus sequences, reads were mapped back to contigs with a minimum identity of 0.95 to obtain average read coverage and total reads mapped to contigs (Table S5). All densovirus sequences have been deposited in GenBank (see below). In addition to the 5 viral metagenome libraries sequenced in this study, we reanalyzed 30 DNA and 21 RNA viral metagenomes published elsewhere (NCBI BioProject numbers PRJNA253121, PRJNA417963, and PRJNA637333) using the assembly approach described above (Table S4) (11,17).
Sea star transcriptome analysis. A total of 179 sea star transcriptome-sequenced (RNA-seq) paired-end libraries were downloaded from the NCBI database (Table S6). FastX was used to remove reads with lengths of Ͻ50 nt and a quality score of Ͻ30 (39). Trimmomatic was used to trim adapters (40). Libraries were assembled using default parameters in Trinity v2.1.1 (41). Assembled contigs were annotated against a protein database of sea star densoviruses using DIAMOND with an E value cutoff of Ͻ1 ϫ 10 Ϫ5 (42). Contigs with significant similarity to sequences in the sea star-associated densovirus protein database were isolated, ORFs were called, and amino acid sequences from ORFs were further checked by BLASTp against the NCBI nonredundant database. Top BLAST results for each densovirus-like sequence against the NCBI nonredundant database were downloaded for amino acid MUSCLE alignment and visualized using Geneious (37,43). To verify if the sequences were from the host, BLASTn was performed querying isolated densovirus-like sequences against available sea star genomes (Asterias rubens, NCBI taxid 7604, and Acanthaster planci, NCBI taxid 133434).
Densovirus phylogenetics. Phylogenetic analysis was performed on 89 densovirus sequences that included 39 sea star-associated densoviruses and 50 densovirus sequences from complete or nearly complete genomes available in the NCBI database. All densovirus sequences included in the phylogeny were from extant viruses (i.e., no endogenized densovirus elements). Amino acid sequences from the NS1 gene were aligned with MUSCLE using default parameters (43). The region of NS1 used for alignment (sequence length of 434.9 Ϯ 40.2 [mean Ϯ standard deviation]) spanned motif I of the replication initiation motifs past Walker C of the Walker box ATPase motifs. Phylogenetic relationships between densovirus genomes were inferred by an LG ϩ G ϩ I ϩ F substitution model selected by smart model selection (SMS) in PhyML 3.0 (44). Branch support was determined by bootstrapping for 100 iterations. The resulting maximum likelihood phylogenetic tree was visualized and annotated using iTOL (45). CD-HIT was used to identify viral species using an 85% amino acid sequence identity of NS1 (18,46). Data availability. Densovirus genome sequences have been deposited in GenBank under accession numbers MT733013 to MT733051 (Table 1). Raw viral metagenomic sequence data have been deposited under BioProject numbers PRJNA253121, PRJNA417963, and PRJNA637333 (Table S4). Assem-