Regional centromere configuration in the fungal pathogens of the Pneumocystis genus

ABSTRACT Centromeres are constricted chromosomal regions that are essential for cell division. In eukaryotes, centromeres display a remarkable architectural and genetic diversity. The basis of accelerated centromere evolution remains elusive. Here, we focused on Pneumocystis species, a group of mammalian-specific fungal pathogens that form a sister taxon to Schizosaccharomyces pombe, an important genetic model for centromere biology research. Methods allowing reliable continuous culture of Pneumocystis species do not currently exist, precluding genetic manipulation. CENP-A, a variant of histone H3, is the epigenetic marker that defines centromeres in most eukaryotes. Using heterologous complementation, we show that the Pneumocystis CENP-A ortholog is functionally equivalent to CENP-A Cnp1 of S. pombe. Using organisms from a short-term in vitro culture or infected animal models and chromatin immunoprecipitation (ChIP)-Seq, we identified CENP-A bound regions in two Pneumocystis species that diverged ~35 million years ago. In each species, each of the 16–17 monocentric chromosomes has a unique short regional centromere (<10 kb) flanked by heterochromatin. These centromeres span active genes and lack conserved DNA sequence motifs and repeats. These features suggest an epigenetic specification of centromere function. Analysis of centromeric DNA across multiple Pneumocystis species suggests vertical transmission dating back at least 100 million years. The common ancestry of Pneumocystis and S. pombe centromeres is untraceable at the DNA level, but the overall architectural similarity could be the result of functional constraint for successful chromosomal segregation.

IMPORTANCE Pneumocystis species offer a suitable genetic system to study centromere evolution in pathogens because of their phylogenetic proximity with the non-pathogenic yeast S. pombe, a popular model for cell biology. We used this system to explore how centromeres have evolved after the divergence of the two clades ~460 million years ago. To address this question, we established a protocol combining short-term culture and ChIP-Seq to characterize centromeres in multiple Pneumocystis species. We show that Pneumocystis have short epigenetic centromeres that function differently from those in S. pombe.

There is a remarkable diversity in centromere structures, ranging from sequence-dependent (point) centromeres to epigenetically regulated (regional) centromeres.
Point centromeres are exemplified by Saccharomyces cerevisiae, with genetically defined 125-bp long centromeres containing conserved DNA elements (I, II, and III) that serve as anchors for the recruitment of the centromeric DNA binding factor 3 (CBF3) complex (5). Regional centromeres are much longer, lack universally conserved DNA patterns, and are often made up of repetitive DNA (6). Regional centromeres are not strictly defined by DNA but are rather epigenetically regulated; however, which factors contribute to this epigenetic specification of centromeres remains elusive. In fungi, regional centromeres are classified as short [<20 kb, e.g., Candida albicans (7)], intermediate [e.g., Schizosaccharomyces pombe (8)], and long [150-300 kb, e.g., Cryptococcus (9)].
Pneumocystis is a genus of pathogenic fungi that exclusively infect mammals and remain unculturable. They belong to the Taphrinomycotina subphylum and form a monophyletic clade with fission yeasts Schizosaccharomyces, with a separation time estimated at ~460 million years ago (10, 11). In addition to these two classes, there are other classes, all of which are plant or soil-adapted organisms. Compared with S. pombe, which is a tractable organism that is widely used as a model for chromosome and cell biology, Pneumocystis species have streamlined genomes due to substantial gene losses during their transition to animal parasitism. Recently, the genomes of multiple Pneumocystis species have been sequenced (11-14). However, centromeres were not defined in these genomes because they could not be predicted bioinformatically without a reference for synteny-based detection. Alternative methods such as the circular chromosome conformation capture assay (4C), which have been used to predict centromere loci in fungal genomes (15), are not applicable to Pneumocystis because they require large quantities of pure DNA from synchronized cell cultures. No long-term culture system exists, and during purification of Pneumocystis organisms from infected lungs, it is virtually impossible to completely eliminate host cell DNA. Pneumocystis have retained CENP-A (11, 14), which presumably binds to centromeres, and most of the kinetochore proteins. Some data suggest that Pneumocystis centromeres differ structurally from those of S. pombe. For example, the heterochromatin protein Swi6/HP1 (16), which is required for the activation of replication origins at the centromere flanking regions (pericentromeres), has not been found in Pneumocystis genomes (17). The RNA interference (RNAi) pathway, which is essential to centromere function, has been lost in Pneumocystis (18), further supporting a mechanistic difference from S. pombe.
In this work, we undertook to characterize Pneumocystis centromeres and compare them to Schizosaccharomyces pombe centromeres. We found that Pneumocystis have small regional centromeres that are defined by CENP-A and flanked by heterochromatin, resembling those of S. pombe. However, Pneumocystis species lack orthologs for genes required for centromere function and maintenance in S. pombe, suggesting that their centromeres may function differently.
Using a combination of protein domain hidden Markov models and BLAST scans followed by phylogenetic assignment (see Materials and Methods), we searched for homologs of CENP-A Cnp1, CENP-C Cnp3, CCAN (Constitutive Centromere Associated Network), and KMN (KNL-1/Mis12 complex/Ndc80 complex) proteins. We extended our searches to pathways required for centromere function and chromosomal segregation (heterochromatin, RNAi, and DNA methylation) (Fig. S1).
CENP-A, CENP-C, and outer kinetochore proteins are conserved across Taphrinomycotina. The CENP-A histones from Pneumocystis species, henceforth referred to as pnCENP-A, display a high overall conservation (89% average protein identity). Two genes of the inner kinetochore (CCAN) that are not found in Pneumocystis are innovations in S. pombe (fta2 and fta3 genes). The fta2 and fta3 gene products associate with the central core (cnt) and innermost repeats (imr) region of the centromere in S. pombe (19). No ortholog for the mis17 gene can be found in Pneumocystis. In S. pombe, mis17 encodes a member of the Mis6-Mal2-Sim4 multiprotein complex required for CENP-A recruitment (20). The S. pombe centromere-associated protein B genes (cbp1, cbh1, and cbh2), which are derived from the domestication of pogo-like transposases (21), have no identified homologs in other fungi.
The chromosomal passenger complex is a heterotetrameric complex composed of the Aurora B kinase Ark1 and three regulatory components, Pic1 (inner centromere protein), Bir1p (Survivin), and Nbl1 (Borealin), which mediates chromosome segregation and cytokinesis (22). This pathway is conserved in Pneumocystis and S. pombe.
The CENP-A recruiting complex, which includes Mis18 and Mis16 (23), and the CENP-A histone chaperone Scm3 required for CENP-A loading at the centromere (24), are conserved in Pneumocystis, as are the monopolin complex genes that coordinate the kinetochore-microtubule attachment during mitosis (25).
The network for heterochromatin formation in Pneumocystis seems intact, with the presence of the Clr4 methyltransferase complex (Clr4/SUV39H). Pneumocystis species have a single chromodomain-containing protein that bears similarity with S. pombe Swi6 and Chp2, which are HP1 inparalogs (Fig. S2).
Overall, the Pneumocystis kinetochore (CCAN) protein catalog is relatively conserved and similar to that of other Taphrinomycotina fungi.

pnCENP-A Cnp1 localizes to the nucleus
Centromeres are identified by tracking CENP-A binding regions. Precise nuclear localization of CENP-A is required for accurate chromosomal segregation. To determine if pnCENP-A localizes to the nucleus, we used an anti-Pneumocystis CENP-A antibody, in conjunction with an anti-Pneumocystis antibody (29) and DAPI. Using indirect confocal immunofluorescence microscopy, we found that pnCENP-A displays a nuclear peripheral localization in replicating organisms from infected lung tissues and cultures (Fig. 1A and B).

pnCENP-A is functional in S. pombe and supports viability and centromere loading
To confirm that the putative cenp-a gene from P. murina (Pmcenp-a) encodes a functional CENP-A protein, we expressed Pmcenp-a in S. pombe. P. carinii CENP-A has only one amino acid difference with P. murina CENP-A (Fig. 2A). Pmcenp-a was codon-optimized, expressed under the promoter of the endogenous S. pombe cnp1+ gene (the CENP-A ortholog), and subcloned into a standard LEU2-based multicopy plasmid. In addition, the gene product was tagged with GFP at its N-terminus because C-terminal tagging (i.e., cnp1+-GFP) has been shown to cause growth retardation (30). We first assessed the ability of Pmcenp-a to rescue the lethality of cnp1∆ cells. We performed plasmid shuffling assays, in which cnp1∆ cells were initially kept viable by supplying a copy of cnp1+ on a plasmid carrying the ura4+ selection marker. A rescue plasmid carrying Pmcenp-a and the LEU2 selection marker was then transformed into the cells, followed by growth on medium containing 5-fluoroorotic acid (FOA), which selects for loss of the ura4+-marked cnp1+ plasmid (Fig. 2B). As seen with control cells expressing the S. pombe GFP-CENP-A gene, we found that PmCENP-A also fully supports the viability of cnp1∆ cells, which grew upon FOA selection (Fig. 2C). We also observed no differences in growth between cnp1∆ cells expressing either S. pombe or P. murina CENP-A (Fig. S3). We next asked if PmCENP-A can also rescue the lethality of cnp1-76 cells, a thermosensitive mutant whose mutation T74M lies in the CATD (CENP-A targeting domain) region (31). As previously reported (32), we found that the S. pombe CENP-A tagged with GFP restored growth of cnp1-76 cells at restrictive temperatures (i.e., 34°C). Importantly, the P. murina counterpart was also able to rescue the thermosensitive phenotype (Fig. 2D), proving the functionality of this gene. In interphase, fission yeast displays the Rabl configuration, in which centromeres cluster at the spindle pole body while telomeres associate with each other near the nuclear periphery (33, 34). As expected, expressing GFP-CENP-A from S. pombe resulted in cells displaying single fluorescence foci that colocalized with a tetO array inserted at cen2 (NB: a TetR-tdTomato fusion is expressed to bind to the tetO array and this locus is referred to as cen2-tetO-tdTomato) (Fig. 2E). Consistent with the rescue of cnp1-ts and cnp1∆ cells described above, and further supporting its role as a functional CENP-A protein, P. murina CENP-A protein also colocalized with cen2-tetO-tdTomato and remarkably exhibited single foci, although the centromere foci were dimmer than those of the S. pombe counterpart (Fig. 2F). Finally, to confirm the specific localization of PmCENP-A to centromeres, we performed ChIP-qPCR (Fig. 2G). Notably, we observed that PmCENP-A localizes to the central core regions (i.e., cc1&3 and cc2:ura4+) similar to the fission yeast counterpart, although the enrichment was lower, consistent with the results from live-cell imaging. Additionally, PmCENP-A localization seems to be specific, as no detectable enrichment was found at pericentromeric outer repeats coated with heterochromatin (i.e., dg) or at euchromatin locations (i.e., fbp1). Together, these results confirm that the P. murina cenp-a gene encodes a bona fide CENP-A protein.

pnCENP-A binds to single genomic foci in Pneumocystis-replicating cells
In S. pombe, CENP-A Cnp1 is deposited at the centromeres during the G2 phase of the cell cycle (30). Although Pneumocystis cell replication is not fully understood, these organisms take 5-8 days to replicate in vivo (35). To determine pnCENP-A bound genome regions, we established a short-term co-culture system for P. murina and P. carinii (Fig. 3A). We assessed organism growth using quantitative PCR targeting the single-copy gene dihydrofolate reductase (dhfr); a typical growth curve shows a decline from day 0 to 7 followed by a gradual increase (Fig. 3B and C). Population growth is further supported by the presence of mitotic or fusing cells 7 days post culture, though some cells do not appear to be dividing (Fig. 3D).
We mapped CENP-A bound genomic regions in all P. carinii and P. murina chromosomes using ChIP-Seq from cultured organisms collected at days 0 (normalized initial inoculum), 7, and 14.
In P. murina, each of the 17 chromosome-level scaffolds displays a single region containing peaks of CENP-A at all three time points (Fig. 4A and B; Fig. S4). We confirmed CENP-A enrichment by ChIP-qPCR (Fig. S5A). CENP-A enrichment has a bimodal distribution, suggesting two distinct populations of CENP-A in chromosome arms. The two CENP-A peaks are separated by a 1-kb non-coding region that lacks a conserved DNA motif. Although the significance of two peaks of CENP-A in Pneumocystis is unclear, in S. pombe, CENP-A cores are interspersed with H3K4me (36). CENP-A enrichment levels fluctuate according to the growth curve, that is, the highest level is observed at day 0, followed by a decline at day 7 and a recovery at day 14. In S. pombe, CENP-A Cnp1 enrichment at the centromeres correlates with a depletion of histone H3 (37). At day 7, the H3 occupancy (expressed as the ratio between H3 and H4) is significantly reduced at centromeres relative to flanking regions (30 kb) as estimated by ChIP-Seq (Wilcoxon test; P < 0.0001; Fig. S6), which suggests that CENP-A substitutes for the canonical H3. However, H3 depletion at the centromeres does not persist after 14 days of culture (Fig. S7A).

FIG 1 (legend, partial) Bottom panels 1-8 are magnified fields. Most anti-pnCENP-A labels overlap or are close in proximity with DAPI (represented by a pink color produced by the overlap between red and blue colors). Some organisms are not labeled by 7C4 antibody (e.g., box 5), which suggests that the epitope is not expressed. This field does not contain any host cells (rat). Scale bar = 2 µm. (B) In plane and orthogonal views of immunofluorescence labeling of P. carinii organisms co-cultured with mammalian cells for 7 days. Organisms were triple labeled using an anti-CENP-A antibody (red), a different anti-Pneumocystis antibody (RAE7) targeting Pneumocystis major surface glycoproteins (green), and DAPI (blue). Panels 1-3 are magnified fields. CENP-A is located inside Pneumocystis cells at the periphery of the nucleus. A few organisms are labeled only with DAPI and do not express detectable levels of CENP-A and major surface glycoprotein. A rat cell nucleus is labeled with the letter "H." Scale bar = 1 µm.

FIG 2 (legend, partial) ura4+ insertion at central core 2; dg, a class of heterochromatic outer repeats within pericentromeric regions (control); fbp1, euchromatic locus (control).
P. carinii also displays monocentric chromosomes enriched with two CENP-A peaks at the three time points by ChIP-Seq (Fig. 4C and D; Fig. S8) and ChIP-qPCR (Fig. S5B). H3 depletion is observed at day 14 (Fig. S7B). In contrast to P. murina, we found a significant reduction of CENP-A enrichment at day 14 despite population growth (Fig. 3C). Also unlike P. murina, at day 14, low levels of CENP-A are present outside the primary CENP-A binding region in several chromosomes (Fig. S8), which likely correspond to a non-specific displacement of CENP-A. In budding yeasts and chicken cells, low levels of CENP-A molecules have been detected outside the core domain (CENP-A cloud), which contribute to centromere plasticity (38).
In both species, CENP-A bound regions span 4.8-8.0 kb, with an average length of 6.7 kb, and have an average of 1.03% lower GC content than the rest of the genome (Table 1). Dot plot analysis of a concatemer of the different centromeres shows that each CENP-A binding region is unique within the genome (Fig. 4E and F) and lacks a shared conserved DNA motif (see Materials and Methods). Because most centromeres in fungi are associated with repetitive DNA (e.g., DNA transposons and retrotransposons), we searched for known signatures of repeats and found no significant overlap of repeats with centromeres (Fig. S6 and S9).
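The GC content comparison reported above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names and the toy sequences are hypothetical, and the reference sequence passed in is assumed to already exclude the centromeric region.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Ambiguous bases such as N are ignored; an empty sequence yields 0.0.
    return gc / total if total else 0.0


def gc_difference_percent(region: str, reference: str) -> float:
    """GC percentage of the region minus that of the reference (percentage points)."""
    return 100.0 * (gc_fraction(region) - gc_fraction(reference))


# Toy sequences for illustration only (not Pneumocystis data).
toy_region = "ATATATGC"         # 2 of 8 bases are G or C
toy_reference = "ATGCATGCATGC"  # 6 of 12 bases are G or C
print(gc_difference_percent(toy_region, toy_reference))
```

A negative value indicates that the region is more AT-rich than the reference, which is the direction of the difference reported for the CENP-A bound regions.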

Pneumocystis centromeres contain active genes
By cross-referencing annotated gene locations with those of pnCENP-A bound regions, we found that P. murina and P. carinii centromeres encode 74 and 58 genes, respectively; all of them are conserved in the genomes of both species except a prefoldin gene that is lost in P. carinii (Table S2 at https://doi.org/10.5281/zenodo.10574230). Of the 74 P. murina centromeric genes, 73 were expressed based on RNA-seq and four were further detected by protein mass spectrometry mapping (LC-MS); of the 58 P. carinii genes, 56 were expressed and 53 were detected by LC-MS. Analysis of the predicted function of these genes revealed housekeeping functions without major differences compared with randomly sampled genes.
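Cross-referencing gene annotations with CENP-A bound regions amounts to an interval overlap query. The sketch below shows one way to do this in Python; the coordinate convention (0-based, half-open), the `Interval` type, and the example identifiers are assumptions made for illustration and are not taken from the study.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def genes_in_regions(genes: List[Interval], regions: List[Interval]) -> List[str]:
    """Names of genes overlapping any region, in input order."""
    return [g.name for g in genes if any(overlaps(g, r) for r in regions)]


# Hypothetical coordinates for illustration only.
regions = [Interval("chr1", 1000, 7000, "CEN1")]
genes = [
    Interval("chr1", 500, 1200, "geneA"),   # partially overlaps CEN1
    Interval("chr1", 7000, 8000, "geneB"),  # adjacent, no overlap
    Interval("chr2", 1500, 2500, "geneC"),  # different chromosome
]
print(genes_in_regions(genes, regions))
```

For genome-scale annotations, a sorted sweep or an interval tree would scale better than this pairwise scan, but the overlap criterion is the same.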

Centromeres are flanked by heterochromatin
In S. pombe, H3K9me2/3 marking of chromatin is associated with the repression of centromeres, subtelomeres, ribosomal DNA (rDNA), and the mating locus (36). There are two types of chromatin: euchromatin, the lightly packed form of chromatin enriched with H3K4me2 (di-methylation of the fourth lysine residue of histone H3), which is associated with active transcription, and heterochromatin, enriched with H3K9me2 and H3K9me3 (di- and tri-methylation of the ninth lysine residue of histone H3). To test if this feature is shared in Pneumocystis, we performed ChIP-Seq with antibodies targeting histone H3 modifications (H3K9me2/3 and H3K4me2). In Pneumocystis, we detected broadly distributed peaks of H3K9me2 and H3K9me3 bordering the putative centromeres delineated by CENP-A narrow peaks (Fig. 5A and B; Fig. S6 and S9). This configuration of chromatin markers is specific to centromeric regions. H3K4me2 is correlated with H3K9me2 in the centromeres (Pearson rho = 0.72) but not at the whole genome level (rho = 0.05). However, research has shown that the histone modifications H3K4me2 and H3K9me2 decorate euchromatin and heterochromatin regions, respectively (39). Therefore, the correlation observed between H3K4me2 and H3K9me2 may be due to differences in growth kinetics for the Pneumocystis cell population used in the ChIP analysis. Furthermore, similar to S. pombe (36), a small amount of H3K4me2 is present at the centromeres in Pneumocystis (Fig. 5A and B; Fig. S6 and S9). These results suggest that Pneumocystis centromeres are flanked by high levels of heterochromatin.

FIG 4 (legend, partial) ... and Let-1 cells for 7 days (input subtracted). Data for the full experiment covering days 0, 7, and 14 are presented in Fig. S6 and supplemental data at https://doi.org/10.5281/zenodo.10574230. (E) A DNA dot plot of CENP-A binding regions in P. murina showing regional self-similarity. The main diagonal represents the sequence alignment with itself. Lines off the main diagonal, which would indicate repetitive patterns within the sequences, are not observed. The plot shows that each CEN is unique within the genome. (F) DNA dot plot of CENP-A binding regions in P. carinii showing that each CEN is unique and repeat free.
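As an illustration of the correlation analysis, the following Python sketch computes a Pearson correlation coefficient between two ChIP-Seq signal tracks summarized over the same genomic bins. The binning scheme and the toy values are hypothetical; the study does not specify how signals were binned.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length signal vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


# Toy per-bin enrichment values (hypothetical, not measured data).
h3k4me2 = [0.2, 0.4, 0.9, 1.5, 1.1]
h3k9me2 = [0.3, 0.5, 1.0, 1.2, 1.4]
print(round(pearson(h3k4me2, h3k9me2), 2))
```

Applying the same function to bins restricted to centromeres and to bins covering the whole genome would give the two coefficients that are contrasted in the text.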
DNA methylation is frequently associated with centromeres in fungi, where it silences repeats (40, 41). Similar to S. pombe, DNA methylation has been predicted to be absent in Pneumocystis based on the absence of DNA methyltransferases (DMTs) (27, 28). To determine whether DNA methylation plays a role in Pneumocystis centromeres, we performed bisulfite sequencing to detect 5-methylcytosine (5mC) in P. carinii. The overall level of 5mC DNA methylation, as measured by the average weighted methylation percentage, is 0.6% for P. carinii at the CG dinucleotides (Fig. S10). These levels are in the range of levels reported for other fungi, e.g., Verticillium (0.4%) (42). The presence of 5mC methylated DNA bases despite the absence of recognizable DNA methyltransferases in Pneumocystis requires further investigation. To assess the potential role of DNA methylation in centromere function in Pneumocystis, we analyzed the DNA methylation patterns over different genomic features (genes, intergenic spacers, and centromeres). However, there is no significant difference between centromeric or pericentromeric regions (defined as 30 kb flanking the centromeres) and randomly selected genomic regions (genomic background) (Mann-Whitney U test P-value >0.3). These results suggest that 5mC DNA methylation is not required for centromere function in Pneumocystis. The absence of repeats in the centromeres and the presence of actively transcribed genes suggest that DNA methylation does not induce an epigenetic silencing in Pneumocystis centromeres.
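The text does not give the formula behind the weighted methylation percentage. A commonly used definition, assumed here, pools read counts over all cytosines in a context: the number of reads supporting methylation divided by the total number of reads covering those sites. The Python sketch below implements that assumed definition with hypothetical counts.

```python
from typing import Iterable, Tuple


def weighted_methylation_percent(sites: Iterable[Tuple[int, int]]) -> float:
    """Pooled methylation level over sites given as (methylated_reads, total_reads).

    Assumed definition: 100 * sum(methylated) / sum(total), so deeply covered
    sites contribute more than poorly covered ones.
    """
    methylated = 0
    total = 0
    for m, t in sites:
        if m < 0 or t < m:
            raise ValueError("counts must satisfy 0 <= methylated <= total")
        methylated += m
        total += t
    return 100.0 * methylated / total if total else 0.0


# Hypothetical CG sites: (methylated reads, total reads).
toy_sites = [(1, 10), (0, 30), (3, 60)]
print(weighted_methylation_percent(toy_sites))  # pooled: 4 of 100 reads
```

In this toy example the per-site levels are 10%, 0%, and 5%, so an unweighted mean would be 5%, whereas pooling the reads gives 4%; the difference reflects the weighting by coverage.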
To investigate footprints of selection acting on centromeres, we computed conservation scores (PhastCons) from whole genome alignments. Conservation scores range from 0 to 1 and represent probabilities of negative selection. Pneumocystis genomes encode a large multicopy major surface glycoprotein (MSG) gene family. MSGs are highly polymorphic and evolve rapidly due to their role in antigenic variation during mammalian host infection [reviewed in reference (43)]. Using MSGs as control for evolutionary speed, we found that centromeres and flanking regions are substantially more conserved than MSGs and similar to the genomic background (Fig. 6B and C). The levels of centromeric conservation are variable across the chromosomes, with the flanking regions being more conserved than the cores (defined as a 1-2-kb region in the center of each centromere) (Fig. S12 at https://doi.org/10.5281/zenodo.10574230). Given that macrosynteny at the centromeric regions is conserved across multiple Pneumocystis species, this suggests that centromere locations are maintained by purifying (negative) selection. This would be consistent with the hypothesis that centromere positioning tends to be conserved in obligate sexual fungi due to their role in meiosis (44).
Centromeres contribute to karyotypic diversity in some fungi (2). However, we found little support for this hypothesis here because most centromeres do not overlap with chromosomal breaks (only 3 of 17 in the P. carinii versus P. murina pairwise comparison). This is reminiscent of other fungi such as Verticillium species where centromeres do not account for most of the karyotype variation (45).

DISCUSSION
In the current study, we have identified centromeres of two Pneumocystis species by showing that the centromeric histone CENP-A binds to gene-rich genomic regions that are flanked by heterochromatin (Fig. 7). This configuration suggests the presence of small, epigenetically regulated (regional) centromeres. Within each genome, each chromosome has a unique non-repetitive centromere. Syntenic regions for each centromere are present in all seven currently sequenced Pneumocystis species. Cell replication and chromosomal segregation pathways are unexplored in these species. Central to these pathways are the centromeres.
Because centromeres could not be predicted bioinformatically (no suitable reference for synteny analyses existed before this study), the locations and characteristics of Pneumocystis centromeres had not been reported. Moreover, there are several roadblocks for characterizing centromeres in Pneumocystis, the most significant being the lack of continuous culture and transfection tools. Here, we utilized two Pneumocystis species in a short-term culture system to determine where pnCENP-A, the epigenetic marker for centromeres, binds in the genome.

FIG 6 (legend, partial) ... protein-coding genes and cyan for polymorphic major surface glycoprotein genes), pnCENP-A binding region (centromere), and sequence conservation scores, which were calculated from whole genome alignments of P. carinii, P. murina, and P. wakefieldiae (phastCons). The phastCons scores represent probabilities of negative selection and range between 0 (no conservation) and 1 (total conservation). (C) Boxplot of conservation scores per genomic context summarized for centromere 4 in P. carinii genome, 30-kb regions flanking the centromeres (Cenflk), major surface glycoproteins encoding regions (Msg), and random genomic background (Bckg). Msgs are fast-evolving proteins potentially involved in antigenic variation. Background data were obtained from randomly selected intervals (n = 1 × 10⁶) from genomic regions excluding the above-mentioned regions (CEN, Cenflk, and Msg). Statistical differences for the indicated comparisons were obtained using a one-sided non-parametric Mann-Whitney test; ****P < 0.0001, ***P < 0.001, **P < 0.01, *P < 0.05. Data for all 17 centromeres are presented in Supplementary material.
As a first step, we demonstrated that pnCENP-A functions similarly to CENP-A Cnp1 in S. pombe, though less efficiently, validating its use in characterizing Pneumocystis centromeres. We then developed a pnCENP-A-based ChIP-Seq assay that showed a reproducible efficiency in defining centromeric regions of Pneumocystis species.
Unlike the phylogenetically related S. pombe, in Pneumocystis, these regions overlap with active genes, but like S. pombe, these centromeres are flanked by heterochromatin. Our work also provides the first experimental evidence that DNA methylation occurs in these species, although it may not be involved in centromere function.

FIG 7 (legend, partial) ... (46), Candida albicans (7), Zymoseptoria tritici (44), and Cryptococcus neoformans (9) are used as representatives to showcase the diversity of regional centromeres in fungi. Animal pathogens are highlighted in pale-olive, and the plant fungal pathogen Zymoseptoria tritici is in pale-green. For the sake of brevity, only Pneumocystis carinii is presented. In the middle are presented DNA structures of centromeres.
P. carinii has 17 centromeres (one for each of its 17 chromosomes) that share the same overall architecture. Centromeres are delineated by a localized enrichment of the centromeric histone CENP-A, which overlaps with a reduction of the canonical histone H3 (inverted grey triangle). Centromeres are flanked by heterochromatin H3K9me (here stands for both H3K9me2 and H3K9me3). All 17 Pneumocystis carinii centromeres span active genes (dark-gray boxes). Each centromere sequence is different and lacks a shared DNA sequence motif. S. pombe has three centromeres that share the same overall structure in which a central core (cnt) domain is surrounded by innermost repeats (imr) and outer repeats (otr). The imr repeats incorporate clusters of transfer RNAs (tRNAs) that play a role in restricting CENP-A spread. S. pombe centromeres are flanked by heterochromatin (H3K9me). Genes are found 0.75-1.5 kb beyond the limits of the centromeres.
C. albicans has eight unique and different centromeres that are gene free and lack shared sequence motifs. Z. tritici has 21 centromeres ranging from 6 to 14 kb in size that partially overlap with genes. C. neoformans has 14 centromeres that are gene free and enriched with Tcn transposons.
Despite their close phylogenetic relationship, P. murina and P. carinii have slightly different growth kinetics in our short-term culture system, which could explain the variations in the pnCENP-A and H3 DNA binding profiles. The non-specific binding of pnCENP-A observed in P. carinii at day 14 is likely the result of a random dissociation of CENP-A molecules from the centromeres.
In P. murina, 5 out of the 17 chromosomes showed secondary CENP-A enrichment at non-centromeric sites (Fig. S4). These secondary CENP-A peaks are likely ChIP-Seq artefacts because they were only detected at day 7 post culture and mostly overlap with highly expressed loci (Fig. S8), which are susceptible to misleading signals in ChIP-Seq experiments (47).
The presence of genes within centromeres is rare and has only been described in rice (48) and the plant fungal pathogen Zymoseptoria tritici (44). Centromeres bearing genes are interpreted as young centromeres (neocentromeres), in which the genes are progressively inactivated. This hypothesis lacks support here because there is no sign of pseudogenization in these genes: nearly all are transcribed and translated, they are conserved in all species, and many are involved in housekeeping cellular pathways that are presumably critical for organism survival.
Findings in Pneumocystis cannot be generalized to other Taphrinomycotina species. The determination of ancestral traits between Pneumocystis and fission yeast centromeres will require characterizing the centromeres in additional Taphrinomycotina species. The loss of RNAi is often associated with shortening of centromeres (9); consistent with this, Pneumocystis centromeres are much smaller than Schizosaccharomyces centromeres.
Our study will benefit from a microscopy validation of CENP-A loading kinetics when a reliable long-term culture system becomes available. Live imaging would help to determine at which cell cycle stage CENP-A is loaded. A direct genetic confirmation of Pneumocystis centromeres will also be required when a long-term culture system and genetic manipulation tools become available.
In summary, we have identified short regional centromeres in genetically intractable micro-organisms. Our results provide insights into the formation of centromeres in host-adapted fungal pathogens. Our ultimate goal is to use Pneumocystis centromeres to stabilize plasmids for genetic manipulation. This is the first step along this path, which should lead to better understanding of the biology of Pneumocystis and facilitate the discovery of novel interventions to effectively control and prevent the disease caused by this pathogen.

Ethics and organism source
Studies involving mouse and rat samples were approved by the NIH Clinical Center Animal Care and Use Committee (protocol CCM 19-05), and rat Pneumocystis studies by the Animal Care and Use Committees of the Cincinnati VA Medical Center (protocol 20-11-08-01).
P. carinii organisms were collected from heavily infected lungs of corticosteroid-treated Sprague-Dawley male rats and P. murina from heavily infected lungs of CD40 ligand knockout mice. The list of samples is presented in Table S3 at https://doi.org/10.5281/zenodo.10574230.

Pneumocystis culture and growth quantification
P. murina and P. carinii were partially purified by Ficoll-Hypaque density gradient centrifugation (49) and frozen at −80°C in cell recovery media (GIBCO). For short-term Pneumocystis cultures, A549 (ATCC) and LET1 (a gift from Dr. Paul Thomas, St. Jude Children's Research Hospital) cells were cultured in F12 culture medium with 2.5% or 5% heat-inactivated fetal bovine serum (GIBCO) and Penicillin-Streptomycin (GIBCO), plated to approximately 60%-80% confluency, and incubated for 24 hours at 37°C. The next day, frozen Pneumocystis vials were thawed, washed in 50 mL 1× PBS, and centrifuged at 2,000 × g for 20 minutes. Pneumocystis cell pellets were resuspended in culture medium and added to the plated cells. Media were partially changed every 3 or 4 days. At the set time points (days 7 and 14), wells were scraped for collection. Cell/organism suspensions were centrifuged at 10,000 × g for 3 min, the supernatant was removed, and the pellets were frozen at −80°C. Genomic DNA was extracted using the QIAamp DNA Extraction Kit (Qiagen). Quantitative PCR targeting the single-copy dhfr gene was performed as described previously (50).

P. murina CENP-A Cnp1 complementation and ChIP-qPCR in S. pombe
Standard procedures were used for fission yeast growth, genetics, and manipulations (51). The sequence encoding P. murina CENP-A was codon optimized for S. pombe codon bias, synthesized commercially (GenScript, New Jersey), and subcloned into the pS2 vector (52), generating pPmCENP-A. NdeI-SphI fragments from both pS2 and pPmCENP-A were subcloned into the same sites of the pREP81 vector to obtain LEU2-based multicopy plasmids that were used in the experiments presented in Fig. 2B through F. Alternatively, plasmids were linearized with NsiI and integrated by transformation at the lys1+ locus of the cen2-tetO-tdTomato strain (53). These strains bearing single-copy integrations were grown overnight at 30°C in rich medium YEA and used for ChIP with 2 µL of anti-GFP (ab290, Abcam). PCR oligonucleotides detecting central core (cc1&3; ura4), heterochromatic dg, and euchromatin control fbp1 were previously reported (54).

Ortholog identification
To find orthologs for kinetochore proteins, we constructed a database of 21 fungal proteomes: 15 from Taphrinomycotina, including 7 Pneumocystis species, and 6 from representative distantly related fungi (Table S1 at https://doi.org/10.5281/zenodo.10574230). We queried the database with S. pombe proteins as seeds using BLASTp (55) with an e-value of 1.0 × 10⁻⁵ as threshold.
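The e-value filtering step can be illustrated with a minimal sketch. It assumes the BLASTp results were written in the standard 12-column tabular format (`-outfmt 6`), where the e-value is the 11th column; the output format actually used is not stated in the text. The sequence identifiers and the function name `filter_blast_hits` are hypothetical.

```python
from typing import Iterable, List, Tuple

# Threshold quoted in the text; hits with a larger e-value are discarded.
EVALUE_THRESHOLD = 1.0e-5


def filter_blast_hits(
    lines: Iterable[str], max_evalue: float = EVALUE_THRESHOLD
) -> List[Tuple[str, str, float]]:
    """Return (query, subject, e-value) for hits at or below max_evalue.

    Assumes BLAST tabular output (-outfmt 6): tab-separated fields with the
    query ID in column 1, subject ID in column 2, and e-value in column 11.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue <= max_evalue:
            hits.append((fields[0], fields[1], evalue))
    return hits


# Hypothetical hits: the first passes the threshold, the second does not.
example = [
    "cnp1\tPmur_0001\t62.5\t120\t45\t0\t1\t120\t30\t149\t3e-40\t150",
    "cnp1\tPmur_0002\t28.0\t50\t36\t0\t5\t54\t10\t59\t0.02\t30.1",
]
print(filter_blast_hits(example))
```

In practice the same cutoff can also be applied at search time, so a post hoc filter like this is mainly useful when several thresholds are compared on one result file.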
When many potential homologs were detected in Pneumocystis species, we separated orthologs from out-paralogs by searching in orthologous groups inferred from the clustering of the whole proteome data set using OrthoFinder v2.5.2 (56). A sequence was considered an ortholog if it was assigned to the same orthologous group as the S. pombe protein query. We verified that the assignment was consistent with the EggNOG database v6.0 (57), which contains all S. pombe proteins and a limited number of Pneumocystis proteins. If a Pneumocystis sequence was detected by BLASTp but not assigned as an ortholog, we first identified conserved domains in all sequences using InterProScan version 5.64-96.0 (58). Then, we constructed a multiple sequence alignment containing all sequences, including the S. pombe query, using MAFFT v7.467 (59) with either the E-INS-i or the L-INS-i option, depending on the protein domain architecture. Alignments were manually curated using MacVector version 18.6.1 (https://macvector.com/) to remove spuriously aligned regions and used to infer an approximately maximum likelihood gene tree using FastTree v2.1.10 (60). When necessary, a maximum likelihood phylogeny was inferred using IQ-TREE version 1.5.5 (61). For histones, we used protein sequences identified by reference (62) to guide the classification. In case of inconclusive phylogenetic classification, we BLASTed Pneumocystis protein sequences against NCBI nr using an e-value of 1.0 × 10⁻⁵ as threshold. Retrieved proteins were added to the existing alignment, and the phylogenetic inference was repeated.
If no homologs were found in Pneumocystis using the S. pombe query, we used a different seed from another Taphrinomycotina if available (e.g., Saitoella). Alternatively, proteins from Saccharomyces cerevisiae, Neurospora crassa, and Cryptococcus neoformans were used as seeds. If no homologs were detected, we used PSI-BLASTp (55). In parallel, if the S. pombe query contained a conserved protein domain, we screened our fungal database with the corresponding Pfam domain using hmmsearch (HMMER version 3.3.2; http://hmmer.org) (63) with a cutoff e-value of 0.1. Because many genes could not be found in Pneumocystis proteomes, we scanned the genomes using TBLASTn (BLAST+ 2.13.0) (64) with an e-value of 0.1 as threshold and Exonerate (65) using the protein-to-genome option. Protein-to-genome alignments were translated into protein sequences and aligned, and phylogenetic inference was performed as described above.
To identify DNA methylases within the Taphrinomycotina, we identified annotated proteins with a DNA methylase domain [Pfam domain PF00145 (66)] using hmmsearch (HMMER version 3.3.2; http://hmmer.org) (63) with a cutoff e-value of 0.01. Although no hits were found in Pneumocystis proteomes, DNA methyltransferases (DMTs) were detected in other fungi, including Neolecta irregularis and Saitoella complicata, which is consistent with a previous study (27). Pneumocystis proteomes were queried with sequences from N. irregularis and S. complicata using BLASTp (55) with an e-value of 1 × 10⁻⁵ as cutoff. This strategy identified 52 Pneumocystis proteins with local similarities, which were combined with all DMTs identified, aligned using Muscle v3.8.31 (67), and used for phylogenetic inference using FastTree. Then, we BLASTed Pneumocystis proteins against the NCBI nr database (last accessed 12 October 2023), retrieved the top hit sequences, and added them to the multiple alignment. All 52 Pneumocystis sequences belong to related but distinct protein families (e.g., DNA helicases and SNF2 chromatin remodeling ATPases).

Electron microscopy
P. murina and A549-LET1 cells described above were grown on Thermanox coverslips (Ted Pella, Redding, CA, USA), harvested at day 7, fixed with 2% paraformaldehyde/2.5% glutaraldehyde in 0.1 M Sorensen's phosphate buffer, then post-fixed with 1.0% osmium tetroxide/0.8% potassium ferricyanide in 0.1 M sodium cacodylate buffer for 1 hour, washed with buffer, and then stained with 1% tannic acid in dH₂O for 1 hour. After additional buffer washes, the samples were further osmicated with 2% osmium tetroxide in 0.1 M sodium cacodylate and then washed with dH₂O. Specimens were then stained overnight at 4°C with 1% aqueous uranyl acetate. The cells were then washed with dH₂O and dehydrated with a graded ethanol series prior to embedding in Spurr's resin. Thin sections were cut with a Leica UC7 ultramicrotome (Buffalo Grove, IL) prior to viewing at 120 kV on a FEI BT Tecnai transmission electron microscope (Thermo Fisher/FEI, Hillsboro, OR). Digital images were acquired with a Gatan Rio camera (Gatan, Pleasanton, CA).

Antibodies
To obtain an antibody against pnCENP-A, we created an alignment containing CENP-A orthologs from Rattus norvegicus (rat), Mus musculus (mouse), Schizosaccharomyces pombe, and multiple Pneumocystis species (Fig. S13A at https://doi.org/10.5281/zenodo.10574230). We deliberately avoided the C-terminus, which contains the highly conserved histone fold, to minimize the risk of cross-reactivity with the host histones in partially purified Pneumocystis protein preparations. Immunogenic peptides were selected from the N-terminal region. Peptides with sequence similarities to rat and/or mouse proteins (NCBI accession nos. GCF_015227675.2 and GCF_000001635.27, respectively) were excluded using BLASTp (55) with a word size of 6 and an e-value of 1 as cutoff, as well as with pattern matches using Perl regular expressions. Additional searches were performed online at https://www.uniprot.org/peptide-search. We selected one immunogenic peptide conserved in both P. carinii and P. murina. Two rabbits were immunized with 20 µg of the affinity-purified peptide. The polyclonal antibody was purified by affinity column (GenScript). No cross-reactivity against the host cell lysates was seen by Western blots (Fig. S13B at https://doi.org/10.5281/zenodo.10574230). The following commercial antibodies targeting histones were used at 1:100 dilution for ChIP-Seq: H3 (ab1791, Abcam), H3K4me (39142, Active Motif), H3K9me2 (ab1220, Abcam), H3K9me3 (ab8898, Abcam), and H4 (ab1015, Abcam). We verified that the peptides used for immunization for the commercial antibodies were conserved in Pneumocystis proteins.

Western blots
Pneumocystis proteins (antigens) were prepared using glass beads. Partially purified organism pellets were resuspended in 1× PBS buffer. An aliquot of 0.65 mL of 0.5-mm glass beads (Biospec Products Inc.) was added to the suspensions and vortexed for 5 minutes at 4°C. Beads were allowed to settle for less than 30 seconds, and the supernatant was transferred to a new tube for sonication. The samples were centrifuged at 20,000 × g for 10 minutes, and the supernatants were transferred to a new tube. The pellet was resuspended in 20 µL PBS.
For Western blots, cell lysates and formalin-fixed chromatin preparations were run on Tris-Glycine SDS 4-20% WedgeWell gel (Thermo Fisher Scientific) and transferred to nitrocellulose membranes, followed by 1 hour of incubation in blocking buffer (1× PBS + 5% milk + 0.05% Tween 20) with shaking. Nitrocellulose blots were incubated with primary antibodies (1:100) diluted in blocking buffer for 1 hour with shaking, followed by washing and then 1 hour of incubation with a secondary HRP-conjugated goat anti-rabbit or goat anti-mouse IgG antibody (1:2,000) (Jackson ImmunoResearch), and developed with Pierce 1-Step Ultra TMB-Blotting solution (Pierce).

Immunofluorescence microscopy
For culture, cells were plated in chamber well slides (Millicell EZ slide, Millipore). When cells were ready, the medium was removed, and the cells were washed with cold 1× PBS, fixed with 4% formaldehyde for 10 minutes at room temperature, and washed three times. Cells were permeabilized with 0.2% Triton X-100 for 30 minutes, followed by a wash with 1× PBST (PBS with 0.05% Tween 20).
Pneumocystis carinii-infected lung tissues fixed in HistoChoice (VWR Life Science) were embedded in paraffin. Five-micrometer sections were labeled with rabbit anti-Pneumocystis CENP-A antibody (1:200) and mouse monoclonal anti-Pneumocystis carinii antibodies (7C4 and RAE7) (68, 69). The 7C4 antibody reacts with one or two uncharacterized antigens of 40,000 daltons, and RAE7 reacts with the major surface glycoproteins of P. carinii (RAE7 antibody is a gift from Drs. Michael Linke and Peter Walzer, University of Cincinnati College of Medicine). Slides were mounted with Vector TrueView autofluorescence quenching kit with DAPI (Vector Laboratories). Alexa Fluor 555 (1:200)-labeled goat anti-rabbit IgG (A270039, Invitrogen) and Alexa Fluor 488 (1:200)-labeled goat anti-mouse (A11029, Thermo Fisher Scientific) were used as secondary antibodies. Images were acquired using either a Leica SP8 confocal microscope with a 63× (1.4 NA) HC PL APO objective or a Zeiss 880 confocal microscope with a 63× (1.4 NA) PL APO objective. Images from the Zeiss were taken using 405-, 488-, and 561-nm excitation with emission bandwidths of 415-480 nm, 500-550 nm, and 565-700 nm. Images from the Leica were taken using 405-, 488-, and 555-nm excitation with emission bandwidths of 409-491 nm, 504-550 nm, and 560-721 nm. All images were taken with a format of 1,024 × 1,024 pixels with pixel size ranging from 69 to 73 nm and a line average of 2. Interslice distance for z-stack imaging was set to 160 nm and 250 nm with pinhole settings of 0.7 airy units and 1 airy unit for images acquired on the Leica and Zeiss microscopes, respectively. Image deconvolution was performed assuming an idealized point spread function using the Huygens software program (Scientific Volume Imaging, Netherlands). The software program Imaris (Oxford Instruments, Abingdon, UK) was used for image visualization.
S. pombe cells were grown overnight at 30°C in minimal medium PMG lacking leucine until logarithmic phase and then mounted in PMG 2% agarose as described (70). Images were acquired on a Delta Vision Elite microscope (Applied Precision) with a 100× 1.35 NA oil lens (Olympus). Twenty 0.35-µm z sections were acquired to generate maximum intensity projections. Further image processing, including background-subtracted centromere foci intensity measurements, was performed using ImageJ (National Institutes of Health).

ChIP-Seq
Five independent culture experiments with three time points (days 0, 7, and 14) were performed for each species. Purified cells (~5 × 10⁶ cells) and tissue preparations were fixed by 37% formaldehyde treatment. Chromatin preparation, immunoprecipitation (IP), and Illumina sequencing library preparation were performed using the Low Cell ChIP-Seq Kit (Active Motif, Carlsbad, CA) according to the manufacturer's instructions. Pre-immune sera served as negative controls (Input). ChIP-Seq libraries were sequenced commercially (Psomagen, Rockville, Maryland, USA) using Illumina NovaSeq 6000 (150-base paired-end reads).
Because CENP-A is expected to produce sharp peaks, CENP-A ChIP-Seq peak calling was performed using the "callpeak" function of MACS3 v3.0.0a5 (Model-Based Analysis for ChIP-Sequencing) (80) with estimated genome sizes, and peaks were sorted by fold enrichment (cutoff 1.2) and FDR values (broad cutoff 0.1; FDR < 0.05) (raw data are provided in Supplementary data set 4). Enriched peaks were further inspected using deepTools bamCompare versus shuffled non-centromeric genomic regions. Peaks were further examined in conjunction with other data (e.g., conservation score, RNA-seq, methylation data, GC content, and repeats) with IGV (81), ensuring that peaks were present in all IP samples and absent in controls. Because H3K9me2/3 and H3K4me2 are histone modifications with broad distribution, we used SICER (Spatial-clustering method for Identification of ChIP-Enriched Regions) (82) with default parameters for peak calling (raw data are provided in Supplementary data set 4).
Genes located in centromeres were identified from NCBI annotated GFF3 files, which were converted to BED format using custom scripts. Overlaps between gene regions and centromere regions in BED format were identified using BEDTools intersect.
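The overlap test behind this step can be sketched in a few lines. This is an illustration of the interval logic only, not the custom scripts or the BEDTools call used in the study; it assumes BED-style coordinates (0-based start, exclusive end) and reports any overlap of at least 1 bp. All names and coordinates below are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive (BED convention)
    name: str


def overlapping_genes(
    genes: List[Interval], centromeres: List[Interval]
) -> List[Tuple[str, str]]:
    """Return (centromere, gene) name pairs that share at least 1 bp."""
    hits = []
    for cen in centromeres:
        for gene in genes:
            same_chrom = gene.chrom == cen.chrom
            # Half-open intervals overlap when each starts before the other ends.
            if same_chrom and gene.start < cen.end and cen.start < gene.end:
                hits.append((cen.name, gene.name))
    return hits


# Hypothetical coordinates for illustration only.
centromeres = [Interval("chr1", 1000, 9000, "CEN1")]
genes = [
    Interval("chr1", 500, 1200, "geneA"),    # overlaps the left edge
    Interval("chr1", 9000, 9500, "geneB"),   # starts where CEN1 ends: no overlap
    Interval("chr2", 2000, 3000, "geneC"),   # different chromosome
]
print(overlapping_genes(genes, centromeres))
```

The nested loop is quadratic, which is fine for a few dozen centromeres and a few thousand genes; sorted or indexed intervals would be preferable for larger inputs.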

Bisulfite sequencing
Pneumocystis DNA (5 µg) was extracted by a Pneumocystis DNA enrichment protocol (14). Bisulfite conversion, library preparation, and sequencing (Illumina NovaSeq 150-base paired-end reads) were performed commercially (Novogene). Bisulfite conversion of non-methylated DNA was performed using the EZ DNA Methylation Kit. Data analysis was performed using BSMAP and the methratio script (96). Only cytosine positions with more than five-fold coverage were considered. To quantify methylation levels, we used the weighted methylation percentage (42), which is calculated as the number of reads supporting methylation over the number of cytosines sequenced. Methylation data were partitioned over different genomic compartments using BEDTools (97). Statistical comparisons were performed using R (https://www.R-project.org) to compute the non-parametric Mann-Whitney U test, with Bonferroni correction to adjust P values for multiple comparisons.
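As a worked illustration of the weighted methylation percentage, the sketch below pools read counts over all cytosine positions that pass the coverage filter. It assumes each position is summarized as a pair (reads supporting methylation, total reads covering the cytosine) and interprets "more than five-fold coverage" as strictly greater than 5 reads; the function name and example counts are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def weighted_methylation(
    sites: Iterable[Tuple[int, int]], min_coverage: int = 5
) -> Optional[float]:
    """Weighted methylation percentage over a set of cytosine positions.

    sites: (methylated_reads, total_reads) per cytosine position.
    Positions with total_reads <= min_coverage are ignored.
    Returns None when no position passes the coverage filter.
    """
    methylated = 0
    total = 0
    for meth_reads, all_reads in sites:
        if all_reads > min_coverage:
            methylated += meth_reads
            total += all_reads
    if total == 0:
        return None
    return 100.0 * methylated / total


# Hypothetical counts: the third site (coverage 4) is filtered out.
example_sites = [(2, 10), (0, 20), (3, 4)]
print(weighted_methylation(example_sites))  # 100 * 2 / 30
```

Pooling counts before dividing weights each position by its coverage, which differs from averaging per-site percentages when coverage is uneven.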

ChIP-qPCR
CENP-A enrichment and H3 depletion relative to H4 within the centromeres were evaluated using primers specific to centromeric and non-centromeric loci (control region taken 200 kb away from centromere 1 of each species) (see Table S5 at https://doi.org/10.5281/zenodo.10574230 for primers). Primers were designed using primer3 (98) from regions visualized using the IGV genome browser. Primers with potential matches against mammalian genomes were excluded using NCBI BLASTn. Dilutions of 1:10 were used based on preliminary test runs. Each reaction contained 5 µL of iTaq universal SYBR green supermix (Bio-Rad), 1.25 µL of each primer (500 nM), and 2.5 µL of DNA. All assays were run on a CFX96 thermocycler (Bio-Rad). Three technical replicates were taken for each assay, and the standard errors of the mean were calculated. The PCR program was as follows: initial denaturation for 2 minutes at 95°C, followed by 30 cycles of 95°C for 30 seconds, 60°C for 30 seconds, and 72°C for 30 seconds. Locus ΔCt values were normalized using the following formula: ΔΔCt = ΔCt(ChIP) − ΔCt(Input) (https://www.sigmaaldrich.com/US/en/technical-documents/protocol/genomics/qpcr/chip-qpcr-data-analysis), and the fold enrichment of the ChIP DNA over the input was computed as log2(2^ΔΔCt). The plots and statistical analyses were performed with GraphPad Prism 8.
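The normalization arithmetic can be sketched as follows. This assumes the formula reads ΔΔCt = ΔCt(ChIP) − ΔCt(Input) with enrichment reported as log2(2^ΔΔCt), as written in the text. Many ChIP-qPCR protocols place a minus sign in the exponent, so the sign convention here is an assumption to check against the raw data. The function names and Ct values are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def sem(values: Sequence[float]) -> float:
    """Standard error of the mean of technical replicates (needs n >= 2)."""
    return stdev(values) / math.sqrt(len(values))


def delta_delta_ct(dct_chip: float, dct_input: float) -> float:
    """ddCt = dCt(ChIP) - dCt(Input), following the formula as written."""
    return dct_chip - dct_input


def log2_fold_enrichment(ddct: float) -> float:
    """log2(2 ** ddCt); the sign of the exponent is taken from the text."""
    return math.log2(2.0 ** ddct)


# Hypothetical dCt values from three technical replicates per sample.
chip_replicates = [5.1, 4.9, 5.0]
input_replicates = [3.0, 3.1, 2.9]

ddct = delta_delta_ct(mean(chip_replicates), mean(input_replicates))
print(round(log2_fold_enrichment(ddct), 3), round(sem(chip_replicates), 3))
```

Because log2(2^x) equals x, the reported value is the ΔΔCt itself on a log2 scale; the helper is kept separate only to mirror the two steps named in the text.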

ChIP-LC-MS
Formalin-fixed purified chromatin preparations were separately co-immunoprecipitated with anti-CENP-A, anti-H4, and pre-immune serum (control) using the Pierce Direct Magnetic IP/Co-IP Kit (Thermo Fisher). Each IP reaction was performed separately on aliquots from the same chromatin preparation. Elution was performed using 1% formic acid. Extracts were frozen in dry ice, lyophilized for 1 hour, digested using trypsin, diluted, and injected into a Thermo Orbitrap Fusion liquid chromatography with tandem mass spectrometry (LC-MS/MS) system to identify unique peptides. LC-MS/MS data were searched against the proteomes using Proteome Discoverer 2.4 (Thermo Fisher Scientific, Waltham, MA). Label-free quantification was analyzed based on the peak intensity of the precursor ion.

FIG 2
Pneumocystis CENP-A supports viability and centromere loading in a heterologous system. (A) Protein sequence alignment of full-length CENP-A cnp-1 orthologs in Schizosaccharomyces pombe, Pneumocystis murina, and P. carinii. Each Pneumocystis species only infects a single mammalian host: P. murina (infecting mice) and P. carinii (rats). The ~25 amino acid insert in Pneumocystis CENP-A N-terminal regions is unique to this genus. (B) Schematic of the plasmid-shuffling assay used in C. The inviability of cnp1∆ cells is masked by the presence of cnp1+ on a ura4+-marked plasmid. After introducing the LEU2-marked rescue plasmid, cells that have lost the ura4+-marked plasmid are selected on medium containing FOA, leaving only those with the rescue plasmid. Note: S. cerevisiae LEU2 gene complements the S. pombe leu1-32 mutant. (C) Rescue experiments using the plasmid-shuffling assay described in B to test the ability of multicopy plasmids expressing GFP-tagged CENP-A transgenes to rescue cnp1∆ cells. Growth was assayed on the indicated media using 10-fold serial dilutions. (D) Rescue experiment of fission yeast cnp1-76 thermosensitive cells using the same plasmids as in C. Growth was assayed at the indicated temperatures using 10-fold serial dilutions. The transgene encoding S. pombe CENP-A serves as a non-temperature-sensitive control. (E) Representative images of cen2-tetO-TdTomato strains expressing the indicated GFP-tagged CENP-A transgenes in multicopy plasmids. GFP and TdTomato fluorescence as well as brightfield images are merged. Magnified views of boxed regions are shown at the bottom. (F) Integrated fluorescence intensity of GFP foci was measured and plotted for the strains used in panel E. The median (bar) and interquartile range (error bars) are shown. n > 409. A.U., arbitrary units. (G) Anti-GFP ChIP-qPCR analysis of indicated loci for strains expressing GFP-tagged CENP-A transgenes as single-copy integrations. %IP represents the percentage of input that was immunoprecipitated. Error bars denote the SD (n = 3). For each strain, comparison of %IP between loci was performed by one-way ANOVA and Holm-Sidak test for multiple comparisons (P < 0.001). Mean values marked with letters (a or b) indicate results that are significantly different from each other. cc1&3, S. pombe centromere central core 1 & 3; cc2:ura4+,

FIG 4
Pneumocystis displays 17 centromeres. (A) Scaled ideogram of 17 chromosomal-level scaffolds (gray) of the P. murina genome showing CENP-A binding regions (constricted areas). (B) CENP-A binding regions delineate putative CENs in the P. murina genome. Color-coded peaks represent enrichment of immunoprecipitated DNA (IP DNA) relative to controls (Input DNA) in P. murina organisms (input subtracted). Only ChIP-Seq data from Pneumocystis cocultured with A549 and LET1 cells for 7 days are presented. Data for the full experiment covering days 0, 7, and 14 are presented in Supplementary material. The 20-kb windows of the enrichment peak are presented. Each scaffold displays two peaks labeled M (Major) and m (minor) according to the enrichment level. (C) Scaled ideogram of 17 chromosomal-level scaffolds (gray) of the P. carinii genome showing CENP-A binding regions (constricted areas). (D) CENP-A binding regions delineate putative CENs in the P. carinii genome. Color-coded peaks represent enrichment of IP DNA relative to controls (Input DNA) in P. carinii organisms cocultured with A549

FIG 5
Centromeres are flanked by heterochromatin and contain active genes. (A) Genomic view of chromosome 2 of the P. murina genome showing, in order, annotated genes (directed gray boxes), DNA repeats (not present), percent GC content (blue), ChIP-Seq read coverage distribution [bins per million mapped reads (BPM) normalized over bins of 50 bp; input subtracted] of CENP-A, histone H3 and H4 ratio, heterochromatin-associated modifications (H3K9me2 and H3K9me3), the euchromatin-associated modification (H3K4me2), and gene expression (RNA-seq) in relation to centromeres. (B) Genomic view of chromosome 2 of the P. carinii genome with the same features as presented for P. murina. A duplicated copy of a copia retrotransposon is shown (red arrows), which is present in syntenic regions in other Pneumocystis genomes. The two presented chromosomes are syntenic.

FIG 6
Centromere locations, not sequences, are evolutionarily constrained by positive selection. (A) Genome organization and synteny in Pneumocystis. Circos plots depicting pairwise P. carinii and P. murina genome synteny. Rodent-infecting Pneumocystis (P. carinii and P. murina) have 17 chromosomes. Colored connectors indicate regions of synteny between species. Centromeres that overlap with recent chromosomal breakpoints are indicated (*). The square highlights P. carinii centromere 4 displayed in panel B. (B) Genome view of centromere 4 in the P. carinii genome. Genes are represented by directed boxes (gray for

FIG 7
Model of regional centromere structures in Pneumocystis and other representative fungi. On the left is the phylogeny of selected fungi inferred from maximum likelihood phylogenetic analysis of shared core protein orthologs. The overall centromere structures for Pneumocystis carinii (details supporting our model are provided in the text), Schizosaccharomyces pombe (46), Candida albicans (7), Zymoseptoria tritici (44), and Cryptococcus neoformans (9) are used as

TABLE 1
Coordinates, length, and GC content (in %) of Pneumocystis centromeres identified by direct ChIP