FEMS Immunol Med Microbiol. 1998 Sep;22(1-2):15-26.

The genome of Pneumocystis carinii.

Author information

Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati College of Medicine, OH 45267-0560, USA.


The best understood special form of P. carinii, P. carinii formae specialis (f.sp.) carinii, appears to be haploid and contains about 8 million base pairs of DNA (8.5 fg) per nucleus. The genome of P. carinii f.sp. carinii is divided into 13-15 linear chromosomes that range from 300 to 700 kb in size. Eight different P. carinii f.sp. carinii karyotypes have been observed. The karyotypes of P. carinii f.sp. carinii differ due to slight variations in the lengths of chromosomes, but the 8 karyotype forms of P. carinii f.sp. carinii exhibit very little variation in DNA sequence. By contrast, the genome of P. carinii f.sp. carinii differs markedly in sequence from the genomes of P. carinii from other hosts, such as mouse, ferret and human. In addition, chromosomes and DNA sequences from P. carinii from mouse, ferret, and human also differ greatly from each other. The genome of a ferret P. carinii appears to be up to 1.7 times larger than those of P. carinii from other hosts. Nearly two dozen P. carinii genes have been cloned and sequenced. The typical P. carinii gene sequence is 60-65% A+T. P. carinii genes usually contain introns, which are typically less than 50 bp in length, but can be as numerous as 9 per gene. A system for naming P. carinii genes is proposed in which each gene would be designated by an italic three-letter lowercase symbol. The first allele (i.e. sequence) that is found would have a superscript 1, such as xyz1(1). Any subsequent alleles would be designated as xyz1(2), etc. A protein would have the same symbol as the gene that produced it, but written in roman print with the first letter uppercase, such as Msg1. Some of the P. carinii genome consists of DNA sequences that are present dozens of times. Three families of such repeated DNA sequences have been described. Two of these families (MSG and PRT) encode proteins. The third family is the telomere repeat, which is found at the ends of each chromosome, and sometimes at internal chromosomal sites, in which case it has been called the alpha repeat. Determination of the complete sequence of the P. carinii genome is both practicable and of primary importance to the understanding of this organism. The small size of the P. carinii genome and its packaging into chromosomes that are resolvable by PFGE will facilitate sequence analysis.
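
The naming system proposed in the abstract can be illustrated with a minimal Python sketch. The function names are hypothetical, the parenthesised number stands in for the superscript as in the plain-text abstract, and the gene symbol msg1 is an assumption inferred from the protein example Msg1; italic versus roman type cannot be shown in plain strings.

```python
def allele_symbol(gene: str, allele: int) -> str:
    """Return a plain-text allele designation such as 'xyz1(2)'.

    The number in parentheses represents the superscript allele number;
    the first allele found is numbered 1.
    """
    if allele < 1:
        raise ValueError("allele numbers start at 1")
    return f"{gene}({allele})"


def protein_symbol(gene: str) -> str:
    """Return the protein symbol: the gene symbol with its first letter uppercase."""
    if not gene:
        raise ValueError("gene symbol must not be empty")
    return gene[0].upper() + gene[1:]


# Hypothetical usage with the symbols mentioned in the abstract.
first_allele = allele_symbol("xyz1", 1)
second_allele = allele_symbol("xyz1", 2)
protein = protein_symbol("msg1")  # assumed gene symbol behind Msg1
```

Keeping the allele number separate from the gene symbol means a newly found sequence only needs the next integer, which matches the abstract's "first allele, then subsequent alleles" ordering.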

[Indexed for MEDLINE]