The structure and organization of a proline-rich protein gene of a mouse multigene family

J Biol Chem. 1985 Dec 15;260(29):15863-72.

Abstract

One gene of the mouse proline-rich protein multigene family was cloned on a 3.6-kilobase pair EcoRI/BglII DNA fragment from a (partial) Sau3A bacteriophage library of CD-1 mouse chromosomal DNA. Phage harboring the gene were identified by plaque hybridization using 32P-labeled proline-rich protein cDNA inserts from clones pRP33 and pMP1, obtained from rat and mouse, respectively. The transcriptional unit includes three exonic sequences separated by 1434 base pairs (intron I) and 450 base pairs (intron II). The complete primary structure of the gene and the 5' and 3' flanking regions (3595 base pairs) was determined by the Maxam and Gilbert (Maxam, A.M., and Gilbert, W. (1980) Methods Enzymol. 65, 499-560) sequencing method. The DNA on the 5' side of exon I contains several sequences that may be involved in the induction and expression of this mouse gene. These sequences include putative regulatory sites such as those considered to be inducible by cAMP and steroids, Z-DNA and enhancer sequences, and the expected TATAA and CAAT boxes. The mature protein coding region, exon II, is not interrupted by intron sequences. Exon III is located in the nontranslated region and contains the poly(A) addition site. The deduced amino acid sequence showed that the protein encoded by this gene contains 13 tandemly repeated regions, each 14 amino acids in length, with the prototype sequence PPPPGGPQPRPPQG. Each amino acid within the repeat has a favored codon. The consensus DNA sequence for each repeat is CCA CCA CCA CCA GGA GGC CCA CAG CCG AGA CCC CCT CAA GGC. The high degree of conservation of both nucleotide and amino acid sequences within the repeat region suggests that proline-rich protein genes likely evolved by gene duplication of a 42-base pair internal repeat.
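
The relationship between the consensus DNA repeat and the prototype peptide quoted in the abstract can be checked with a short translation. The following is a minimal sketch, not part of the original record: it assumes the standard genetic code, includes only the codons that occur in the consensus, and uses illustrative names (CODON_TO_AA, translate) that do not come from the paper.

```python
# Illustrative check: translate the consensus repeat from the abstract and
# compare it with the stated prototype peptide.
# Assumption: standard genetic code; table limited to codons in the consensus.
CODON_TO_AA = {
    "CCA": "P", "CCC": "P", "CCG": "P", "CCT": "P",  # proline
    "GGA": "G", "GGC": "G",                          # glycine
    "CAA": "Q", "CAG": "Q",                          # glutamine
    "AGA": "R",                                      # arginine
}

# Both strings are copied from the abstract.
CONSENSUS_DNA = "CCA CCA CCA CCA GGA GGC CCA CAG CCG AGA CCC CCT CAA GGC"
PROTOTYPE_PEPTIDE = "PPPPGGPQPRPPQG"


def translate(spaced_codons: str) -> str:
    """Translate a whitespace-separated codon string into one-letter amino acids."""
    return "".join(CODON_TO_AA[codon] for codon in spaced_codons.split())


codons = CONSENSUS_DNA.split()
peptide = translate(CONSENSUS_DNA)
repeat_bp = 3 * len(codons)  # length of one repeat unit in base pairs

print(peptide)                       # translated consensus repeat
print(peptide == PROTOTYPE_PEPTIDE)  # does it match the stated prototype?
print(len(peptide), repeat_bp)       # residues and base pairs per repeat
```

Under these assumptions the 14 codons give a 14-residue peptide and a 42-base pair unit, in agreement with the repeat length stated in the abstract. The individual repeats in the gene may deviate from the consensus, so this check applies only to the consensus sequence itself.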

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Base Sequence
  • DNA / analysis*
  • DNA Restriction Enzymes / metabolism
  • Deoxyribonuclease BamHI
  • Deoxyribonuclease EcoRI
  • Deoxyribonuclease HindIII
  • Mice
  • Molecular Weight
  • Nucleic Acid Hybridization
  • Peptides / genetics*
  • Proline-Rich Protein Domains
  • Rats

Substances

  • Peptides
  • DNA
  • DNA Restriction Enzymes
  • Deoxyribonuclease BamHI
  • Deoxyribonuclease EcoRI
  • Deoxyribonuclease HindIII

Associated data

  • GENBANK/M12099
  • GENBANK/M12100