Vertebrate Nipa1 and Nipa2 gene structures. a, Schematic genomic structure of NIPA1 in human, mouse, and Fugu. Coding regions (shaded rectangles) and untranslated (5′ UTR and 3′ UTR) regions (open rectangles) are shown. The horizontal black bars are CpG islands, the arrows below the exons (e) are primers used for RT-PCR, and the vertical dotted lines are functional polyA signals (predicted in Fugu). Alternative polyadenylation generates two different 3′ ends for human and mouse Nipa1 (with the distance between the polyA signals shown). b, Highly conserved 3′ ends of the mammalian NIPA1 7.5-kb mRNA sequences from five eutherian species, aligned by CLUSTAL W. Black nucleotides on a gray background agree with the consensus, and polyA signals are shown as white letters on a black background. GenBank accession numbers for the NIPA1 3′ ESTs are as follows: human, BF439642; pig, BI339387; cow, BE685351; mouse, BE946294; and rat, AW533027. c, Genomic structure of orthologous human, mouse, and Fugu NIPA2 genes, as well as conserved intron placement in ancestral genes from Drosophila and Anopheles. Symbols are as for panel a. The shaded box in exon 1 represents uORF1; putative 5′ noncoding exons in Fugu (dotted line) have not been identified. d, Conserved ORFs in the 5′ UTR (exons 1–3) of the vertebrate NIPA2 transcripts. White letters on a black background represent sequences conserved in all species shown, black letters on a gray background have one mismatch, and the initiation codons of NIPA2 and uORF1 are shown as bold white letters on a black background. Also shown are exon (ex) positions for the mouse (mu) gene and the stop codons (TGA or TAA) for uORF1. GenBank accession numbers for the NIPA2 5′ ESTs are as for figure 4b, and the GenBank accession number for dog is BM538411. Alignments were generated with CLUSTAL W. e, Amino acid sequence of the putative exon 1 uORF1 from human, mouse, cow, dog, chicken (uORF2), and Xenopus NIPA2 mRNAs. White letters on a black background represent residues conserved by comparison with the mammalian consensus, black letters on a gray background represent residues conserved in fewer species, and italics represent chemically similar amino acids. GenBank accession numbers are as for figure 4b, and the GenBank accession number for chicken is AJ452290.