A. We plot, as a function of the number of supporting reads, the fraction of junctions 1) matching GT-AG, the splice site consensus sequences (black), 2) matching a control pair of dinucleotides (grey), 3) annotated in EST databases (light blue), or 4) annotated in gene databases (dark blue). B. We split all junctions into those that are annotated in gene model databases and those that are not. Plotted is the cumulative number of junctions of each type by expression level. Unannotated junctions are expressed at much lower levels than annotated junctions. C and D. Alternative splice junctions near known protein-coding junctions show a periodic pattern. At each alternatively spliced protein-coding 3′ or 5′ splice site, we counted the positions of AG (or GT, respectively) dinucleotides used as alternative splice sites, then averaged these counts across splice sites (see ). The red points denote positions that are a multiple of three base pairs from the major splice form, and the black points denote those that are not. The blue box below each panel shows the position of the exon.
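The counting described for panels C and D can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function name `mean_offset_usage`, the input layout (one major splice-site coordinate paired with the coordinates of its observed alternative sites), and the choice to ignore strand orientation are all assumptions made for the example.

```python
from collections import Counter


def mean_offset_usage(sites):
    """Average usage of each alternative-site offset across splice sites.

    `sites` is a list of (major_position, [alternative_positions]) pairs
    (hypothetical input layout). Returns a dict mapping each offset in base
    pairs to (fraction of sites using that offset, whether the offset is a
    multiple of three).
    """
    totals = Counter()
    for major, alternatives in sites:
        # Record each distinct offset at most once per splice site.
        totals.update({alt - major for alt in alternatives if alt != major})
    n_sites = len(sites)
    return {
        offset: (count / n_sites, offset % 3 == 0)
        for offset, count in sorted(totals.items())
    }


example_sites = [(100, [103, 97]), (200, [203, 201])]
usage = mean_offset_usage(example_sites)
```

In this toy input, offsets of +3 and -3 are flagged as frame-preserving (the red points in the panels), while +1 is not (the black points).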