Caenorhabditis elegans gene col-33, encoding COLlagen structural gene.
SUMMARY
Cuticle and basement membrane collagens are extracellular matrix components encoded by a family of about 160 expressed genes, to which this gene belongs. Collagens have short interrupted blocks of Gly-X-Y sequence flanked by conserved cysteine residues, akin to vertebrate fibril-associated collagens with interrupted triple helix, and form trimers or higher order polymers. They have been grouped into five main subfamilies.
The Caenorhabditis elegans cuticle is a complex multilayered extracellular matrix, consisting predominantly of cuticle collagens and synthesised by the underlying epidermal cell layer (called hypodermis). It is secreted five times during development, in embryos and before each molt, and differs slightly from stage to stage. During cuticle synthesis, the genes are expressed in a distinct temporal series, and the temporal groups contribute to distinct, discrete substructures of the extracellular matrix (McMahon et al., 2003).
For a small number of collagen genes that have no distinctive sequence feature but are clearly critical to assembly or function of the extracellular matrix, such as dpy-2, dpy-3, dpy-5, dpy-7, dpy-8, dpy-10 or dpy-13, sqt-3, lon-3, bli-1, bli-2 or ram-4, loss of function causes a change in body shape (dumpy, squat, long, blistered), or leads to animals that roll when moving (helically twisted), or to male ray morphology defects. Some collagens that participate in the inner basement membranes, such as let-2, emb-9, cle-1, mec-5 or unc-122, are essential for viability, or play critical roles in synaptogenesis or synaptic transmission, muscle attachment, cell migration and process guidance. Most other collagens probably have a redundant role, since loss of their function gives an apparently wild-type phenotype, and alleles with visible effects in these genes are gain-of-function mutations.
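The Gly-X-Y block organisation mentioned in the summary can be illustrated with a short Python sketch. The peptide below is a made-up toy string, not the COL-33 protein, and the function name `gly_x_y_blocks` and the `min_repeats` threshold are assumptions chosen only for this example.

```python
import re

# Hypothetical toy peptide (NOT the COL-33 sequence): two Gly-X-Y blocks
# flanked by cysteines and separated by a short non-repeat interruption.
toy = "MCC" + "GPPGAPGEK" + "CAS" + "GQPGPAGPK" + "CC"

def gly_x_y_blocks(seq, min_repeats=2):
    """Return (start, end, n_repeats) for each run of consecutive G-X-Y triplets.

    start/end are 0-based, end-exclusive positions in seq. A run is reported
    only if it contains at least min_repeats triplets.
    """
    pattern = re.compile(r"(?:G[A-Z]{2}){%d,}" % min_repeats)
    return [
        (m.start(), m.end(), (m.end() - m.start()) // 3)
        for m in pattern.finditer(seq)
    ]

blocks = gly_x_y_blocks(toy)
for start, end, n in blocks:
    print(f"{start}-{end}: {toy[start:end]} ({n} triplets)")
```

On the toy peptide this reports two blocks of three triplets each, interrupted by the three residues between them. A real analysis would use the protein sequence from the database and probably a domain model such as the Pfam entries cited on this page.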
[Main specialists: Jim Kramer and Iain Johnstone]
[Wormbase] col-33 encodes a predicted cuticular collagen; by homology, col-33 is predicted to play a role in cuticle biosynthesis and regulation of body size and morphogenesis; however, as loss of col-33 activity via large-scale RNAi screens does not result in any obvious abnormalities, the precise role of COL-33 in C. elegans development and/or behavior is not yet known.
Wormbase predicts one model, but Caenorhabditis elegans cDNA sequences in GenBank, dbEST, Trace and SRA, filtered against clone rearrangements, coaligned on the genome and clustered in a minimal non-redundant way by the manually supervised AceView program, support at least 8 spliced variants.

AceView synopsis
Expression: According to AceView, this gene is expressed at a high level, 2.4 times that of the average gene in this release. See the in situ hybridization pattern in Kohara NextDB. The sequence of this gene is defined by 5 cDNA clones and 26 elements defined by RNA-seq, some from mixed (seen 2 times).
Alternative mRNA variants and regulation: The gene contains 11 distinct introns (10 gt-ag, 1 gc-ag). Transcription produces at least 9 different mRNAs, 8 alternatively spliced variants and 1 unspliced form. The mRNAs appear to differ by presence or absence of 8 cassette exons, overlapping exons with different boundaries, splicing versus retention of 2 introns. 532 bp of this gene are antisense to spliced gene anticol-33artefact, raising the possibility of regulated alternate expression.
Function: There are 3 articles specifically referring to this gene in PubMed. In addition we point below to 2 abstracts. Proteins are expected to have molecular function (structural constituent of cuticle). No phenotype has yet been reported to our knowledge: this gene's in vivo function is not yet known.
Protein coding potential: 2 spliced mRNAs putatively encode good proteins, altogether 2 different isoforms (1 complete, 1 partial), some containing the domains collagen triple helix repeat and nematode cuticle collagen N-terminal [Pfam]. The remaining 7 mRNA variants (6 spliced, 1 unspliced; 7 partial) appear not to encode good proteins.

Please quote: AceView: a comprehensive cDNA-supported gene and transcripts annotation, Genome Biology 2006, 7(Suppl 1):S12.
Map on chromosome IV, links to other databases and other names
Map: This gene col-33 maps on chromosome IV at position +0.08 (interpolated). In AceView, it covers 1.93 kb, from 4263291 to 4265217 (WS190), on the direct strand.
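As a quick consistency check on the map entry, the quoted span can be recomputed from the two coordinates. This is a minimal sketch that assumes the coordinates are 1-based and inclusive, which the page does not state explicitly; the helper name `span_bp` is introduced here for illustration.

```python
# Coordinates quoted in the Map entry (WS190). Treating both endpoints as
# included in the gene is an assumption, not something the page states.
START, END = 4263291, 4265217

def span_bp(start: int, end: int) -> int:
    """Length in bp of an interval whose two endpoints are both included."""
    return end - start + 1

length_bp = span_bp(START, END)
length_kb = round(length_bp / 1000, 2)
print(length_bp, "bp =", length_kb, "kb")
```

Under that assumption the span comes to 1927 bp, which rounds to the 1.93 kb quoted above and matches the 1927 bp listed for variant a in the Sequences table.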
Links to: WormBase, NextDB, RNAiDB.
Other names: The gene is also known in Wormgenes/AceView by its positional name 4F29, in Wormbase by its cosmid.number name F36A4.6, in NextDB, the Nematode expression pattern database, as CEYK5954.
Complete gene on genome diagram
[Diagram: the gene on the genome, the mRNAs and the cDNA clones, drawn to true scale.]
Compact gene diagram
Gene col-33, encoded on the plus strand of chromosome IV from 4,263,335 to 4,265,217.
[Diagram: mRNA variants a to i, drawn 5' to 3'; scale bar 500 bp. The first variant drawn has exons of 70, 63, 81 and 793 bp; the 793 bp exon is supported by 22 accessions, some from mixed (seen 2 times), and carries the validated 3' end (57 accessions).]
Introns shown in the diagram, with their supporting GenBank accessions:
  551 bp [gt-ag], 2 accessions
  317 bp [gt-ag], 2 accessions
  52 bp [gt-ag], 2 accessions
  234 bp [gt-ag], 1 accession
  1293 bp [gt-ag], 1 accession
  326 bp [gt-ag], fuzzy, 2 accessions
  339 bp [gc-ag], 1 accession
  1722 bp [gt-ag], 1 accession
  1527 bp [gt-ag], 1 accession
  1455 bp [gt-ag], 1 accession
  1704 bp [gt-ag], 1 accession
Alternative mRNAs are shown aligned from 5' to 3' on a virtual genome where introns have been shrunk to a minimal length. Exon size is proportional to length, intron height reflects the number of cDNAs supporting each intron, and the small numbers show the support of the introns in deep sequencing. Introns of the same color are identical; introns of different colors are different. 'Good proteins' are pink, partial or not-good proteins are yellow, uORFs are green. 5' cap or 3' poly A flags show completeness of the transcript.
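The intron lengths and splice-site types listed for the compact gene diagram can be tallied to check them against the counts given in the synopsis (11 distinct introns, 10 gt-ag and 1 gc-ag). The sketch below simply transcribes the values from the diagram; the variable names are illustrative.

```python
from collections import Counter

# (length in bp, splice-site type) for each intron, transcribed from the
# compact gene diagram in the order in which they appear there.
introns = [
    (551, "gt-ag"), (317, "gt-ag"), (52, "gt-ag"), (234, "gt-ag"),
    (1293, "gt-ag"), (326, "gt-ag"), (339, "gc-ag"), (1722, "gt-ag"),
    (1527, "gt-ag"), (1455, "gt-ag"), (1704, "gt-ag"),
]

# Count introns per splice-site type.
by_type = Counter(site for _, site in introns)

print("distinct introns:", len(introns))
for site, n in sorted(by_type.items()):
    print(site, n)
```

The tally agrees with the synopsis, which suggests that the diagram lists each distinct intron exactly once.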
Sequences
mRNA variant | mRNA matching the genome | Best predicted protein | 5' UTR | 3' UTR | Upstream sequence | Transcription | Downstream sequence
a | 1007 bp | 304 aa | 43 bp | 49 bp | 2 kb probably including promoter | 1927 bp | 1 kb
b | 212 bp | 70 aa | | | 2 kb probably including promoter | 446 bp | 1 kb
c | 204 bp | 59 aa | 27 bp | | 2 kb probably including promoter | 1497 bp | 1 kb
d | 183 bp | 61 aa | 42 bp | | 2 kb probably including promoter | 509 bp | 1 kb
e | 170 bp | 56 aa | | | 2 kb probably including promoter | 509 bp | 1 kb
f | 197 bp | 51 aa | 43 bp | | 2 kb probably including promoter | 1919 bp | 1 kb
g | 102 bp | 25 aa | 27 bp | | 2 kb probably including promoter | 1629 bp | 1 kb
h | 82 bp | 18 aa | 27 bp | | 2 kb probably including promoter | 1537 bp | 1 kb
i | 67 bp | 13 aa | 27 bp | | 2 kb probably including promoter | 1771 bp | 1 kb
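For the complete variant a, the figures in the Sequences table can be checked against each other: the mRNA length should equal the 5' UTR plus the coding region plus the 3' UTR. The sketch below assumes that the protein length excludes the stop codon and that the stop codon is not counted in the 3' UTR; the function `mrna_length` is a hypothetical helper, not part of AceView.

```python
def mrna_length(utr5_bp: int, protein_aa: int, utr3_bp: int,
                has_stop: bool = True) -> int:
    """Expected mRNA length in bp from UTR lengths and protein length.

    Assumes 3 bp per amino acid and, when has_stop is True, one extra
    3 bp stop codon that is counted in neither the protein nor the 3' UTR.
    """
    stop_bp = 3 if has_stop else 0
    return utr5_bp + 3 * protein_aa + stop_bp + utr3_bp

# Variant a: 43 bp 5' UTR, 304 aa protein, 49 bp 3' UTR (from the table).
variant_a_bp = mrna_length(43, 304, 49)
print(variant_a_bp)
```

With these assumptions the expected length is 1007 bp, the value given for variant a. The partial variants have no stop codon or 3' UTR listed, so the same check does not apply to them directly.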

Gene neighbors and Navigator on chromosome IV
[Diagram: genes neighboring col-33 on chromosome IV; scale bar 5 kb.]
Genes shown:
  ttr-20, 13 accessions
  ama-1, 54 accessions, 5 variants
  col-33, 31 accessions, 9 variants
  4F31, 6 accessions, 2 variants
  4F33, 16 accessions
  4F35, 24 accessions, 2 variants
  4F37, 6 accessions, 2 variants
  4F39, 20 accessions
  mir-242, 1 accession
  4F3, 0 accession
  4F43, 19 accessions, 2 variants
  4F45, 23 accessions
  4F2, 2 accessions
  4F4, 1 accession
  4F10, 0 accession
  4F6, 3 accessions
  col-34, 113 accessions, 4 variants
  4F44, 41 accessions, 9 variants
  nhr-78, 0 accession
  srv-4, 8 accessions, 2 variants
Flags used in the diagram: D: disease, C: conserved, I: interactions, R: regulation, P: publications.
Bibliography
Please see these 3 articles in PubMed.
In addition, we found 2 papers for which we do not have a PubMed identifier.

