Venn analysis as part of a bioinformatic approach to prioritize expressed sequence tags from cardiac libraries

Clin Biochem. 2004 Nov;37(11):953-60. doi: 10.1016/j.clinbiochem.2004.07.010.

Abstract

Objectives: We needed to sort expressed sequence tags (ESTs) from human cardiac expression libraries.

Design and methods: We annotated DNA sequence text files of 35,152 cardiac ESTs using our search and annotation tool called Multiblast.pl. We generated lists of the most prevalent ESTs in each library, and using a novel Venn tool, we grouped ESTs that were common to all or exclusive to particular libraries.
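The grouping step can be sketched in a few lines. This is a minimal illustration, not the authors' Venn tool, whose implementation the abstract does not describe. It assumes each library has already been reduced to a set of annotated gene identifiers. The function name `venn_regions`, the library names, and the `geneA`...`geneE` labels are hypothetical placeholders.

```python
from collections import defaultdict


def venn_regions(libraries):
    """Partition gene identifiers by the exact set of libraries containing them.

    `libraries` maps a library name to a set of gene identifiers.
    Returns a dict mapping frozenset(library names) -> set of identifiers
    found in exactly those libraries and no others.
    """
    # Record, for every identifier, which libraries it appears in.
    membership = defaultdict(set)
    for name, genes in libraries.items():
        for gene in genes:
            membership[gene].add(name)

    # Group identifiers that share the same membership pattern.
    regions = defaultdict(set)
    for gene, names in membership.items():
        regions[frozenset(names)].add(gene)
    return dict(regions)


# Hypothetical placeholder data, not taken from the study.
example_libraries = {
    "adult": {"geneA", "geneB", "geneC"},
    "fetal": {"geneA", "geneC", "geneD"},
    "hypertrophic": {"geneA", "geneE"},
}

example_regions = venn_regions(example_libraries)
common_to_all = example_regions.get(frozenset(example_libraries), set())
adult_only = example_regions.get(frozenset({"adult"}), set())
```

Keying each region by a frozenset of library names means the same lookup serves both kinds of query: identifiers common to every library and identifiers exclusive to one.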

Results: Hypothetical protein KIAA0553 was represented 120 times among 917 ESTs from an adult cardiac library (13.1%) but only once among 8,075 ESTs from fetal cardiac libraries (P < 10^-114); this difference was confirmed by Northern analysis. We collated biochemical features of KIAA0553 and determined DNA polymorphism frequencies. We also used the Venn tool to identify genes that were uniquely expressed in hypertrophic cardiomyocytes.
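The abstract does not say which statistical test produced the reported P value. As an illustration only, the sketch below applies a one-sided Fisher exact test (the hypergeometric upper tail) to the counts reported above. The choice of test and the function name `hypergeom_upper_tail` are assumptions, so the result need not match the published figure exactly.

```python
from fractions import Fraction
from math import comb


def hypergeom_upper_tail(hits_a, total_a, hits_b, total_b):
    """One-sided Fisher exact P value for enrichment in library A.

    Returns P(X >= hits_a), where X is the number of hits falling in
    library A when all hits are distributed at random across both libraries.
    Exact integer arithmetic avoids floating-point underflow.
    """
    n_all = total_a + total_b   # all ESTs in both libraries
    k_all = hits_a + hits_b     # all ESTs matching the gene
    denom = comb(n_all, total_a)
    numer = sum(
        comb(k_all, k) * comb(n_all - k_all, total_a - k)
        for k in range(hits_a, min(k_all, total_a) + 1)
    )
    return Fraction(numer, denom)


# Counts from the abstract: 120 of 917 adult ESTs, 1 of 8075 fetal ESTs.
adult_fraction_pct = round(100 * 120 / 917, 1)
p_value = float(hypergeom_upper_tail(120, 917, 1, 8075))
```

Under this assumed test the P value is far smaller than any conventional significance threshold, which agrees in direction with the reported result.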

Conclusions: Annotating ESTs and sorting them using Venn analysis can help specify new candidate disease genes from the current lists of "hypothetical proteins".

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Cardiovascular Diseases / genetics
  • Computational Biology*
  • Expressed Sequence Tags / metabolism*
  • Fetal Heart / metabolism
  • Gene Library*
  • Genome, Human
  • Genomics*
  • Humans
  • Mice
  • Molecular Sequence Data
  • Muscle Proteins / genetics*
  • Muscle Proteins / metabolism
  • Myocardium / metabolism*
  • Polymorphism, Single Nucleotide
  • Rats
  • Sequence Alignment

Substances

  • GPATCH8 protein, human
  • Muscle Proteins