1. Bioinformatics Quick Start
2. Making Sense of DNA and Protein Sequences
3. Unmasking Genes in Human DNA
4. Identification of Disease Genes
5. Correlating Disease Genes and Phenotypes
6. BLAST Quick Start
7. EntrezGene Quick Start
8. Structure Analysis Quick Start
9. MapViewer Quick Start
10. GenBank Quick Start
11. Entrez Quick Start
12. Microbial Genomes Quick Start
Suggested Biology Reading
Preparatory Mini-course Reading
NCBI bioinformatics mini-courses are either problem-based, such as "Identification of Disease Genes", or
NCBI resource-based, such as "BLAST Quick Start". Each course is 2.5 hours long: the first hour and a half is devoted to an overview,
followed by a one-hour hands-on session. Contact Medha Bhagwat if
you have any questions or comments about the mini-courses.
The course provides an introduction to aspects of bioinformatics such as accessing, analyzing, and interpreting
biological data using NCBI databases and tools. An analysis of the animal photoreceptor family is used to illustrate
practical bioinformatics approaches to the study of sequence similarity, phylogenetic analysis, gene expression, homology, polymorphisms, 3-D structure and function.
This course will be useful for non-biologists as well as biologists.
In this mini-course, we will find a gene within a eukaryotic DNA sequence. We will then predict the function of the
implied protein product by seeking sequence similarities to
proteins of documented function using BLAST and other tools. Finally, we will
find a 3D modeling template for this protein sequence using a Conserved Domain Database Search.
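As a rough illustration of the first step, finding a candidate coding region, the sketch below scans the forward strand of a DNA string for open reading frames (an ATG followed by an in-frame stop codon). It is a simplification and not the tool used in the course: eukaryotic genes usually contain introns, which a plain ORF scan ignores. The function name `find_orfs`, the `min_codons` parameter, and the sample sequence are made up for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    Coordinates are 0-based and end-exclusive, and include the stop codon.
    Hypothetical helper for illustration only; introns are ignored.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i + 3 - start) // 3  # counts the stop codon too
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Invented sequence: one ORF, ATG AAA TTT GGG TAA, starting at position 2.
example = "CCATGAAATTTGGGTAACC"
orfs = find_orfs(example)
```

In a real analysis, the translated ORF would then be compared against proteins of documented function, as the course description outlines.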
This mini-course describes how to combine the output of multiple prediction programs to find genes, promoters and other transcription-factor binding sites in human DNA sequences. To illustrate the method, an instructional program called Greengene will be used to integrate the output of several gene-finding tools. Greengene also allows a coding sequence and accompanying protein translation to be assembled from the exons detected by these programs. Because the output of several programs is integrated, exon selection is more reliable.
Developed by Medha Bhagwat and David Wheeler.
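One simple way to picture why integrating several programs can make exon selection more reliable is a majority vote over predicted intervals. The sketch below is not Greengene's algorithm, which this description does not specify. The function name `consensus_regions`, the `min_votes` threshold, the interval format, and the coordinates are all invented for illustration, and each program's own intervals are assumed not to overlap one another.

```python
from itertools import groupby


def consensus_regions(predictions, min_votes=2):
    """Return (start, end) regions supported by at least min_votes programs.

    predictions: one list of (start, end) exon intervals per program,
    0-based and end-exclusive (a hypothetical format for this sketch).
    """
    # Sweep line: +1 where an interval opens, -1 where it closes.
    events = []
    for program in predictions:
        for start, end in program:
            events.append((start, 1))
            events.append((end, -1))
    events.sort()

    regions = []
    votes = 0
    region_start = None
    # Apply all events at the same position together, so that one interval
    # ending exactly where another begins does not split a region.
    for pos, group in groupby(events, key=lambda event: event[0]):
        votes += sum(delta for _, delta in group)
        if votes >= min_votes and region_start is None:
            region_start = pos
        elif votes < min_votes and region_start is not None:
            regions.append((region_start, pos))
            region_start = None
    return regions


# Invented exon predictions from three hypothetical gene finders.
program_a = [(100, 200), (300, 400)]
program_b = [(120, 210), (500, 600)]
program_c = [(110, 190), (310, 390)]
supported = consensus_regions([program_a, program_b, program_c])
```

Here the exon reported only by the second program (500 to 600) is dropped, while regions predicted by at least two programs are kept.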
This mini-course deals with the identification of a disease gene using NCBI's human genome assembly. The reference genome assembly, along with integrated maps, literature, and expression information, comprises a powerful discovery system for exploring candidate human disease genes. We will start with EST sequences obtained from a patient, identify the gene(s) expressing them, download their sequences, determine the exon-intron structure, and identify any known SNPs in the ESTs that may contribute to the disease phenotype.
Developed by Medha Bhagwat.
We will learn to determine what is known about a disease and the gene associated with it. We will then elucidate the biochemical and structural basis for the phenotype caused by the mutant protein.
Developed by Medha Bhagwat.
A practical introduction to the BLAST family of sequence-similarity search programs. Exercises range from simple searches to creative uses of the BLAST programs to perform specialized searches.
Developed by Medha Bhagwat and David Wheeler.
NCBI's Entrez Gene provides gene-based information such as chromosome location, sequence, expression, structure, function, and homology data. Each record represents a single gene from an organism. Entrez Gene includes organisms for which there is a RefSeq genome record.
In this course, we will learn how to obtain information about a human gene, such as its mRNA and genomic sequence, gene structure (exon-intron locations), function, and phenotypes associated with mutations. We will also learn how to determine whether the SNPs in the coding region of a gene are known to alter the function of the protein product.
Entrez Gene is the successor to LocusLink. The mini-course will cover the use of Entrez Gene to obtain the same information as was found in LocusLink. The course will also cover the advantages of Entrez Gene, such as efficient searching options and availability of gene-specific information for all completely sequenced genomes, including bacteria and viruses.
Developed by Medha Bhagwat.
This course covers how to visualize and annotate 3D protein structures using NCBI's Cn3D, identify conserved domain(s) present in a protein, search for other proteins containing similar domain(s), explore a 3D modeling template for the query protein, and find distant sequence homologs that may not be identified by BLAST.
Developed by Medha Bhagwat.
NCBI's Map Viewer can be used to visualize an organism's genome. The organisms represented in
the Map Viewer include human, mouse, rat, zebrafish, mosquito,
nematode, fruit fly, yeast, Arabidopsis and others.
GenBank is a repository of nucleotide sequences from about 160,000 organisms.
This course begins with a survey of different types of entries. Using a typical GenBank entry as a model, students will learn to understand the features annotated on it.
The course will also cover how to submit sequences to GenBank and include an overview of the processing
of the entries. Finally, students will learn how to efficiently search GenBank and download sequences.
Entrez is the integrated, text-based search and retrieval system used at NCBI for the major databases,
including PubMed, Nucleotide and Protein Sequences, Expression, PubChem (biological activities of small molecules),
Protein Structures, Complete Genomes, Taxonomy, and others.
Entrez provides links to related records both within a database and across the other Entrez databases.
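Entrez searches can also be scripted through NCBI's E-utilities, which this course description does not mention. The sketch below only builds an ESearch URL for a text query and does not send any request. The helper name `build_esearch_url`, the default of 20 results, and the sample query are assumptions for illustration.

```python
from urllib.parse import urlencode

# Base address of the E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, retmax=20):
    """Build an ESearch URL (hypothetical helper; no network access).

    db     -- Entrez database name, e.g. "gene" or "protein"
    term   -- Entrez text query, field tags such as [Organism] allowed
    retmax -- maximum number of record IDs to return
    """
    params = {"db": db, "term": term, "retmax": retmax}
    # urlencode percent-encodes brackets and turns spaces into "+".
    return ESEARCH_URL + "?" + urlencode(params)


# Invented example query: protein records matching "opsin" in human.
url = build_esearch_url("protein", "opsin AND human[Organism]", retmax=5)
```

The same query string typed into the Entrez search box on the NCBI website should select comparable records, which is the mode of use the course itself describes.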
This course has been discontinued, and the materials are no longer available.