National Center for Biotechnology Information
Programs and Activities
NCBI has a multi-disciplinary research group composed of computer
scientists, molecular biologists, mathematicians, biochemists, research
physicians, and structural biologists concentrating on basic and
applied research in computational molecular biology. These investigators
not only make important contributions to basic science but also
serve as a wellspring of new methods for applied research activities.
Together they are studying fundamental biomedical problems at the
molecular level using mathematical and computational methods. These
problems include gene organization, sequence analysis, and structure
prediction. A sampling of current research projects includes: detection
and analysis of gene organization, repeating sequence patterns,
protein domains and structural elements, creation of a gene map
of the human genome, mathematical modeling of the kinetics of HIV
infection, analysis of effects of sequencing errors for database
searching, development of new algorithms for database searching
and multiple sequence alignment, construction of non-redundant sequence
databases, mathematical models for estimation of statistical significance
of sequence similarity, and vector models for text retrieval. Additionally,
NCBI investigators maintain ongoing collaborations with several
institutes within the NIH and also with numerous academic and government
laboratories.
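Several of the projects listed above concern sequence similarity. As a rough illustration of what a similarity score is, the sketch below computes a Smith-Waterman local alignment score by dynamic programming. This is a textbook technique chosen for clarity, not a description of any NCBI program; the function name and the match, mismatch, and gap values are arbitrary assumptions made for the example.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score of a and b.

    Uses a linear gap penalty. The scoring values are illustrative
    assumptions, not parameters taken from any NCBI tool.
    """
    prev = [0] * (len(b) + 1)  # previous row of the score matrix
    best = 0
    for ch_a in a:
        curr = [0]  # first column is always 0 for local alignment
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    print(local_alignment_score("TTACGTT", "GGACGGG"))
```

The exhaustive comparison here takes time proportional to the product of the two sequence lengths, which is one reason faster search algorithms are an active research topic.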
Databases and Software
NCBI assumed responsibility for the GenBank DNA sequence database
in October 1992. NCBI staff with advanced training in molecular
biology build the database from sequences submitted by individual
laboratories and by data exchange with the international nucleotide
sequence databases, the European Molecular Biology Laboratory (EMBL)
and the DNA Data Bank of Japan (DDBJ). Arrangements with the U.S.
Patent and Trademark Office enable the incorporation of patented sequence
data.
In addition to GenBank, NCBI supports and distributes a variety
of databases for the medical and scientific communities. These include
the Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling
Database (MMDB) of 3D protein structures, the Unique Human Gene
Sequence Collection (UniGene), a Gene Map of the Human Genome, the
Taxonomy Browser, and the Cancer Genome Anatomy Project (CGAP),
in collaboration with the National Cancer Institute.
Entrez is NCBI's search and retrieval system that provides users
with integrated access to sequence, mapping, taxonomy, and structural
data. Entrez also provides graphical views of sequences and chromosome
maps. A powerful and unique feature of Entrez is the ability to
retrieve related sequences, structures, and references. The journal
literature is available through PubMed, a Web search interface that
provides access to over 11 million journal citations in MEDLINE and
contains links to full-text articles at participating publishers'
Web sites.
BLAST is a program for sequence similarity searching developed at
NCBI and is instrumental in identifying genes and genetic features.
BLAST can execute sequence searches against the entire DNA database
in less than 15 seconds. Additional software tools provided by NCBI
include: Open Reading Frame Finder (ORF Finder), Electronic PCR,
and the sequence submission tools, Sequin and BankIt. All of NCBI's
databases and software tools are available from the WWW or by FTP.
NCBI also has email servers that provide an alternative way to
access the databases for text searching or sequence similarity searching.
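To give a sense of what an open reading frame search involves, here is a minimal sketch that scans the three forward reading frames of a DNA string for stretches that begin with ATG and end at a stop codon. It is not the ORF Finder implementation; the function name, the `min_codons` parameter, and the restriction to the forward strand and the standard stop codons are assumptions made to keep the example short.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons


def find_orfs(dna, min_codons=2):
    """Return (start, end, frame) tuples for ATG...stop ORFs.

    start is 0-based, end is exclusive and includes the stop codon.
    min_codons counts codons from ATG up to, but not including, the stop.
    Only the three forward frames are scanned (a simplifying assumption).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the ATG that opened the current ORF
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ATG in this frame
    return orfs


if __name__ == "__main__":
    for start, end, frame in find_orfs("CCATGAAATTTTAGGG"):
        print(frame, start, end)
```

A fuller tool would also scan the reverse complement and support alternative genetic codes.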
Outreach and Education
NCBI fosters scientific communication about computing as applied
to molecular biology and genetics by sponsoring meetings,
workshops, and lecture series. A Scientific Visitors Program has
been established to foster collaborations with extramural scientists.
Postdoctoral fellow positions are available as part of the NIH Intramural
Research Program.
Revised: May 21, 2004.