What is BioCollections?

BioCollections is a curated dataset of metadata for culture collections, museums, herbaria, and other natural history collections. It includes Darwin Core institution and collection codes, and URL formulae for mapping specimen IDs to web pages at the collection site.
BioCollections stores the acronyms used in “structured vouchers” for sequence entries submitted to the International Nucleotide Sequence Database Collaboration (INSDC), which comprises GenBank, the European Nucleotide Archive (ENA), and the DNA Data Bank of Japan (DDBJ), and to NCBI’s BioSample.

What are the Sources?

Collection metadata was imported from online directories of specimen repositories such as Index Herbariorum, World Federation of Culture Collections (CCINFO), Insect and Spider Collections of the World, Amphibian Species of the World (AMNH), and the collection abbreviations in the Catalog of Fishes (CAS), as well as from lists of biorepositories published in professional journals. Additional repository records are created for collections whose sequence data are submitted to INSDC. These new collections are validated to ensure that they are curated collections before they are included in the database. Other directories of repositories are reviewed periodically to ensure that the NCBI BioCollections database is up to date.

Which type of “structured vouchers” should be used to submit data to INSDC?

Specimen identifiers stored in collections should be annotated using one of the following source qualifiers:

   frozen tissue collection
   microbial culture collection
   cell lines
   botanical garden (live plants)
   DNA bank
   stock center
   germplasm repository
   seed bank

What is the proper format for the “structured voucher”?

The Darwin Core triplet is used for the “structured voucher”: <institution_code>:<OPTIONAL collection_code>:<specimen_id>

For example:

/bio_material="K:MWC 3856"

/bio_material="USDA:GRIN:PI 588454-b"
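
The two examples above can be split into their parts mechanically. The following Python sketch is illustrative only: the names `Voucher` and `parse_voucher` are hypothetical and not part of any NCBI tool, and it assumes that the specimen ID itself never contains a colon.

```python
from typing import NamedTuple, Optional


class Voucher(NamedTuple):
    """One structured voucher, split into its Darwin Core triplet parts."""
    institution_code: str
    collection_code: Optional[str]  # None when the optional part is omitted
    specimen_id: str


def parse_voucher(value: str) -> Voucher:
    """Parse '<institution_code>:<OPTIONAL collection_code>:<specimen_id>'.

    Assumption: the specimen ID contains no colon, so the string has
    either two or three colon-separated fields.
    """
    parts = [part.strip() for part in value.split(":")]
    if len(parts) == 2:
        institution, specimen = parts
        collection = None
    elif len(parts) == 3:
        institution, collection, specimen = parts
    else:
        raise ValueError(f"not a structured voucher: {value!r}")
    if not institution or not specimen:
        raise ValueError(f"missing institution code or specimen ID: {value!r}")
    # Treat an empty middle field ("A::B") the same as an omitted one.
    return Voucher(institution, collection or None, specimen)


print(parse_voucher("K:MWC 3856"))
print(parse_voucher("USDA:GRIN:PI 588454-b"))
```

The input is the qualifier value without the surrounding quotation marks. For the first example the collection code is `None`; for the second it is `GRIN`.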

What is a collection code?

Sometimes there are several collections within an institution. Collection codes are acronyms used for collections within institutions.

For example, we list UAM as institution code for University of Alaska Museum. Within the museum there are several collections like mammals, fish, insects etc. with collection codes Mamm, Fish, Ento respectively.
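
As a rough illustration of how an institution code and an optional collection code combine into a voucher string, here is a minimal Python sketch. The function name `format_voucher` is hypothetical, and the specimen number `12345` is made up purely for the example; it does not refer to a real UAM specimen.

```python
from typing import Optional


def format_voucher(institution_code: str, specimen_id: str,
                   collection_code: Optional[str] = None) -> str:
    """Join the parts with colons, leaving out the collection code if absent."""
    parts = [institution_code]
    if collection_code:
        parts.append(collection_code)
    parts.append(specimen_id)
    return ":".join(parts)


# "12345" is a made-up specimen number used only for illustration.
print(format_voucher("UAM", "12345", collection_code="Mamm"))  # UAM:Mamm:12345
print(format_voucher("UAM", "12345"))                          # UAM:12345
```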




How are duplicated Institution codes treated?

When more than one institution uses the same acronym for its specimens, the three-letter ISO (International Organization for Standardization) country code is appended in angle brackets to make the institution codes unique. The institution whose acronym is already in the database keeps the plain acronym (without a country code), and institutions registered later receive the acronym followed by the three-letter country code.

For example, all of the following institutions use UAM as an acronym for their collections. To distinguish between the collections, the institution codes are listed as:

University of Alaska, Museum of the North UAM
University of Arkansas at Monticello UAM<USA-AR>
University of Alabama, Malacology Collection UAM<USA-AL>
Universidad Autonoma De Madrid culture collection of cyanobacteria UAM<ESP>
Universidad de los Andes, Facultad de Ciencias UAM<VEN>
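
The codes listed above can be separated into the shared acronym and the disambiguating qualifier. The Python sketch below is an assumption about how one might do this, not a description of how the database works internally; `split_institution_code` is a hypothetical name, and the pattern assumes the qualifier always appears in angle brackets at the end of the code.

```python
import re
from typing import Optional, Tuple

# Acronym, optionally followed by a qualifier in angle brackets, e.g. "UAM<USA-AR>".
_CODE_RE = re.compile(r"^(?P<acronym>[^<>]+?)(?:<(?P<qualifier>[^<>]+)>)?$")


def split_institution_code(code: str) -> Tuple[str, Optional[str]]:
    """Return (acronym, qualifier); qualifier is None for a plain acronym."""
    match = _CODE_RE.match(code.strip())
    if match is None:
        raise ValueError(f"unrecognized institution code: {code!r}")
    return match.group("acronym"), match.group("qualifier")


for code in ["UAM", "UAM<USA-AR>", "UAM<USA-AL>", "UAM<ESP>", "UAM<VEN>"]:
    print(code, "->", split_institution_code(code))
```

All five codes share the acronym `UAM`; only the qualifier differs, and it is `None` for the University of Alaska record that was registered first.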

What happens when one institution uses several institution codes?

For various reasons, some institutions use more than one institution code. For example, the University of Maryland uses MARY for its herbarium collection and UMDC for its museum collection. These are listed as separate records.

If an institution changes the institution code for its collection and adopts a new acronym, the old acronym is retained in the database as a synonym.

USNM - National Museum of Natural History, Smithsonian
NMNH - National Museum of Natural History, Smithsonian

NMNH is listed as an institution synonym for USNM.
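
Synonym handling of this kind amounts to a lookup from each synonym to the code it is listed under. A minimal Python sketch follows, assuming a hand-built table that contains only the one pair mentioned above; the names `INSTITUTION_SYNONYMS` and `canonical_institution_code` are hypothetical.

```python
# Maps a synonym to the institution code it is listed under.
# Only the pair given in the text is included; a real table would be much larger.
INSTITUTION_SYNONYMS = {"NMNH": "USNM"}


def canonical_institution_code(code: str) -> str:
    """Return the listed code for a synonym, or the code itself otherwise."""
    return INSTITUTION_SYNONYMS.get(code, code)


print(canonical_institution_code("NMNH"))  # USNM
print(canonical_institution_code("USNM"))  # USNM
```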

Are specimens in personal collections listed in the BioCollections database?

At present, personal collections are not listed in the database. However, personal and private collections can be annotated in the INSDC entries as:

/specimen_voucher="personal:Antonio Machado:AMC 3410"
/specimen_voucher="personal:Dan Janzen:05-SNRP-981"
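
Because personal vouchers put the literal word `personal` where an institution code would normally go, they are easy to recognize programmatically. The Python sketch below shows one possible check; `parse_personal_voucher` is a hypothetical helper, and it assumes the collector's name contains no colon.

```python
from typing import Optional, Tuple


def parse_personal_voucher(value: str) -> Optional[Tuple[str, str]]:
    """Return (owner, specimen_id) for a 'personal:' voucher, else None."""
    prefix, sep, rest = value.partition(":")
    if not sep or prefix.strip().lower() != "personal":
        return None  # not a personal-collection voucher
    owner, sep, specimen_id = rest.partition(":")
    if not sep:
        raise ValueError(f"personal voucher lacks a specimen ID: {value!r}")
    return owner.strip(), specimen_id.strip()


print(parse_personal_voucher("personal:Antonio Machado:AMC 3410"))
print(parse_personal_voucher("personal:Dan Janzen:05-SNRP-981"))
print(parse_personal_voucher("K:MWC 3856"))  # None: an institutional voucher
```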


To register a collection or for more information, please send an email to gb-admin@ncbi.nlm.nih.gov.

Last updated: 2016-09-26T16:46:38-04:00