INSDC Minimal Specifications
The INSDC has established minimal specifications that define the core technical and data requirements all member databases must meet to ensure consistent, interoperable, and reliable sharing of nucleotide sequence data.
Formalized by the INSDC Implementation Committee in 2025, these specifications harmonize historical standards and validation practices with modern data generation and use. They address past variations in specification implementation by clearly defining a shared data model and minimum criteria for data inclusion.
The specifications support both existing and future participants, align with the goals of the INSDC Global Participation Initiative, and are intended as living documents that evolve with technological advances, user needs, and the expanding diversity of the global sequence data community.
Defined minimal specifications
Secondary data files that are derived from computational workflows or provide supplemental biological context to INSDC nucleotide data.
Description and location of the biological features in the cited sequence.
Description of the set of assembled sequences and related metadata that identifies the chromosomes, unlocalized sequences, and unplaced sequences that represent a genome.
Sequence read files without base-level qualities, provided in INSDC-specified formats.
Information collected about the sequencing experiment (methods, platform, library, and machine configuration).
A predetermined set of metadata requirements for sample registration.
A collection of biological data and a description of the research effort related to a single initiative.
Sequence read files containing base-level information, provided in INSDC-specified formats.
The description of the biological source materials used in experimental assays.
Assembled nucleotide sequences.