Send to

Choose Destination
See comment in PubMed Commons below
Int J Med Inform. 2010 Feb;79(2):130-42. doi: 10.1016/j.ijmedinf.2009.11.003. Epub 2009 Dec 6.

Genomic Sequence Variation Markup Language (GSVML).

Author information

Information Center for Medical Sciences, Tokyo Medical and Dental University, Bunkyo, Tokyo, Japan.



With the aim of making good use of internationally accumulated genomic sequence variation data, which is increasing rapidly due to the explosive amount of genomic research at present, the development of an interoperable data exchange format and its international standardization are necessary. Genomic Sequence Variation Markup Language (GSVML) will focus on genomic sequence variation data and human health applications, such as gene based medicine or pharmacogenomics.


We developed GSVML through eight steps, based on case analysis and domain investigations. By focusing on the design scope to human health applications and genomic sequence variation, we attempted to eliminate ambiguity and to ensure practicability. We intended to satisfy the requirements derived from the use case analysis of human-based clinical genomic applications. Based on database investigations, we attempted to minimize the redundancy of the data format, while maximizing the data covering range. We also attempted to ensure communication and interface ability with other Markup Languages, for exchange of omics data among various omics researchers or facilities. The interface ability with developing clinical standards, such as the Health Level Seven Genotype Information model, was analyzed.


We developed the human health-oriented GSVML comprising variation data, direct annotation, and indirect annotation categories; the variation data category is required, while the direct and indirect annotation categories are optional. The annotation categories contain omics and clinical information, and have internal relationships. For designing, we examined 6 cases for three criteria as human health application and 15 data elements for three criteria as data formats for genomic sequence variation data exchange. The data format of five international SNP databases and six Markup Languages and the interface ability to the Health Level Seven Genotype Model in terms of 317 items were investigated.


GSVML was developed as a potential data exchanging format for genomic sequence variation data exchange focusing on human health applications. The international standardization of GSVML is necessary, and is currently underway. GSVML can be applied to enhance the utilization of genomic sequence variation data worldwide by providing a communicable platform between clinical and research applications.

[Indexed for MEDLINE]
PubMed Commons home

PubMed Commons

How to join PubMed Commons

    Supplemental Content

    Full text links

    Icon for Elsevier Science
    Loading ...
    Support Center