MEDLINE Data


Introduction
Structure of a MEDLINE Entry
MeSH Index Terms
Substance Records
Database Cross Reference Records
Funding Identifiers
Gene Symbols
ASN.1 Specification: medline.asn
C Structures and Functions: objmedli.h


Introduction

MEDLINE is the largest and oldest biomedical database in the world. It is built at the National Library of Medicine (NLM), a part of NIH. At this writing it contains over seven million citations from the scientific literature, drawn from over 3500 different journals. MEDLINE is a bibliographic database: it contains citation information (title, authors, journal, and so on), and many entries also contain the abstract from the article. All articles are carefully indexed in a variety of ways by professionals following formal guidelines. Every entry can be uniquely identified by an integer key, the MEDLINE unique identifier (MEDLINE uid).

MEDLINE is a valuable resource in its own right. In addition, the MEDLINE uid can serve as a valuable link between entries in factual databases. When NCBI processes a new molecular biology factual database into the standardized format, we also normalize the bibliographic citations and attempt to map them to MEDLINE. For the biomedical databases we have tried thus far, we have succeeded in mapping most or all of the citations this way. From then on, linkage to other data objects can be made simply and easily through the shared MEDLINE uid. The MEDLINE uid also allows movement from the data item to the world of scientific literature in general and back.
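
As a rough illustration of linkage through a shared MEDLINE uid, the sketch below looks up a record from one factual database by the uid carried on a record from another. The LinkedRecord type, its fields, and find_by_muid are hypothetical names invented for this example; they are not part of the NCBI toolkit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record from a factual database; each one carries the
   MEDLINE uid of the article it cites. Not a toolkit type. */
typedef struct {
    const char *accession;   /* identifier within its own database */
    long muid;               /* MEDLINE uid of the cited article */
} LinkedRecord;

/* Return the first record in recs[0..n) that cites the given MEDLINE
   uid, or NULL if none does. */
const LinkedRecord *find_by_muid(const LinkedRecord *recs, size_t n, long muid)
{
    size_t i;

    assert(recs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (recs[i].muid == muid)
            return &recs[i];
    }
    return NULL;
}
```

Given a record from one database, passing its muid to find_by_muid over the records of a second database yields an entry that cites the same article, if there is one.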

Structure of a MEDLINE Entry

Each Medline-entry represents a single article from the scientific literature. The MEDLINE uid is an INTEGER which uniquely identifies the entry. If corrections are made to the contents of the entry, the uid is not changed. The MEDLINE uid is the simplest and most reliable way to identify the entry.

The entry-month is the month and year in which the entry became part of the public view of MEDLINE. It is not the same as the date the article was published. It is mostly useful for tracking what is new since a previous query of MEDLINE.
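
The use of the entry-month for tracking new material can be sketched as a simple comparison. The EntryMonth type below is a simplified stand-in holding only a year and a month; it is an assumption made for this example and is not the toolkit's Date object.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an entry-month (assumed layout, not the
   toolkit's Date). */
typedef struct {
    int year;
    int month;   /* 1-12 */
} EntryMonth;

/* Return 1 if entry-month a is strictly later than b, else 0. */
int entry_month_after(EntryMonth a, EntryMonth b)
{
    if (a.year != b.year)
        return a.year > b.year;
    return a.month > b.month;
}

/* Count the entry-months in ems[0..n) that are strictly later than
   the month of the previous query. */
size_t count_new_since(const EntryMonth *ems, size_t n, EntryMonth last_query)
{
    size_t i, count = 0;

    assert(ems != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (entry_month_after(ems[i], last_query))
            count++;
    }
    return count;
}
```

Because the comparison is strict, entries added later in the same month as the previous query are not counted; a caller that cannot tolerate that gap could compare with "later than or equal to" and discard entries it has already seen.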

The article citation itself is contained in a standard Cit-art, imported from the bibliographic module, so it is not discussed further here. The entry often contains the abstract from the article. The rest of the entry consists of various index terms, which are discussed below.

The C implementation of a MedlineEntry is straightforward.

MeSH Index Terms

Medical Subject Heading (MeSH) terms form a controlled vocabulary, arranged as a tree, that is maintained by the Library Operations division of NLM. Within each concept, parent terms sit above more specialized terms. An entry in MEDLINE is indexed by the most specific MeSH term(s) available. Since the MeSH vocabulary is a tree, one may then query on specific terms directly, or on general terms by including all the child terms in the query as well.
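
Querying on a general term by including its child terms amounts to walking a subtree. The following is a minimal sketch under the assumption that the vocabulary is held in memory as a first-child/next-sibling tree; the TermNode type and term_expand function are hypothetical and are not supplied by the toolkit or by NLM.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node in a vocabulary tree (not a toolkit type). */
typedef struct TermNode {
    const char *term;
    struct TermNode *child;     /* first more specialized term */
    struct TermNode *sibling;   /* next term under the same parent */
} TermNode;

/* Depth-first walk: record node's term, then every descendant's. */
static void expand_into(const TermNode *node, const char **out,
                        size_t max, size_t *n)
{
    const TermNode *c;

    if (*n < max)
        out[*n] = node->term;
    (*n)++;
    for (c = node->child; c != NULL; c = c->sibling)
        expand_into(c, out, max, n);
}

/* Store the term at node and all terms below it in out[0..max).
   Returns the number of terms in the subtree, which may exceed max;
   in that case only the first max terms were stored. */
size_t term_expand(const TermNode *node, const char **out, size_t max)
{
    size_t n = 0;

    assert(node != NULL && (out != NULL || max == 0));
    expand_into(node, out, max, &n);
    return n;
}
```

The terms returned for a general heading can then be combined into one query, so that entries indexed only under a more specific child term are still found.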

A MeSH term may be qualified by one or more sub-headings. For example, the MeSH term "insulin" may carry quite a different meaning if qualified by "clinical trials" versus being qualified by "genetics".

A MeSH term or a sub-heading may be flagged as indicating the "main point" of the article. Again the most specific form is used. If the main point of the article is insulin and the article also discusses genetics, then the insulin MeSH term will be flagged but the genetics sub-heading will not be. However, if the main point of the article is the genetics of insulin, then the sub-heading genetics under the MeSH term insulin will be flagged but the MeSH term itself will not be.
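
The flagging rule can be made concrete with a small sketch. MeshTerm and MeshQual below are simplified stand-ins modeled on the Medline-mesh and Medline-qual definitions; their layout and the two helper functions are assumptions for illustration, not the toolkit's own structures.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for Medline-qual and Medline-mesh (assumed
   layout for this example, not the toolkit's structs). */
typedef struct MeshQual {
    int mp;                    /* nonzero if this sub-heading is the main point */
    const char *subh;
    struct MeshQual *next;
} MeshQual;

typedef struct {
    int mp;                    /* nonzero if the term itself is the main point */
    const char *term;
    MeshQual *qual;
} MeshTerm;

/* Return the first sub-heading flagged as main point, or NULL if no
   sub-heading of this term is flagged. */
const char *mesh_main_subheading(const MeshTerm *m)
{
    const MeshQual *q;

    assert(m != NULL);
    for (q = m->qual; q != NULL; q = q->next) {
        if (q->mp)
            return q->subh;
    }
    return NULL;
}

/* Return 1 if the main-point flag sits on the term or on any of its
   sub-headings, else 0. */
int mesh_is_main_point(const MeshTerm *m)
{
    assert(m != NULL);
    return m->mp || mesh_main_subheading(m) != NULL;
}
```

Checking both levels matters: a search that tests only the flag on the term would miss an article whose main point is the genetics of insulin, because there the flag is carried by the sub-heading alone.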

Substance Records

If an article has substantial discussion of recognizable chemical compounds, they are indexed in the substance records. The record may contain only the name of the compound, or it may contain the name and a Chemical Abstracts Service (CAS) registry number or an Enzyme Commission (EC) number as appropriate.
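
The three forms a substance record can take might be handled as in the sketch below. The constants mirror the enumerated values of Medline-rn in the ASN.1 specification in this chapter; the Substance type and substance_format function are hypothetical stand-ins written for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Values follow the Medline-rn type enumeration. */
enum { RN_NAMEONLY = 0, RN_CAS = 1, RN_EC = 2 };

/* Simplified stand-in for a substance record (not the toolkit struct). */
typedef struct {
    unsigned char type;
    const char *cit;    /* CAS or EC number; NULL when only a name is given */
    const char *name;   /* always present */
} Substance;

/* Write a one-line description of s into buf (at most len bytes,
   including the terminator). Returns snprintf's result: the length
   the full text needs, not counting the terminator. */
int substance_format(const Substance *s, char *buf, size_t len)
{
    assert(s != NULL && s->name != NULL);
    switch (s->type) {
    case RN_CAS:
        if (s->cit != NULL)
            return snprintf(buf, len, "%s (CAS %s)", s->name, s->cit);
        break;
    case RN_EC:
        if (s->cit != NULL)
            return snprintf(buf, len, "%s (EC %s)", s->name, s->cit);
        break;
    default:
        break;
    }
    /* Name only, or a number was expected but is missing. */
    return snprintf(buf, len, "%s", s->name);
}
```

Falling back to the bare name when the number is absent reflects the specification, where the number is OPTIONAL and the name is always present.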

Database Cross Reference Records

If an article cites an identifier recognized as belonging to one of a known list of biomedical databases, the cross reference is given in this field, along with a key indicating which database it came from. A typical example would be a GenBank accession number cited in an article.
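
The database key is a small integer, so translating between keys and names is a table lookup. The sketch below uses the values and identifiers from the Medline-si enumeration in the ASN.1 specification in this chapter; the functions xref_db_name and xref_db_type are hypothetical helpers, not toolkit functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Identifiers indexed by their Medline-si type value; slot 0 is
   unused because the enumeration starts at 1. */
static const char *const xref_names[] = {
    NULL, "ddbj", "carbbank", "embl", "hdb", "genbank", "hgml", "mim",
    "msd", "pdb", "pir", "prfseqdb", "psd", "swissprot"
};

#define XREF_MAX 13

/* Return the identifier for a type value, or NULL if out of range. */
const char *xref_db_name(int type)
{
    if (type < 1 || type > XREF_MAX)
        return NULL;
    return xref_names[type];
}

/* Return the type value for an identifier, or 0 if it is unknown. */
int xref_db_type(const char *name)
{
    int i;

    assert(name != NULL);
    for (i = 1; i <= XREF_MAX; i++) {
        if (strcmp(xref_names[i], name) == 0)
            return i;
    }
    return 0;
}
```

Returning NULL or 0 for unrecognized input lets a caller skip cross references to databases added to the list after the code was written, rather than misreporting them.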

Funding Identifiers

If an id number from a grant or contract is cited in the article (usually acknowledging support) it will appear in this field.

In the C structure, ValNodes are used to make a linked list of the CharPtrs to the strings.
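
The linked-list arrangement can be sketched with a reduced node type. StrNode below keeps only a string pointer and a next pointer; it is a simplified stand-in written for this example and omits the other fields of the toolkit's ValNode. The function names are likewise hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Reduced stand-in for a list node holding one string. */
typedef struct StrNode {
    char *str;                /* owned copy of the identifier */
    struct StrNode *next;
} StrNode;

/* Append a copy of s to the list at *head. Returns the new node, or
   NULL if memory could not be allocated (the list is then unchanged). */
StrNode *strlist_add(StrNode **head, const char *s)
{
    StrNode *node, *tail;
    size_t len;

    assert(head != NULL && s != NULL);
    node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    len = strlen(s) + 1;
    node->str = malloc(len);
    if (node->str == NULL) {
        free(node);
        return NULL;
    }
    memcpy(node->str, s, len);
    node->next = NULL;

    if (*head == NULL) {
        *head = node;
    } else {
        for (tail = *head; tail->next != NULL; tail = tail->next)
            ;
        tail->next = node;
    }
    return node;
}

/* Free every node and its string. Always returns NULL so the caller
   can reset its pointer in the same statement. */
StrNode *strlist_free(StrNode *head)
{
    StrNode *next;

    while (head != NULL) {
        next = head->next;
        free(head->str);
        free(head);
        head = next;
    }
    return NULL;
}
```

A list built this way preserves the order in which identifiers were added, and an empty field is simply a NULL head pointer.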

Gene Symbols

As an experiment, Library Operations at the NLM is recording mnemonic symbols from articles if they appear, by form and usage, to be gene symbols. Such symbols vary and are not always properly used, so this field must be approached with caution. Nonetheless it can provide a route to a rich source of potentially relevant citations.

ASN.1 Specification: medline.asn

--$Revision: 2.0 $
--**********************************************************************
--
--  MEDLINE data definitions
--  James Ostell, 1990
--
--**********************************************************************

NCBI-Medline DEFINITIONS ::=
BEGIN

EXPORTS Medline-entry;

IMPORTS Cit-art FROM NCBI-Biblio
        Date FROM NCBI-General;

                                -- a MEDLINE entry
Medline-entry ::= SEQUENCE {
    uid INTEGER ,               -- MEDLINE UID
    em Date ,                   -- Entry Month
    cit Cit-art ,               -- article citation
    abstract VisibleString OPTIONAL ,
    mesh SET OF Medline-mesh OPTIONAL ,
    substance SET OF Medline-rn OPTIONAL ,
    xref SET OF Medline-si OPTIONAL ,
    idnum SET OF VisibleString OPTIONAL ,  -- ID Number (grants, contracts)
    gene SET OF VisibleString OPTIONAL }

Medline-mesh ::= SEQUENCE {
    mp BOOLEAN DEFAULT FALSE ,       -- TRUE if main point (*)
    term VisibleString ,                   -- the MeSH term
    qual SET OF Medline-qual OPTIONAL }    -- qualifiers

Medline-qual ::= SEQUENCE {
    mp BOOLEAN DEFAULT FALSE ,       -- TRUE if main point
    subh VisibleString }             -- the subheading

Medline-rn ::= SEQUENCE {       -- medline substance records
    type ENUMERATED {           -- type of record
        nameonly (0) ,
        cas (1) ,               -- CAS number
        ec (2) } ,              -- EC number
    cit VisibleString OPTIONAL ,  -- CAS or EC number if present
    name VisibleString }          -- name (always present)

Medline-si ::= SEQUENCE {       -- medline cross reference records
    type ENUMERATED {           -- type of xref
        ddbj (1) ,              -- DNA Data Bank of Japan
        carbbank (2) ,          -- Carbohydrate Structure Database
        embl (3) ,              -- EMBL Data Library
        hdb (4) ,               -- Hybridoma Data Bank
        genbank (5) ,           -- GenBank
        hgml (6) ,              -- Human Gene Map Library
        mim (7) ,               -- Mendelian Inheritance in Man
        msd (8) ,               -- Microbial Strains Database
        pdb (9) ,               -- Protein Data Bank (Brookhaven)
        pir (10) ,              -- Protein Identification Resource
        prfseqdb (11) ,         -- Protein Research Foundation (Japan)
        psd (12) ,              -- Protein Sequence Database (Japan)
        swissprot (13) } ,      -- SwissProt
    cit VisibleString OPTIONAL }    -- the citation/accession number

END

C Structures and Functions: objmedli.h

/*  objmedli.h
* ===========================================================================
*
*                            PUBLIC DOMAIN NOTICE
*               National Center for Biotechnology Information
*
*  This software/database is a "United States Government Work" under the
*  terms of the United States Copyright Act.  It was written as part of
*  the author's official duties as a United States Government employee and
*  thus cannot be copyrighted.  This software/database is freely available
*  to the public for use. The National Library of Medicine and the U.S.
*  Government have not placed any restriction on its use or reproduction.
*
*  Although all reasonable efforts have been taken to ensure the accuracy
*  and reliability of the software and data, the NLM and the U.S.
*  Government do not and cannot warrant the performance or results that
*  may be obtained by using this software or data. The NLM and the U.S.
*  Government disclaim all warranties, express or implied, including
*  warranties of performance, merchantability or fitness for any particular
*  purpose.
*
*  Please cite the author in any work or product based on this material.
*
* ===========================================================================
*
* File Name:  objmedli.h
*
* Author:  James Ostell
*
* Version Creation Date: 1/1/91
*
* $Revision: 2.0 $
*
* File Description:  Object manager interface for module NCBI-Medline
*
* Modifications:
* --------------------------------------------------------------------------
* Date    Name        Description of modification
* -------  ----------  -----------------------------------------------------
*
*
* ==========================================================================
*/

#ifndef _NCBI_Medline_
#define _NCBI_Medline_

#ifndef _ASNTOOL_
#include <asn.h>
#endif
#ifndef _NCBI_General_
#include <objgen.h>
#endif
#ifndef _NCBI_Biblio_
#include <objbibli.h>
#endif

#ifdef __cplusplus
extern "C" {
#endif

/*****************************************************************************
*
*   loader
*
*****************************************************************************/
extern Boolean MedlineAsnLoad PROTO((void));

/*****************************************************************************
*
*    Medline-mesh
*
*****************************************************************************/
typedef struct mesh {
    Boolean mp;                   /* main point */
    CharPtr term;
    ValNodePtr qual;
    struct mesh PNTR next;
 } MedlineMesh, PNTR MedlineMeshPtr;

extern MedlineMeshPtr MedlineMeshNew PROTO((void));
extern MedlineMeshPtr MedlineMeshFree PROTO((MedlineMeshPtr mmp));
extern MedlineMeshPtr MedlineMeshAsnRead PROTO((AsnIoPtr aip, AsnTypePtr atp));
extern Boolean MedlineMeshAsnWrite PROTO((MedlineMeshPtr mmp, AsnIoPtr aip, AsnTypePtr atp));

/*****************************************************************************
*
*    Medline-rn
*
*****************************************************************************/
typedef struct rn {
    Uint1 type;
    CharPtr cit,
            name;
    struct rn PNTR next;
 } MedlineRn, PNTR MedlineRnPtr;

extern MedlineRnPtr MedlineRnNew PROTO((void));
extern MedlineRnPtr MedlineRnFree PROTO((MedlineRnPtr mrp));
extern MedlineRnPtr MedlineRnAsnRead PROTO((AsnIoPtr aip, AsnTypePtr atp));
extern Boolean MedlineRnAsnWrite PROTO((MedlineRnPtr mrp, AsnIoPtr aip, AsnTypePtr atp));

/*****************************************************************************
*
*    Medline-si
*      ValNode used for structure
*
*****************************************************************************/

extern ValNodePtr MedlineSiAsnRead PROTO((AsnIoPtr aip, AsnTypePtr atp));
extern Boolean MedlineSiAsnWrite PROTO((ValNodePtr msp, AsnIoPtr aip, AsnTypePtr atp));

/*****************************************************************************
*
*   Medline-entry
*
*****************************************************************************/
typedef struct medline {
    Int4 uid;
    DatePtr em;
    CitArtPtr cit;
    CharPtr abstract;
    MedlineMeshPtr mesh;
    MedlineRnPtr substance;
    ValNodePtr xref;
    ValNodePtr idnum;
    ValNodePtr gene;
} MedlineEntry, PNTR MedlineEntryPtr;

extern MedlineEntryPtr MedlineEntryNew PROTO((void));
extern MedlineEntryPtr MedlineEntryFree PROTO((MedlineEntryPtr mep));
extern MedlineEntryPtr MedlineEntryAsnRead PROTO((AsnIoPtr aip, AsnTypePtr atp));
extern Boolean MedlineEntryAsnWrite PROTO((MedlineEntryPtr mep, AsnIoPtr aip, AsnTypePtr atp));

#ifdef __cplusplus
}
#endif