PubMed
NLM Standard Publisher Data Format
Last Updated: February 14, 2008
This is the standard data format that publishers are required to use in
submitting citation data to NLM for processing into PubMed.
This is a tagged format: each part of a citation is preceded by an opening
<Tag> string and followed by a closing </Tag> string, where Tag is
some appropriate label. The XML tags are listed below, followed by several
examples. Additional information on XML tagged format is available
at the following Web sites: Xmlu.com,
XML.com, and OASIS. For further assistance,
please send an e-mail to publisher@ncbi.nlm.nih.gov.
Note:
- This format is required for
submission of citation and abstract data to NLM. Other formats are not
acceptable. Only journals that are already approved for inclusion in PubMed should be submitted. See our Journal Submission FAQs for more information about journals indexed for MEDLINE.
- If you wish to have non-ASCII
characters in your citations you must use standard SGML
entity names. It is not possible to keep a separate translation
table for each publisher, given the number of possible non-ASCII
characters.
- Links to your Web site, if
available, may be submitted using LinkOut.
Return to Information
for Publishers re: XML-Tagged Data for additional publisher information.
The XML Tags
Data Tags (R = Required, O = Optional).
Tags are case sensitive. Required tags must be included; optional tags must be
included only if the data requested appears in the print or electronic issue.
- File Header (R) The
header information should include:
<!DOCTYPE ArticleSet PUBLIC "-//NLM//DTD PubMed 2.0//EN" "http://www.ncbi.nlm.nih.gov:80/entrez/query/static/PubMed.dtd">
- ArticleSet
(R) An entire submission of the set of articles for each issue. Each
issue of a given journal must be enclosed in these tags.
- Article (R) Each
article must be enclosed in these tags. Do not submit data for the
following items: book reviews, advertisements, announcements, erratum
notices, software and equipment reviews, and papers to appear in
forthcoming issues. In addition, do not submit individual
citations for abstracts or shortened versions of presentations or papers
from conference proceedings unless the full-text of the article is
published. In most instances, NLM does create a single citation to
cover a group of meeting abstracts or shortened versions of conference
proceedings; for example, see PMIDs 12526142,
12516608, and 12516600.
- Journal (R)
Bibliographic information about the journal issue contained in the file.
- PublisherName
(R) The publisher name.
- JournalTitle
(R) The standard MEDLINE abbreviation for the journal title. If you
do not know the abbreviation, see the Journals
Database.
- Issn
(R) The ISSN or ESSN of the journal.
- Volume (R)
The volume name or number of the journal, including any supplement
information, e.g., 12 Suppl 2, 514 (Pt 2), 19 Suppl A,
etc.
- Issue (O) The
issue number, e.g., 6 Pt 2, 7-8,
etc.
- PubDate
(R) The publication date information must be enclosed in the
following date tags. NOTE: Print publication dates should accurately reflect the date format on the
cover of the journal and online publication dates should accurately
reflect the date format on the journal Web site. The PubDate tag includes the PubStatus
attribute, which may contain only one of the following values:
ppublish - published in print (default value)
epublish - electronically published only, never published in print
aheadofprint - electronically published, but followed by print
The latest (current) article status with the date of this
status must be submitted in PubDate within Journal.
Which value you choose to use depends on whether the article is a print,
electronic or ahead of print article. See our page entitled Properly Coding Print, Electronic and Ahead of Print
Articles for more details.
- Year (R) The
4-digit year of publication. <Year>
can only contain a 4-digit year between 1966 and 2010.
- Month (O)
The month of publication. <Month>
can only contain a number from 1 to 12, the English name of the month, or the first
three letters of the English month name. NOTE: ppublish is the only PubStatus value that allows a dual month
in <Month>.
- Season (O)
The season of publication (do not use if a month is available).
- Day (O) The
day of publication. <Day> can
only contain the numbers 1-31.
- Replaces (O) The identifier of the article that this one
replaces. Do not use this tag for new articles. The <Replaces> tag
can be used to update an Ahead
of Print citation, or to correct
an error in citations with [PubMed - as
supplied by publisher] status. The Replaces tag includes the IdType attribute, which may contain only one of the following values:
pmid - PubMed ID (PMID) (default value)
pii - controlled publisher identifier
doi - Digital Object Identifier
See our Instructions
for Replacement Files for more details.
- ArticleTitle
(O) The article title, in English, if published in English or
translated to English in the journal. Do not submit this tag if the
published title is not in English or is not translated to English in the
journal. See VernacularTitle.
- VernacularTitle
(O) The article title in the original language, if not in English.
Used only for Latin-based alphabets. See our Instructions
for Non-English Languages.
- FirstPage
(R/O) This tag is required if and only if the ELocationID tag is absent or empty. This tag should contain the first page on which the article appears. If an article
appears in more than one language with consecutive pagination,
pagination should be inclusive of all texts.
- LastPage
(O) The last page on which the article appears. If an article appears on one page, this is the same as FirstPage. If an article appears on non-consecutive pages, this tag should still contain the last page on which the article appears. If an article appears in more than one language in the same issue, pagination should be inclusive of all the texts.
- Language (O)
The language the article is in. This should be chosen from the language
codes in ISO 639. If unspecified, EN (English) is assumed. If an
article appears in more than one language in the same issue, submit
multiple language tags listed in the order in which the texts appear in
the journal, not in the alphabetical order of the symbols. If one of the
languages is English, enter EN first. NOTE: NLM requires
transliteration of Cyrillic letters as outlined here. See our Instructions
for Non-English Languages.
- AuthorList
(O) The author information must be enclosed in these tags. If a given article has one or more authors, this tag must be submitted.
Authors should be listed in the same order as in the printed article,
and author name format should accurately reflect the printed
article. Do not use all upper case letters.
- Author (R)
Information about a single Author must begin with this tag.
- FirstName
(O) The Author's full first name is required if it appears in the
print or online version of the journal. First initial is
acceptable if full name is not available. To represent a Single Personal Author Name use the FirstName EmptyYN attribute value "Y".
- MiddleName
(O) The
Author's full middle name(s), or initial(s) if the full name(s) are not
available.
- LastName
(O) The Author's last name.
- Suffix (O)
The Author's suffix, if any, e.g. "Jr", "Sr", "II", "IV".
Do not include honorific titles, e.g. "M.D.", "Ph.D.".
- CollectiveName
(O) The name of the authoring committee or organization. CollectiveName can be used instead of or in
addition to a personal name.
- Affiliation (O)
The institution(s) that the Author is affiliated with. If a given
article contains affiliations, this tag must be submitted. Please
submit the affiliation for the first author only. If there are
multiple affiliations and it cannot be determined which is the first
author's affiliation, use the first affiliation. The data should be
provided as a simple string within the <Affiliation>
</Affiliation> tags. The body of the affiliation should include
the following data, if applicable, separated by commas: division of
the institution, institution name, city, state, postal or zip code, country
(use USA for the United States)
followed by a period, then a space followed by the e-mail address
which itself should not end in a period. Do not include the word
'e-mail'.
- PublicationType
(O) Used to identify the type of article. The only available PublicationTypes are NEWS, LETTER or
EDITORIAL. The default value, JOURNAL ARTICLE, will be added to
citations if this tag is left blank or an invalid PublicationType
is used.
- ArticleIdList
(O) The list of Article Identifiers.
- ArticleId
(R) The Article Identifier. The ArticleId tag includes the IdType attribute, which may include only one of the following
values for each identifier:
pii - controlled publisher identifier (default value)
doi - Digital Object Identifier
See our Journal
Submission FAQs for more information about
Article Identifiers.
- History (O) The
history of a publication (e.g., received, accepted, revised, published,
ahead of print). Publishers may supply PubDates
and PubStatus in History using the PubDate format detailed above. History PubDate is optional; however, the PubDate
within Journal, outlined above, is required. The History PubDate tag includes the PubStatus
attribute, which may contain only one of the following
values for each date in the publication history:
received - date manuscript received for review
accepted - accepted for publication
revised - article revised by publisher or author
aheadofprint - published electronically
NOTE: The <History> tag plays an important part in the process of submitting Replacement
Files for Ahead of Print citations.
- Abstract (O) The
article's abstract. Include all text as a single ASCII paragraph.
Headings of structured abstracts (e.g., OBJECTIVE, DESIGN) should
be capitalized and end with a colon, followed by a space before the
text. Our DTD does not support text formatting tags such as line breaks,
italics, or boldface; the only acceptable formatting tags are for
superscript (<sup></sup>) or subscript (<inf></inf>). Do not include KEYWORDS or
bibliographic citations in the Abstract tag.
- CopyrightInformation (O)
The Copyright information associated with this article.
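The Year, Month, and Day constraints listed under PubDate can be checked before a file is submitted. The following is a minimal Python sketch, not part of the NLM tooling: the function name check_pubdate is hypothetical, month names are matched case-insensitively as an assumption, and the dual-month form allowed for ppublish is not handled because its exact syntax is not given here.

```python
# English month names; the three-letter forms are derived from them.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
MONTH_FORMS = {m.lower() for m in MONTHS} | {m[:3].lower() for m in MONTHS}


def _is_number(text):
    """True if text consists only of ASCII digits."""
    return text.isascii() and text.isdigit()


def check_pubdate(year, month=None, day=None):
    """Return a list of problems found in the Year/Month/Day text values."""
    problems = []
    if not (len(year) == 4 and _is_number(year) and 1966 <= int(year) <= 2010):
        problems.append("Year must be 4 digits between 1966 and 2010")
    if month is not None:
        numeric_ok = _is_number(month) and 1 <= int(month) <= 12
        # Assumption: capitalization of month names is not significant.
        name_ok = month.lower() in MONTH_FORMS
        if not (numeric_ok or name_ok):
            problems.append("Month must be 1-12, an English month name, "
                            "or its first three letters")
    if day is not None and not (_is_number(day) and 1 <= int(day) <= 31):
        problems.append("Day must be a number from 1 to 31")
    return problems
```

An empty list means the three values pass these checks; it does not replace validation against the DTD.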
XML File Validator
The PubMed
Citation File Validator is available at http://www.ncbi.nlm.nih.gov/entrez/publisher/citvalidator.cgi.
Use this utility to validate your citation files against the NCBI PubMed DTD before submitting them to NCBI.
Special Characters
Characters not in the standard
ASCII character set must be represented using standard SGML
entity codes. For example, use "&ccedil;"
to represent c, cedilla, "&rsquo;" for
a right single-quote, <sup> for superscript, and <inf> for inferior or subscript. Where they occur
within the text of any tag, the following symbols must be represented by
entities: &amp; (ampersand), &lt; (less than), &gt; (greater than). Where these three occur in tag names or entities, simply use the
ASCII characters. For example:
Entities:
&uuml; NOT &amp;uuml;
&apos; NOT &amp;apos;
Tag Names:
<Month> NOT &lt;Month&gt;
Text:
[P &lt; 0.01] NOT [P < 0.01]
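As an illustration of the escaping rules above, the sketch below converts raw (not yet escaped) text into ASCII with named entities. It is only a sketch under stated assumptions: to_entity_text is a hypothetical helper, and it assumes that the HTML 4 entity names in Python's html.entities module are acceptable as "standard SGML entity names". It must be applied exactly once to plain text; running it on text that already contains entities or <sup>/<inf> tags would double-escape them.

```python
from html.entities import codepoint2name
from xml.sax.saxutils import escape


def to_entity_text(text):
    """Escape &, <, > and replace non-ASCII characters with named entities."""
    out = []
    # escape() turns & < > into &amp; &lt; &gt; and leaves everything else alone.
    for ch in escape(text):
        code = ord(ch)
        if code < 128:
            out.append(ch)
        elif code in codepoint2name:
            # Assumption: HTML 4 entity names are accepted by the receiver.
            out.append("&" + codepoint2name[code] + ";")
        else:
            raise ValueError("no named entity known for %r" % ch)
    return "".join(out)


print(to_entity_text("[P < 0.01]"))
print(to_entity_text("Fran\u00e7ois & Co."))
```

Raising an error for characters without a known entity name is a deliberate choice here, so that unsupported characters are noticed rather than silently dropped.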
XML File Examples
Standard File Example - A typical file submitted to PubMed.
Ahead of Print Example - A file sent "Ahead of Print", and the
Replacement File that follows.
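For orientation, the following Python sketch assembles a small file using the tags described in The XML Tags section. It is not one of the NLM example files. The nesting (PubDate inside Journal, Affiliation inside Author, and so on) is inferred from the tag descriptions, and every value, including the publisher, journal abbreviation, ISSN, DOI, and e-mail address, is a made-up placeholder. The helper names add, build_article_set, and serialize are hypothetical. The result should still be checked with the Citation File Validator.

```python
import xml.etree.ElementTree as ET

DOCTYPE = ('<!DOCTYPE ArticleSet PUBLIC "-//NLM//DTD PubMed 2.0//EN" '
           '"http://www.ncbi.nlm.nih.gov:80/entrez/query/static/PubMed.dtd">')


def add(parent, tag, text=None, **attrs):
    """Append a child element with optional text and attributes."""
    child = ET.SubElement(parent, tag, attrs)
    child.text = text
    return child


def build_article_set():
    """Build one ArticleSet holding a single print article (placeholder data)."""
    root = ET.Element("ArticleSet")
    article = add(root, "Article")

    journal = add(article, "Journal")
    add(journal, "PublisherName", "Example Press")
    add(journal, "JournalTitle", "J Example Res")
    add(journal, "Issn", "0000-0000")
    add(journal, "Volume", "12 Suppl 2")
    add(journal, "Issue", "6 Pt 2")
    pub_date = add(journal, "PubDate", PubStatus="ppublish")
    add(pub_date, "Year", "2008")
    add(pub_date, "Month", "Feb")
    add(pub_date, "Day", "14")

    add(article, "ArticleTitle", "A placeholder article title")
    add(article, "FirstPage", "101")
    add(article, "LastPage", "110")
    add(article, "Language", "EN")

    author = add(add(article, "AuthorList"), "Author")
    add(author, "FirstName", "Jane")
    add(author, "LastName", "Doe")
    add(author, "Affiliation",
        "Department of Examples, Example University, Springfield, "
        "IL 00000, USA. jdoe@example.org")

    ids = add(article, "ArticleIdList")
    add(ids, "ArticleId", "10.0000/example.2008.101", IdType="doi")
    add(article, "Abstract", "OBJECTIVE: To illustrate the tag layout.")
    return root


def serialize(root):
    """Return the file header followed by the serialized element tree."""
    return DOCTYPE + "\n" + ET.tostring(root, encoding="unicode")


print(serialize(build_article_set()))
```

ElementTree escapes &, < and > in element text automatically, so text passed to add should be raw rather than pre-escaped.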
Subset of Language Codes
The following is a subset of the ISO 639 standard
for language codes. NOTE: NLM requires transliteration of Cyrillic
letters as outlined here.
| CODE | LANGUAGE |
| EN | English |
| AF | Afrikaans |
| SQ | Albanian |
| AM | Amharic |
| AR | Arabic |
| AZ | Azerbaijani |
| HY | Armenian |
| BN | Bengali |
| BS | Bosnian |
| BG | Bulgarian |
| CA | Catalan |
| ZH | Chinese |
| HR | Croatian |
| CS | Czech |
| DA | Danish |
| NL | Dutch |
| EO | Esperanto |
| ET | Estonian |
| FI | Finnish |
| FR | French |
| GD | Scottish Gaelic |
| KA | Georgian |
| DE | German |
| EL | Greek, Modern |
| HE | Hebrew |
| HU | Hungarian |
| HI | Hindi |
| IS | Icelandic |
| ID | Indonesian |
| IT | Italian |
| JA | Japanese |
| RW | Kinyarwanda |
| KO | Korean |
| LA | Latin |
| LV | Latvian |
| LT | Lithuanian |
| MK | Macedonian |
| ML | Malayalam |
| MI | Maori |
| MS | Malay |
| MU | Multilingual |
| NO | Norwegian |
| FA | Persian |
| PL | Polish |
| PT | Portuguese |
| PS | Pushto |
| RO | Romanian |
| RU | Russian |
| SA | Sanskrit |
| SR | Serbo-Croatian, Cyrillic |
| SR | Serbo-Croatian, Roman |
| SK | Slovak |
| SL | Slovene |
| ES | Spanish |
| SV | Swedish |
| TH | Thai |
| TR | Turkish |
| UK | Ukrainian |
| UR | Urdu |
| VI | Vietnamese |
| CY | Welsh |
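The ordering rule given under the Language tag (list the languages in the order the texts appear in the journal, but put EN first when English is one of them) can be sketched as follows. The function name language_tags is hypothetical, and upper-casing the codes is an assumption based on the codes shown in the table above.

```python
def language_tags(codes_in_print_order):
    """Return <Language> tags in submission order for one article."""
    ordered = []
    for code in codes_in_print_order:
        code = code.strip().upper()  # assumption: codes are submitted in upper case
        if code and code not in ordered:
            ordered.append(code)
    # English goes first whenever it is one of the article's languages.
    if "EN" in ordered:
        ordered.remove("EN")
        ordered.insert(0, "EN")
    return ["<Language>%s</Language>" % code for code in ordered]


for tag in language_tags(["fr", "en", "de"]):
    print(tag)
```

An empty input yields no tags, which matches the rule that EN is assumed when the language is unspecified.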