Open Access Subset

The PMC Open Access (OA) Subset is a part of the total collection of articles in PMC. The articles in the OA Subset are made available under a Creative Commons or similar license that generally allows more liberal redistribution and reuse than a traditional copyrighted work.

Open Access Subset FTP Clean Up

Starting March 18, 2019, PMC will no longer provide bulk packages of Open Access (OA) Subset text and XML in the top-level directory of the FTP Service. These files have been superseded by the Commercial Use and Non-Commercial Use bulk packages located in the oa_bulk subdirectory. Read the complete announcement.

Please note the following:

  • The license terms are not identical for all of the articles in this subset. Please refer to the license statement in each article for specific terms of use.
  • The majority of the articles in PMC are subject to traditional copyright restrictions and are not part of this subset.
  • Users are directly and solely responsible for compliance with copyright restrictions and are expected to adhere to the terms and conditions defined by the copyright holder (see the PMC Copyright Notice).
  • Some licenses restrict commercial use of open access content. If you are accessing the OA Subset for commercial purposes, please limit your use to the Commercial Use Collection.
  • Some journals use the label "open access" for an article that is available free at time of publication, but is still subject to traditional copyright restrictions. Such articles are not part of this subset.
  • The PMC OAI service and the PMC FTP service are the only services that may be used for automated downloading of articles from the OA Subset. Systematic retrieval (bulk downloading) of articles through any other automated process is prohibited.

Identifying Articles to Download

Use the OA Web Service to discover downloadable resources from the OA Subset and identify their licenses. For example, this API can be used to find PDFs of all articles that have been updated since a specified date.
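As a rough illustration of the date-based lookup described above, the sketch below builds a query URL and pulls the PDF links out of a response. The endpoint URL, the "from" and "format" parameter names, and the XML layout of the sample response are assumptions that are not stated on this page; check the OA Web Service documentation before relying on them. The sample response is hypothetical and no network request is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed endpoint; it is not given on this page.
OA_SERVICE_URL = "https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi"


def build_query_url(since, fmt="pdf"):
    """Return a query URL for records updated on or after `since` (YYYY-MM-DD)."""
    return OA_SERVICE_URL + "?" + urlencode({"from": since, "format": fmt})


def extract_links(xml_text, fmt="pdf"):
    """Return (article id, license, href) for each link of the given format."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iter("record"):
        for link in record.iter("link"):
            if link.get("format") == fmt:
                results.append(
                    (record.get("id"), record.get("license"), link.get("href"))
                )
    return results


# Hypothetical response, shaped the way this sketch assumes the service replies.
SAMPLE_RESPONSE = """<OA>
  <records>
    <record id="PMC0000001" license="CC BY">
      <link format="pdf" href="ftp://example.invalid/oa_pdf/aa/bb/example.PMC0000001.pdf"/>
      <link format="tgz" href="ftp://example.invalid/oa_package/aa/bb/PMC0000001.tar.gz"/>
    </record>
  </records>
</OA>"""

print(build_query_url("2019-01-01"))
for article_id, license_name, href in extract_links(SAMPLE_RESPONSE):
    print(article_id, license_name, href)
```

Keeping the license next to each link makes it easy to apply the commercial and non-commercial rules described later on this page before downloading anything.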

Use one of the six index files provided as part of the FTP Service to assist with locating an open access article on the FTP site. Search these index files for either a PMC accession number (PMCID) or a PubMed ID (PMID). The matching entry will point you to the specific FTP directory and file name for the article. You may also search these index files for a license to locate all articles with a particular license.
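The index lookup described above might be sketched as follows for a CSV index. The column names ("File", "Accession ID", "PMID", "License") and the sample rows are assumptions made for illustration; inspect the header of the index file you actually download and adjust the names to match.

```python
import csv
import io

# Hypothetical index excerpt; the column names are an assumption of this sketch.
SAMPLE_INDEX = """File,Article Citation,Accession ID,Last Updated,PMID,License
oa_package/aa/bb/PMC0000001.tar.gz,Example J. 2018; 1:1,PMC0000001,2018-12-01 10:00:00,10000001,CC BY
oa_package/cc/dd/PMC0000002.tar.gz,Example J. 2018; 1:2,PMC0000002,2018-12-02 10:00:00,10000002,CC BY-NC
"""


def find_entries(index_file, pmcid=None, pmid=None, license_name=None):
    """Return the index rows that match every criterion that was supplied."""
    matches = []
    for row in csv.DictReader(index_file):
        if pmcid is not None and row["Accession ID"] != pmcid:
            continue
        if pmid is not None and row["PMID"] != str(pmid):
            continue
        if license_name is not None and row["License"] != license_name:
            continue
        matches.append(row)
    return matches


# Look up one article by PMCID and print where its package lives.
for entry in find_entries(io.StringIO(SAMPLE_INDEX), pmcid="PMC0000002"):
    print(entry["File"], entry["License"])
```

For a real index, pass an open file object (for example `open("oa_file_list.csv", newline="")`) in place of the in-memory sample.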

Non-Commercial Use Collection

If you intend to use articles from the PMC OA Subset only for non-commercial purposes, these are your options:

FTP Service:

  • Complete Files: Download complete files for any article from the “oa_package” directory. Use the index files named “oa_file_list.txt” or “oa_file_list.csv” to find the specific directory location for an article or to filter articles by license.
  • PDFs: Download PDFs directly from the “oa_pdf” directory. The files “oa_non_comm_use_pdf.csv” and “oa_non_comm_use_pdf.txt” are indices to the contents of this directory in .csv and .txt formats.
  • Bulk: Download any bulk XML/txt article package from the "oa_bulk" directory.

    Note:

    You will need to download both “non_comm_use.*.tar.gz” and “comm_use.*.tar.gz” to access the complete OA subset that is available for non-commercial purposes.

Learn more about the FTP Service.

OAI-PMH Service:

Commercial Use Collection

Within the OA Subset, there is a Commercial Use Collection that includes only OA Subset articles that have a machine-readable “CC BY” (Creative Commons Attribution Only) or “CC0” (Creative Commons public domain) license.

If you intend to use articles from the PMC OA Subset for commercial purposes, your options include:

FTP Service:

  • Complete Files: From the “oa_package” directory, download complete files for only the articles listed in “oa_comm_use_file_list.*”.
  • PDFs: If you want only article PDFs, you must still download the complete article package. Do not download freestanding PDFs from the “oa_pdf” directory and do not use “oa_non_comm_use_pdf.*” to try to identify what files are available to you.
  • Bulk: Download only those bulk XML/txt article packages that are available for commercial use, i.e., files named “comm_use.*.tar.gz”.

Note:

If you are accessing the content for commercial purposes, you should not download any of the “non_comm_use.*.tar.gz” bulk packages or any individual article packages for articles that are not included in “oa_comm_use_file_list.*”
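The bulk-package and article-package rules for the two collections can be summarized in a short sketch. The file name patterns come from this page; the directory listing, the package paths, and the function names are hypothetical and serve only to illustrate the selection logic.

```python
from fnmatch import fnmatchcase

# Patterns quoted on this page.
COMMERCIAL_PATTERN = "comm_use.*.tar.gz"
NON_COMMERCIAL_PATTERN = "non_comm_use.*.tar.gz"


def select_bulk_packages(filenames, commercial):
    """Pick the bulk packages that may be downloaded for the given purpose.

    Commercial users take only comm_use.* packages. Non-commercial users
    need both sets to get the complete OA Subset.
    """
    if commercial:
        patterns = [COMMERCIAL_PATTERN]
    else:
        patterns = [COMMERCIAL_PATTERN, NON_COMMERCIAL_PATTERN]
    return [
        name
        for name in filenames
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


def allowed_article_packages(package_paths, comm_use_paths):
    """Keep only the article packages that appear in the commercial-use index."""
    allowed = set(comm_use_paths)
    return [path for path in package_paths if path in allowed]


# Hypothetical listing of the "oa_bulk" directory.
listing = ["comm_use.A-B.xml.tar.gz", "non_comm_use.A-B.xml.tar.gz", "README.txt"]
print(select_bulk_packages(listing, commercial=True))
print(select_bulk_packages(listing, commercial=False))
```

`fnmatchcase` matches the whole file name, so a name beginning with "non_comm_use." does not slip through the "comm_use.*.tar.gz" pattern.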

Learn more about the FTP Service.

OAI-PMH Service:

How to Search for Articles by Creative Commons License

Search filters are available in PMC and PubMed for finding articles in the OA Subset with specific Creative Commons (CC) licenses. For descriptions of these licenses, please see the Creative Commons site, About the Licenses. Please note that not all articles in the OA Subset have a CC license.

License type                                             | Filter in PMC       | Filter in PubMed
---------------------------------------------------------+---------------------+------------------------
Any CC license                                           | cc license          | pmc cc license
CC BY (Attribution)                                      | cc by license       | pmc cc by license
CC BY-ND (Attribution, no derivatives)                   | cc by-nd license    | pmc cc by-nd license
CC BY-NC (Attribution, noncommercial)                    | cc by-nc license    | pmc cc by-nc license
CC BY-NC-ND (Attribution, noncommercial, no derivatives) | cc by-nc-nd license | pmc cc by-nc-nd license
CC BY-NC-SA (Attribution, noncommercial, share-alike)    | cc by-nc-sa license | pmc cc by-nc-sa license
CC BY-SA (Attribution, share-alike)                      | cc by-sa license    | pmc cc by-sa license
CC0 (Public domain)                                      | cc0 license         | pmc cc0 license

These filters are based on license information, which is provided to PMC by publishers and other content providers, as encoded by the machine-readable identifiers in the source XML of each journal article. Please note that, in some cases, there are discrepancies between these machine-readable identifiers and the actual text of the license statements. In February 2013, PMC instituted new rules to help ensure consistency of the tagging of the licenses, which apply to all newly received content.
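The filter names follow a regular pattern: each PubMed filter is the corresponding PMC filter prefixed with "pmc". A small helper along these lines can produce the right name for either database. The function name and the dictionary keys are hypothetical, and the sketch returns only the filter name; how the name is entered in a search (for example, with a field tag) is not covered here.

```python
# PMC filter names as listed in the table above.
PMC_FILTERS = {
    "any": "cc license",
    "CC BY": "cc by license",
    "CC BY-ND": "cc by-nd license",
    "CC BY-NC": "cc by-nc license",
    "CC BY-NC-ND": "cc by-nc-nd license",
    "CC BY-NC-SA": "cc by-nc-sa license",
    "CC BY-SA": "cc by-sa license",
    "CC0": "cc0 license",
}


def license_filter(license_type, database="pmc"):
    """Return the search filter name for a license type in PMC or PubMed."""
    term = PMC_FILTERS[license_type]
    if database == "pmc":
        return term
    if database == "pubmed":
        # PubMed filter names are the PMC names prefixed with "pmc".
        return "pmc " + term
    raise ValueError("database must be 'pmc' or 'pubmed'")


print(license_filter("CC BY-NC"))
print(license_filter("CC0", database="pubmed"))
```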


Last updated: Fri, 4 Jan 2019