PMC Open Access Subset

The PMC Open Access Subset includes more than 3.4 million journal articles and preprints that are made available under license terms that allow reuse. Not all articles in PMC are available for text mining and other reuse; many are protected by copyright. Articles in the PMC Open Access Subset, however, are made available under Creative Commons or similar licenses that generally allow more liberal redistribution and reuse than a traditional copyrighted work. The PMC Open Access Subset is one part of the PMC Article Datasets.

Files for the PMC Open Access Subset are available for automated retrieval in several formats:

  • Individual article packages on the PMC FTP Service include the full text and metadata files in XML, PDF, and plain text, as well as images and supplementary materials
  • Bulk packages on the PMC FTP Service include XML or plain text files for hundreds of thousands of articles per package
  • Individual XML or plain text files are available for retrieval in a number of ways, including the PMC FTP Service, the cloud, the PMC OAI Service, E-Utilities, and the BioC API

Please note:

  • The AWS RODA, PMC OAI-PMH service, the PMC FTP service, NCBI E-Utilities and BioC API are the only services that may be used for automated retrieval of PMC content. Systematic retrieval (or bulk retrieval) of articles through any other automated process is prohibited.
  • License terms vary. Please refer to the license statement in each article for specific terms of use.
  • Users of this dataset are directly and solely responsible for compliance with copyright restrictions and are expected to adhere to the terms and conditions defined by the copyright holder (see the PMC Copyright Notice).
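
As an illustration of using one of the permitted services listed above, the following is a minimal sketch of building an NCBI E-Utilities EFetch request URL for a single PMC article. It assumes the standard EFetch endpoint and the `pmc` database name. The helper name `build_efetch_url` and the article ID in the usage line are hypothetical placeholders, not taken from this page. The function only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# Assumption: the standard NCBI E-Utilities EFetch endpoint.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(pmcid: str) -> str:
    """Return an EFetch URL for one PMC article (hypothetical helper).

    Accepts an ID with or without the "PMC" prefix and passes the
    numeric part to the `pmc` database.
    """
    identifier = pmcid.strip()
    if identifier.upper().startswith("PMC"):
        identifier = identifier[3:]
    if not identifier.isdigit():
        raise ValueError(f"not a PMC identifier: {pmcid!r}")
    query = urlencode({"db": "pmc", "id": identifier})
    return f"{EFETCH_BASE}?{query}"


# Placeholder ID for illustration only.
url = build_efetch_url("PMC1234567")
```

The license statement in each retrieved article still governs what may be done with its content.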

Within the PMC Open Access Subset, there are three groupings:

  • Commercial Use Allowed - CC0, CC BY, CC BY-SA, CC BY-ND licenses
  • Non-Commercial Use Only - CC BY-NC, CC BY-NC-SA, CC BY-NC-ND licenses
  • Other - no machine-readable Creative Commons license, no license, or a custom license.

To access the complete OA Subset, you should retrieve all of these groupings. Details about the files and directory structure are available on the FTP Service page and the PMC Article Datasets on AWS page.
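
The three groupings above can be expressed as a small lookup. This is a minimal sketch, assuming license codes arrive as short strings such as "CC BY-NC". The function name `oa_grouping` and the input format are illustrative assumptions and do not reflect how the PMC files actually encode licenses.

```python
# License codes for each grouping, as listed on this page.
COMMERCIAL_USE_ALLOWED = {"CC0", "CC BY", "CC BY-SA", "CC BY-ND"}
NON_COMMERCIAL_USE_ONLY = {"CC BY-NC", "CC BY-NC-SA", "CC BY-NC-ND"}


def oa_grouping(license_code):
    """Map a license code to its OA Subset grouping (hypothetical helper).

    Anything that is missing or not a recognized Creative Commons code
    falls into "Other", matching the third grouping above.
    """
    if not license_code:
        return "Other"
    # Normalize case and whitespace so "cc  by-nc" matches "CC BY-NC".
    normalized = " ".join(license_code.upper().split())
    if normalized in COMMERCIAL_USE_ALLOWED:
        return "Commercial Use Allowed"
    if normalized in NON_COMMERCIAL_USE_ONLY:
        return "Non-Commercial Use Only"
    return "Other"


groupings = [oa_grouping(code) for code in ("CC BY", "cc by-nc-nd", None)]
```

A lookup like this is only a convenience for sorting articles; the license statement in each article remains the authoritative source for its terms of use.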

Find all Open Access Subset articles in:

Learn about additional search filters that restrict results to certain license types.


The PMC OA Subset articles are available for retrieval via the AWS RODA, the PMC FTP service, NCBI E-Utilities, the PMC OAI-PMH service, and the BioC API.


Last updated: Mon, 29 Nov 2021