Measuring the drafting alignment of patent documents using text mining

PLoS One. 2020 Jul 10;15(7):e0234618. doi: 10.1371/journal.pone.0234618. eCollection 2020.

Abstract

How would an inventor, entrepreneur, investor, or patent examiner quantify the extent to which the inventive claims listed in a patent document align with the patent specification? Since a specification that is poorly aligned with the inventive claims can render an invention unpatentable and can invalidate an already issued patent, an effective measure of alignment is necessary. We define a novel measure of drafting alignment using Latent Dirichlet Allocation (LDA). The measure is defined for each patent document by first identifying the latent topics underlying the claims and the specification, and then using the Hellinger distance to measure the proximity between their topical coverages. We demonstrate the use of the novel measure on data processing patent documents related to cybersecurity. The properties of the proposed measure are further investigated using exploratory data analysis, and it is shown that alignment is generally positively associated with prior patenting efforts as well as with the tendency to include figures in a document.
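The distance step described in the abstract can be illustrated with a minimal sketch. It assumes the claims and the specification have each already been reduced to a probability distribution over the same set of LDA topics; the function name and the topic proportions below are hypothetical and are not taken from the paper. It uses the standard Hellinger distance for discrete distributions, which lies between 0 (identical coverage) and 1 (no shared topics).

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete probability distributions.

    p and q are sequences of non-negative topic proportions that each
    sum to 1 and are defined over the same topics, in the same order.
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same number of topics")
    squared_diff = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared_diff / 2.0)


# Hypothetical topic proportions for one patent (illustrative values only,
# not data from the study).
claims_topics = [0.70, 0.20, 0.10]
spec_topics = [0.60, 0.25, 0.15]

distance = hellinger_distance(claims_topics, spec_topics)
print(f"Hellinger distance: {distance:.4f}")
```

A smaller distance indicates that the claims and the specification cover the latent topics in more similar proportions. How the paper converts this distance into its final alignment score is not stated in the abstract, so that step is left out here.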

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Mining / methods*
  • Humans
  • Inventions / trends*
  • Patents as Topic

Grants and funding

D.K. was awarded two mini grants (000056 and 000218) by the Babson Faculty Research Fund (https://www.babson.edu/academics/teaching-and-research/babson-faculty-research-fund/) during the completion of this work. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.