Molecular Epidemiology Surveillance of SARS-CoV-2: Mutations and Genetic Diversity One Year after Emerging

Pathogens. 2021 Feb 9;10(2):184. doi: 10.3390/pathogens10020184.

Abstract

In December 2019, the first cases of the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) were identified in the city of Wuhan, China. Since then, the virus has spread worldwide, and new mutations continue to be reported. The aim of the present study was to monitor changes in genetic diversity and to track non-synonymous substitutions (dN) that could be implicated in the fitness of SARS-CoV-2 and its spread in different regions between December 2019 and November 2020. We analyzed 2213 complete genomes from six geographical regions worldwide, downloaded from the GenBank and GISAID databases. Although SARS-CoV-2 showed low genetic diversity, diversity increased over time, and several mutation hotspots were present throughout the genome. We identified seven frequent mutations that resulted in dN substitutions. Two of them, C14408T>P323L and A23403G>D614G, located in nsp12 and the Spike protein, respectively, emerged early in the pandemic and showed a considerable increase in frequency over time. Two other mutations, A1163T>I120F in nsp2 and G22992A>S477N in the Spike protein, emerged more recently and have spread in Oceania and Europe. The P323L, D614G, R203K and G204R substitutions were associated with disease severity. Continuous molecular surveillance of SARS-CoV-2 will be necessary to detect new variants of the virus with clinical relevance and to describe their transmission dynamics. This information is important for improving programs to control the virus.
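
As an illustration of what tracking a mutation's frequency over time can look like, the following Python sketch counts how often a given nucleotide appears at one genome position in each collection month. It is not the authors' pipeline. The function name, the input layout (month label plus sequence) and the toy sequences are hypothetical, and the sketch assumes the sequences are already aligned to a common reference so that a position such as 23403 means the same site in every genome.

```python
from collections import defaultdict


def allele_frequency_by_month(records, position, alt_base):
    """Return {month: frequency of alt_base at position}.

    records  : iterable of (month, sequence) pairs, month as 'YYYY-MM'
               (hypothetical input layout, assumed pre-aligned)
    position : 1-based site in reference coordinates
    alt_base : the mutant nucleotide, e.g. 'G' for A23403G
    """
    counts = defaultdict(lambda: [0, 0])  # month -> [alt count, called count]
    for month, seq in records:
        base = seq[position - 1].upper()
        if base not in "ACGT":
            continue  # skip gaps and ambiguous calls such as 'N' or '-'
        counts[month][1] += 1
        if base == alt_base:
            counts[month][0] += 1
    return {m: alt / total for m, (alt, total) in sorted(counts.items())}


# Toy data, made up for illustration only: five 5-nt "genomes", site 3.
toy_records = [
    ("2020-01", "ACATG"),
    ("2020-01", "ACATG"),
    ("2020-03", "ACGTG"),
    ("2020-03", "ACATG"),
    ("2020-03", "ACNTG"),  # ambiguous call, excluded from the denominator
]
print(allele_frequency_by_month(toy_records, 3, "G"))
# {'2020-01': 0.0, '2020-03': 0.5}
```

With real data, the same call would use position 23403 and alternative base "G" to follow A23403G (D614G). Ambiguous calls are excluded from the denominator so that low-quality genomes do not dilute the estimate.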

Keywords: SARS-CoV-2; genetic diversity; molecular surveillance; natural selection; non-synonymous substitution.