Analysis of scientific production on the new coronavirus (COVID-19): a bibliometric analysis

ABSTRACT BACKGROUND: The pandemic of the new coronavirus has culminated in a scientific race to seek knowledge about this virus and its treatments, vaccines and preventive strategies, in order to reduce its impact on healthcare and economics worldwide. Hence, it is important to recognize the efforts of researchers who are at the forefront of investigations relating to the new coronavirus. OBJECTIVE: The present study was carried out with the aim of analyzing the world scientific production relating to COVID-19. DESIGN AND SETTING: Exploratory and descriptive bibliometric study conducted in the city of Teresina (PI), Brazil. METHOD: ISI Web of Knowledge/Web of Science (WOS) was chosen as the database. Data-gathering was carried out in May 2020. The data analysis was performed using the HistCite™ software, version 9.8.24, and the VOSviewer bibliometric analysis software, version 1.6.8. RESULTS: 2,625 published papers that included descriptors within the scope of this investigation were identified. These articles were published in 859 different journals that are indexed in WOS, by 9,791 authors who were linked to 3,365 research institutions, located in 105 countries. CONCLUSION: Ascertaining scientific production through a bibliometric analysis is important in order to guide researchers on what has already been produced and what is being researched, so as to be able to address gaps in knowledge through future research.


INTRODUCTION
On December 31, 2019, the World Health Organization (WHO) reported the first outbreak of pneumonia in Wuhan City, Hubei Province, China. It was discovered shortly afterwards that this pneumonia was due to a new coronavirus, with genetic characteristics, mode of infection and hosts distinct from the other coronaviruses that were already known. It was given the scientific name of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and the disease that it causes was named COVID-19 (coronavirus disease 2019). 1,2 At the end of January 2020, WHO declared the outbreak a public health emergency of international concern, and it was subsequently characterized as a pandemic. 3 According to information from the Pan American Health Organization, the pandemic had affected more than 180 countries as of June 30, 2020, with confirmation of 10,185,374 cases of COVID-19 worldwide and 503,862 deaths from it.
Since the beginning of the pandemic, governments and scientists have been working on solutions to prevent rapid spread of the virus and contagion. Healthcare organizations have coordinated a series of protocols and guidelines aimed at improving rapid circulation of information about the pathology and possible treatment protocols and thereby mitigating the impact of the disease. However, despite the research carried out, the transmission mechanisms and clinical spectrum of the disease are still not fully understood, and there is still a lack of treatments and vaccines to control COVID-19. 2,4,5 In view of this problem, it is essential that scientific production of studies on COVID-19 should be analyzed and expanded. Bibliometric studies have the aim of investigating the collaborative and scientific production network on a research topic, which in this case is on the new coronavirus. This knowledge facilitates recognition of researchers who produce and publish the most on the topic.

OBJECTIVE
The questions that guided this study were the following: Which information sources are of value regarding COVID-19, through the metrics of authorship and citation? What analysis has been done on the indicators of the dynamics and evolution of scientific and technological information about COVID-19? Thus, in the light of these questions, the objective of this study was to analyze the worldwide scientific production relating to COVID-19.

Research design
This was an exploratory and descriptive bibliometric study with a quantitative approach that was conducted through defining a database for consultation and the criteria to be used in data-gathering and data representation and analysis. 6

Selection criteria
No refinement filters relating to fields of knowledge, countries or languages of the studies were used. All records of published studies in which the scope of the study included descriptors relating to the research topic were covered.

Data-gathering
The steps followed three procedures: defining the database to be consulted; determining the criteria to be used for data-gathering; and defining the representation and analysis of the data gathered. [...] indicate the exact representation of terms with more than one word. Data-gathering was carried out by searching for these terms in article titles, abstracts, authors' keywords and created keywords. In this manner, 2,625 articles were identified, and these were used as the set of articles for the proposed bibliometric analyses.
It should be noted that all articles found were selected for this investigation, since the focus of the study was to ascertain all the production that had occurred up to the end of the data-gathering period.

Data processing and analysis
The data gathered were then analyzed by exporting these data to the HistCite™ software, version 9. In addition to these data generated through the software, aspects of the ten articles most cited across the entire WOS were elucidated in order to identify their main contributions to the topic of COVID-19. In addition, an analysis on indicators of the dynamics and evolution of scientific and technological information on this topic was carried out.
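The ranking step described above can be illustrated with a minimal Python sketch. It is not the HistCite procedure itself: the `Record` type, its field names and the toy records are hypothetical and serve only to show how a set of exported records might be ranked by citation count and tallied by journal.

```python
from collections import Counter
from typing import NamedTuple


class Record(NamedTuple):
    """Hypothetical, simplified view of one exported bibliographic record."""
    title: str
    journal: str
    citations: int


def most_cited(records, n=10):
    """Return the n records with the highest citation counts."""
    return sorted(records, key=lambda r: r.citations, reverse=True)[:n]


def papers_per_journal(records):
    """Count how many records were published in each journal."""
    return Counter(r.journal for r in records)


# Toy data for illustration only; these are not records from the study.
toy_records = [
    Record("Paper A", "Journal X", 12),
    Record("Paper B", "Journal Y", 40),
    Record("Paper C", "Journal X", 7),
]

top_two = most_cited(toy_records, n=2)
journal_counts = papers_per_journal(toy_records)
```

In this sketch, `most_cited(records, n=10)` would correspond to selecting the ten most cited articles, and the length of `papers_per_journal(records)` would correspond to the number of distinct journals.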
The VOSviewer software, version 1.6.8, was used to analyze co-occurrence networks between keywords. VOSviewer (Visualization of Similarities Viewer) is part of a free software suite for bibliometric analysis and visualization. It was developed by Van Eck and Waltman and is available at: www.vosviewer.com. In analyzing these co-occurrence networks, it was possible to map out possible research topics relating to COVID-19. The sizes of the nodes in the networks indicated the frequency of occurrence of keywords, and the closer two nodes were to each other, the stronger the relationship between them was.
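The counting that underlies a keyword network of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not VOSviewer's algorithm: it only counts raw occurrences and co-occurrences, whereas VOSviewer additionally normalizes link strengths and computes a spatial layout. The function name `cooccurrence`, the parameter `min_occurrences` and the toy keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(records, min_occurrences=10):
    """Count keyword frequencies and pairwise co-occurrences.

    records: iterable of keyword lists, one list per document.
    Returns (nodes, edges), restricted to keywords that appear in at
    least `min_occurrences` documents.
    """
    # Normalize case and whitespace; de-duplicate keywords within a document.
    docs = [
        sorted({kw.strip().lower() for kw in kws if kw.strip()})
        for kws in records
    ]

    # Node size: number of documents in which a keyword appears.
    freq = Counter(kw for doc in docs for kw in doc)
    kept = {kw for kw, count in freq.items() if count >= min_occurrences}

    # Edge weight: number of documents in which both keywords appear.
    edges = Counter()
    for doc in docs:
        retained = [kw for kw in doc if kw in kept]
        for a, b in combinations(retained, 2):
            edges[(a, b)] += 1

    nodes = {kw: freq[kw] for kw in kept}
    return nodes, edges


# Toy data for illustration only; these are not keywords from the study.
toy_keywords = [
    ["COVID-19", "pneumonia"],
    ["covid-19", "SARS-CoV-2", "pneumonia"],
    ["COVID-19", "SARS-CoV-2"],
    ["epidemiology"],
]
nodes, edges = cooccurrence(toy_keywords, min_occurrences=2)
```

Here the values in `nodes` play the role of node sizes and the values in `edges` play the role of link strengths. Raising `min_occurrences` prunes rare keywords in the same way that an occurrence threshold simplifies a network for visualization.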

Ethical aspects
Since this was a bibliometric study, it was not necessary to sub- [...]

The ten most cited authors relating to the topic of COVID-19 are shown in Table 3 according to author, title, journal and number of citations, which ranged from 74 to 623. 8-17 Figure 1 shows the keyword co-occurrence networks for the 2,625 documents in the sample. To facilitate visualization, construction of the network was restricted to keywords with ten or more occurrences, which resulted in 49 nodes that were organized into six clusters, distinguished by color: blue, red, green, lilac, yellow and turquoise. These were the words that most frequently determined the central theme of a body of documents.

DISCUSSION
The COVID-19 pandemic has led to publication of a large number of scientific studies on this subject, conducted around the world. The global dimensions of the direct and indirect effects of the coronavirus have required quick responses, which have placed scientific production and dissemination at the center of attention. Thus, the bibliometric analysis carried out in this study has enabled characterization of the researchers working on this topic during the pandemic. 8-17 China has contributed the largest number of published scientific papers during the COVID-19 pandemic, according to the analysis carried out here (Table 1). This can be explained by the fact that China is home to more than 3.61 million licensed doctors, and that this country was the cradle of the current pandemic. 17 The United States is second in the ranking. This can be [...]

This study was carried out shortly after the discovery of the virus and, thus, the authors showed that gaps existed with regard to knowledge of the origin, epidemiology, duration of human transmission and clinical spectrum of the disease.
Another factor observed was the co-occurrence relationships between pairs of keywords, which were determined from the number of articles in the database in which the two keywords occurred together, whether in the title, in the abstract or in the list of keywords. 23,24 In analyzing these networks, it was possible to map out possible research topics on COVID-19. The size of a node indicated the frequency of occurrence of a keyword, and the closer together two nodes were, the stronger their relationship was.

However, the present bibliometric study has limitations. Only a single database was used, i.e. Web of Science™. Although this is a referential platform for scientific citations that was designed to support scientific and academic research with wide coverage in the fields of science and social sciences, it may be necessary to deepen the search using other scientific databases, through further studies. The high number of studies indexed in this database every day made it impossible to analyze them daily, and this can also be cited as a limitation. This led the present researchers to delimit a period within which to obtain data, in order to be able to proceed with their investigation and discussion. Thus, some information may have been lost in this process and the data gathered in the present study may not reflect the current state of the literature.

CONCLUSION
The sources of value regarding COVID-19, which were recog-