Performance of Rapid Antigen Tests for COVID-19 Diagnosis: A Systematic Review and Meta-Analysis

The identification of viral RNA using reverse transcription quantitative polymerase chain reaction (RT-qPCR) is the gold standard for identifying an infection caused by SARS-CoV-2. The limitations of RT-qPCR, such as the requirement for expensive instruments, trained staff and laboratory facilities, led to the development of rapid antigen tests (RATs). The performance of RATs has been widely evaluated and found to vary across settings. The present systematic review aims to evaluate the pooled sensitivity and specificity of commercially available RATs. This review was registered on PROSPERO (registration number: CRD42021278105). A literature search was performed in PubMed, Embase and the Cochrane COVID-19 Study Register for studies published up to 26 August 2021. The overall pooled sensitivity and specificity of RATs were calculated, together with subgroup analyses. Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) was used to assess the risk of bias in each study. The overall pooled sensitivity and specificity of RATs were 70% (95% CI: 69–71) and 98% (95% CI: 98–98), respectively. In subgroup analyses, nasal swabs showed the highest sensitivity of 83% (95% CI: 80–86), followed by nasopharyngeal swabs at 71% (95% CI: 70–72), throat swabs at 69% (95% CI: 63–75) and saliva at 68% (95% CI: 59–77). Samples from symptomatic patients showed a higher sensitivity of 82% (95% CI: 82–82) compared with asymptomatic patients at 68% (95% CI: 65–71), while samples with a cycle threshold (Ct) value ≤25 showed a higher sensitivity of 96% (95% CI: 95–97) compared with samples with higher Ct values. Although the sensitivity of RATs needs to be improved, they may still be a viable diagnostic option in the early phase of disease in places where laboratory facilities are lacking.


Introduction
Coronavirus disease 2019 (COVID-19), an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has evolved into a global pandemic and is still a major health concern around the world. The initial outbreak of SARS-CoV-2 started in Wuhan, China, in December 2019 [1,2]. As of 17 August 2021, SARS-CoV-2 had infected over 211 million people and killed over 4.4 million people worldwide. Those infected with SARS-CoV-2 experience a variety of symptoms, including fever, cough, exhaustion, shortness of breath, headache, sore throat, and loss of smell and taste [3][4][5]. Symptoms normally develop after 2 days to 2 weeks and last up to 3 weeks or longer in patients with mild to severe COVID-19 [1]. SARS-CoV-2 infection can be divided into five stages: asymptomatic, mild, moderate, severe and critical.

Search Strategy
Peer-reviewed articles were searched on 26 August 2021 without restriction on publication date. The search covered PubMed, Embase and the Cochrane COVID-19 Study Register, using several keywords with the following search string:

Data Analysis
Three authors (M.F.K., A.J.N.J. and M.F.S.) independently extracted the number of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN) from each study and entered them into an Excel datasheet. Discrepant findings were discussed, and when in doubt, the authors sought verification. The sensitivity and specificity of each of the studied antigen tests were calculated using RT-qPCR as the reference standard. The number of true positive outcomes was divided by the sum of true positive and false negative outcomes to calculate the sensitivity (1). The number of true negative outcomes was divided by the sum of true negative and false positive outcomes to calculate the specificity (2). Forest plots were used to demonstrate the comparative performance of the rapid antigen tests.
$$\text{Sensitivity} = \frac{TP}{TP + FN} \quad (1)$$
$$\text{Specificity} = \frac{TN}{TN + FP} \quad (2)$$
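The two definitions (1) and (2) can be illustrated with a minimal Python sketch. The class name, function names and the example counts are hypothetical and chosen only for illustration; they are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 counts for one study, with RT-qPCR as the reference standard."""
    tp: int  # RAT positive, RT-qPCR positive
    fp: int  # RAT positive, RT-qPCR negative
    fn: int  # RAT negative, RT-qPCR positive
    tn: int  # RAT negative, RT-qPCR negative


def sensitivity(c: ConfusionCounts) -> float:
    """Equation (1): TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Equation (2): TN / (TN + FP)."""
    return c.tn / (c.tn + c.fp)


# Hypothetical study: 100 RT-qPCR positives and 200 RT-qPCR negatives.
example = ConfusionCounts(tp=70, fp=4, fn=30, tn=196)
print(f"sensitivity = {sensitivity(example):.2f}")  # sensitivity = 0.70
print(f"specificity = {specificity(example):.2f}")  # specificity = 0.98
```

Keeping the four counts together in one record mirrors the extraction step described above, in which TP, TN, FP and FN were recorded per study before any ratio was computed.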

Quality Assessment
The quality of each study was assessed using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool [19]. Patient selection, index test, reference standard, and flow and timing are the four core domains of the QUADAS-2 tool (Table 1). The risk of bias was classified as low, high or unclear for each domain. Five authors (M.F.K., A.J.N.J., M.F.S., M.A.N. and K.S.) independently performed the assessment to judge the quality of each study. Disagreements among the authors were resolved by discussion.

Table 1. QUADAS-2 risk of bias assessment criteria.

| Domain | Criteria for low-risk assessment |
| --- | --- |
| Patient selection | Patient enrolment strategy is specified and free of bias. A case-control design and inappropriate exclusions are avoided. |
| Index test | The index test results are interpreted without knowledge of the results of the reference standard. The conduct or interpretation of the index test does not introduce bias. |
| Reference standard | The reference standard correctly classifies the target condition. The reference standard results are interpreted without knowledge of the results of the index test. The reference standard, its conduct or its interpretation do not introduce bias. |
| Flow and timing | There is an appropriate interval between the index test(s) and reference standard. All patients receive the same reference standard. All patients are included in the analysis, and patient flow does not introduce bias. |

Search Results
A total of 1732 articles were identified through database and register searching (Figure 1). Of these, 271 duplicates were identified and removed using the data tool in Excel. After screening the titles, 961 articles that were not primary research articles or were unrelated to antigen-based detection of COVID-19 were excluded. Of the 500 abstracts screened, 369 articles that did not meet the selection criteria were excluded. After screening the full-text articles, 37 articles were excluded. The remaining 94 studies that fulfilled our selection criteria were included in this systematic review.
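The screening counts above can be checked with simple arithmetic. The short Python sketch below uses only the numbers reported in this paragraph; the variable names are illustrative.

```python
# Counts reported in the Search Results paragraph.
identified = 1732
duplicates_removed = 271
excluded_on_title = 961
excluded_on_abstract = 369
excluded_on_full_text = 37

# Each screening stage starts from what survived the previous one.
titles_screened = identified - duplicates_removed
abstracts_screened = titles_screened - excluded_on_title
full_texts_screened = abstracts_screened - excluded_on_abstract
included = full_texts_screened - excluded_on_full_text

print(titles_screened, abstracts_screened, full_texts_screened, included)
# 1461 500 131 94
```

The intermediate value of 500 matches the number of abstracts screened, and the final value matches the 94 included studies.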

Characteristics of the Included Studies
The characteristics of the 94 studies reporting the performance of rapid antigen tests (RATs) for SARS-CoV-2 are summarized in Table 2. Most of the studies (n = 84) were published in 2021, and the remaining studies were published in 2020. The majority of the studies employed clinical samples from European countries, including Germany (n = 14), Spain (n = 11), Italy (n = 10), the Netherlands (n = 7), France (n = 6), Belgium (n = 5) and Switzerland (n = 3). In Asia, most studies were conducted in Japan (n = 7), followed by China (n = 3) and India (n = 2). The United States (n = 8) and Chile (n = 3) were among the countries in the Americas that reported the performance of RATs for SARS-CoV-2. A total of 25 studies used Panbio™ COVID-19 Ag RDT (Abbott, Jena, Germany), 11 used STANDARD Q COVID-19 Ag Home Test (SD Biosensor, Seoul, South Korea), 10 used SARS-CoV-2 Rapid Ag Test (Roche, Basel, Switzerland) and 9 used Lumipulse G SARS-CoV-2 Ag (Fujirebio, Tokyo, Japan). Most of the studies (n = 71) used nasopharyngeal swabs, 12 used pools of nasopharyngeal and throat swabs, 4 used nasal swabs, 3 used saliva, 2 used throat swabs, 1 study used a pool of nasopharyngeal and nasal swabs and 1 study compared the performance of nasopharyngeal swabs, saliva and sputum.

Quality of Articles
The QUADAS-2 checklist was completed for all included studies (Supplementary Material Table S1). The QUADAS-2 criteria for the 94 studies included in this systematic review are presented in Figure 2. The majority of the included studies (83%) had a low risk of patient selection bias. Twelve studies (13%) had a high risk of patient selection bias due to the case-control study design. These studies specifically recruited clinical samples known to be uninfected or infected with coronavirus. The remaining studies (4%) had an unclear risk of patient selection bias. These studies were not case-control studies but provided insufficient details about the inclusion and exclusion criteria. About 68% (64 out of 94 studies) had an unclear risk of index test bias because it was unclear whether the index test results were interpreted with knowledge of the reference test results. The remaining studies (32%) had a low risk of index test bias, as the index and reference tests were blinded from each other and results were recorded independently by two readers.
With regard to the reference standard risk of bias, most of the studies (87%) had a low risk of bias, as these studies used similar RT-PCR as the reference standard and the reference test results were interpreted without knowledge of the index test results. Eleven studies (12%) had an unclear risk of reference standard bias because they did not provide enough information about whether the reference standard results were interpreted without knowledge of the results of the index test. Only 1% of the studies had a high risk of reference standard bias because the reference standard results were interpreted with knowledge of the results of the index test. Most of the studies (87%) had a low risk of flow and timing bias. Seven studies (8%) had an unclear risk of flow and timing bias because there was no information on whether the samples for the reference test and the index test were taken at the same time. Only five studies (5%) had a high risk of flow and timing bias. One of the five studies had a high risk because samples were collected from the same patients at multiple time points. Another three studies had a high risk due to the use of different samples for the index and reference tests, and one more due to the use of different reference standards.

Meta-Analysis of the Sensitivity and Specificity of Rapid Antigen Tests
A total of 74,445 samples tested using 30 different rapid antigen tests (RATs) and confirmed with RT-PCR as the reference test were included in the meta-analysis. The sensitivity and specificity of the RATs ranged from 37% to 90% and 65% to 100%, respectively (Figure 3). Of the 30 RATs analyzed, 52% (n = 16) reported at least 70% sensitivity. CoviNAg ELISA Kit (XEMA, Moscow, Russia) and Sienna-Clarity COVID-19 Ag RTC (Salofa Oy, Salo, Finland) showed the highest sensitivity of 90%, while MEDsan SARS-Cov-2 Ag (MEDsan GmbH, Hamburg, Germany) showed the lowest sensitivity of 37%. Nonetheless, when kits tested on only a small number of samples were excluded from the analysis, SARS-CoV-2 Ag Test (LumiraDx GmbH, Cologne, Germany) showed the highest sensitivity of 86%, followed by Lumipulse G SARS-CoV-2 Ag (Fujirebio, Tokyo, Japan) and BinaxNOW™ COVID-19 Ag Self-Test (Abbott, Jena, Germany) with sensitivities of 83% and 79%, respectively.
Subgroup analysis based on the presence of symptoms showed that RATs had higher sensitivity among symptomatic patients (82%) than among asymptomatic patients (68%). The specificity of RATs was similar in symptomatic and asymptomatic patients (98%). Cycle threshold (Ct) values are inversely correlated with the viral load in a specimen. Among patients with higher viral load (Ct value ≤25), the RATs showed sensitivity and specificity of 96% and 99%, respectively. Meanwhile, the sensitivity of RATs dropped to 69% when used on patients with low viral load (Ct value >25). A subgroup analysis of the RATs based on countries showed sensitivity and specificity ranging from 58% to 87% and 96% to 100%, respectively. The sensitivity was highest in China (87%), followed by Italy (81%), Chile (77%), the United States (77%), Belgium (73%), Japan and the Netherlands (72%) and Spain (71%). On the other hand, the sensitivity was lower than average (70%) in Germany, France and India, all of which showed a sensitivity of 58%. With regard to specificity, comparably high specificities, ranging from 96% to 100%, were observed in all countries except Belgium (84%).
After excluding the case-control studies, the overall pooled sensitivity and specificity of RATs were 72% (95% CI: 71–73) and 98% (95% CI: 98–98), respectively (Table 4). Meanwhile, the diagnostic performance of the different specimen types was similar to that in the subgroup analysis that included case-control studies. For the analysis based on symptoms, RAT performance in symptomatic patients after excluding case-control studies showed a lower sensitivity of 80% (95% CI: 78–81) and a higher specificity of 99% (95% CI: . Similarly, RAT performance showed lower sensitivity and higher specificity when tested on samples from patients with Ct value >25. The performance of RATs for asymptomatic patients and patients with Ct value ≤25 remained similar. A further analysis compared the performance of studies that blinded the index test with that of studies that did not (Table 5). Studies in which the index test was blinded to the results of the reference standard had a sensitivity and specificity of 72% and 97%, respectively. Meanwhile, studies in which the index test was not blinded to the results of the reference standard had a sensitivity and specificity of 72% and 97%, respectively.
Table 5. Comparative performance of studies that blinded the index test and studies that did not blind the index test.
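The 95% confidence intervals quoted alongside the pooled estimates can be illustrated with a short Python sketch. The interval method used by the authors is not stated in this section, so the Wilson score interval below is an assumption chosen for illustration, and the function name and example counts are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%).

    Assumption: this is one common interval for a binomial proportion,
    not necessarily the method used in the review.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts: 720 RAT-positive samples among 1000 RT-PCR positives.
low, high = wilson_interval(720, 1000)
print(f"sensitivity 72%, 95% CI: {low:.1%} to {high:.1%}")
```

With the same proportion, a larger number of samples gives a narrower interval, which is consistent with the tight intervals reported for estimates pooled over many thousands of samples.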

Discussion
Recently, two systematic reviews on the performance of RATs for SARS-CoV-2 detection have been published [16,17]. These reviews only included studies up to 13 January 2021 and 30 April 2021. Our review provides a more comprehensive analysis of the performance of RATs up to 26 August 2021, with the inclusion of 94 studies. Nonetheless, most of the studies were still from European countries such as Germany, Spain, Italy, the Netherlands, France, Belgium and Switzerland. The lack of studies from West and Southeast Asian countries, South America and Africa highlights the current gap in understanding RAT performance in these geographical areas. A similar observation was reported in a previous review, in which most of the included studies were from Germany, Spain and Italy [17]. One plausible reason is that most kits are manufactured in European countries; thus, such test kits were easier to obtain in those countries than in others where supply shortages are commonly reported. A second reason is that these countries were badly affected by COVID-19 at the beginning of the outbreak. As a result, RATs have become popular across European countries as part of governments' efforts to slow the spread of the virus by tracking infected individuals.
Panbio™ COVID-19 Ag RDT (Abbott, Jena, Germany) and STANDARD Q COVID-19 Ag Home Test (SD Biosensor, Seoul, South Korea) were the most frequently used kits (25 and 11 studies, respectively), followed by SARS-CoV-2 Rapid Ag Test (Roche, Basel, Switzerland) and Lumipulse G SARS-CoV-2 Ag (Fujirebio, Tokyo, Japan). This could be because Panbio™ COVID-19 Ag RDT (Abbott, Jena, Germany) and STANDARD Q COVID-19 Ag Home Test (SD Biosensor, Seoul, South Korea) are two RAT kits that are currently included under the 'WHO Emergency Use Listing for In vitro diagnostics (IVDs) Detecting SARS-CoV-2' [113]. There is still a lack of evaluation of newly developed test kits such as CoviNAg ELISA Kit (XEMA, Moscow, Russia) and Sienna-Clarity COVID-19 Ag RTC (Salofa Oy, Salo, Finland). Therefore, future diagnostic evaluation studies should include the newly developed test kits so that more data can be obtained for future comparative analyses.
Immunochromatography, which involves spotting antibodies onto nitrocellulose membranes that interact with specific antigens in patient samples, is the basis of RATs. The antigen-antibody interaction can be visualised manually or by using an immunofluorescence machine reader. The genome of SARS-CoV-2 comprises the genes responsible for four structural proteins: the spike (S), envelope (E), membrane (M) and nucleocapsid (N) proteins [114]. The N protein is frequently employed as a target analyte in RATs for COVID-19 diagnosis, as shown in Table 2. The N protein is mostly expressed during the early stages of SARS-CoV-2 infection and has the least variation in its gene sequence, indicating that it is a stable protein [115].
Subgroup analysis of the RATs based on countries showed sensitivity and specificity ranging from 58% to 87% and 96% to 100%, respectively. The sensitivity was highest in China (87%), followed by Italy (81%), Chile (77%), the United States (77%), Belgium (73%), Japan and the Netherlands (72%) and Spain (71%). On the other hand, the sensitivity was lower than average (70%) in Germany, France and India, all of which showed a sensitivity of 58%. The previous systematic review reported that the sensitivity of RATs was higher in the populations of Europe and America than in those of Asia and Africa [17]. Their finding differs from ours. One plausible reason is that the RATs evaluated previously were manufactured in Europe and America, which may affect test performance in Asia and Africa after repeated freeze-thaw procedures during transportation [117]. With regard to specificity, comparably high specificities, ranging from 96% to 100%, were observed in all countries except Belgium (84%).
According to the WHO, RATs should be prioritized for use in symptomatic individuals who meet the COVID-19 case definition, as well as to test asymptomatic individuals at high risk of infection, particularly in settings where nucleic acid amplification test (NAAT) capacity is limited. Thus, this review also analysed the sensitivity and specificity of RATs based on symptoms. RATs showed similar specificity (98%) for symptomatic and asymptomatic patients. However, the sensitivity of RATs was greater for symptomatic patients than for asymptomatic patients (82% vs. 68%). A similar finding was reported by the previous systematic reviews, in which the sensitivity of RATs was higher when used for symptomatic patients [16,17]. In addition, there is a clear association between the Ct values of RT-qPCR and the sensitivity and specificity of RATs. The lower the Ct value (≤25), the greater the sensitivity and specificity of RATs, whereas the higher the Ct value (>25), the lower the sensitivity and specificity of RATs. Ct values, however, cannot be directly compared between tests and must be interpreted with caution because they are affected by sample type, sample collection timing and assay design [118].
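The inverse relationship between Ct value and viral load can be made concrete with a small Python sketch. It rests on the idealised assumption that each PCR cycle doubles the amount of target (100% amplification efficiency); the function name and the `efficiency` parameter are hypothetical and introduced only for illustration.

```python
def fold_difference(ct_low: float, ct_high: float, efficiency: float = 1.0) -> float:
    """Approximate ratio of starting template between two samples.

    Assumes the target is multiplied by (1 + efficiency) in every cycle,
    so efficiency = 1.0 means perfect doubling. Real assays deviate from
    this, which is one reason Ct values are not comparable across tests.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return (1.0 + efficiency) ** (ct_high - ct_low)


# A sample at Ct 20 versus a sample at Ct 25 on the same assay:
print(fold_difference(20, 25))  # 32.0
```

Under this assumption, a difference of five cycles corresponds to roughly a 32-fold difference in starting template, which gives a sense of how much less antigen a RAT may have to detect in samples with Ct values above 25. The caveats about sample type, collection timing and assay design still apply.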
The severity of the disease, the timing of sample collection, the types of samples, and sample handling techniques all influence antigen levels in samples [20]. Unfortunately, the majority of the studies included in this review did not provide information on antigen levels in the samples, and information about disease severity and sample collection timing was often missing. Without this information, it is hard to determine whether a difference in observed sensitivity is due to the test's performance or to the qualities of the samples used. Future research should include this information to enable a more accurate assessment of diagnostic test performance as well as the identification of the tests' actual limitations.
This systematic review identified the relevant peer-reviewed articles needed to reach objective findings about the performance of antigen detection for the diagnosis of COVID-19. This study followed the PRISMA guidelines to reduce the risk of bias and meet the objective of the review. The use of several predetermined keywords and guidelines during the screening process supports the reproducibility of this systematic review. The screening process, which started with title screening followed by abstract and full-text screening, was documented properly to limit the risk of bias in the systematic review.
Based on the QUADAS-2 quality assessment, most of the included studies showed a low risk of bias. However, several studies showed a high risk of bias. The high risk of patient selection bias was due to the case-control study design, while the unclear risk of patient selection bias was due to insufficient details about the patient inclusion and exclusion criteria. For the index test, the risk of bias was unclear in most of the included studies, as the authors did not mention whether the index test results were interpreted with knowledge of the reference test results. The high risk of reference standard bias was due to reference standard results that were interpreted with knowledge of the results of the index test. For flow and timing, the high risk of bias was due to the use of several reference standards or to samples for the index and reference tests being collected at separate times.
Nevertheless, our systematic review has three main limitations. First, there was considerable publication bias in the included studies, as research on the performance of SARS-CoV-2 antigen detection was still at an early stage and almost all reported diagnostic performance assessments were conducted and published by the same research group. We anticipate that such bias will be reduced once the performance of these antigen tests has been fully evaluated by various independent research teams. Second, the presence of SARS-CoV-2 antigen in a patient does not always indicate the presence of viable virus, and it remains to be examined whether SARS-CoV-2 antigen-positive patients are contagious to other people. Lastly, the possibility that protein mutations in SARS-CoV-2 variants, including the newly identified Omicron variant, affect the sensitivity and specificity of RATs is not addressed in this review.

Conclusions
In conclusion, our systematic review and meta-analysis revealed the current performance of RATs for SARS-CoV-2 detection. The overall diagnostic sensitivity and specificity of these RATs were 70% and 98%, respectively. Quality assessment showed that the majority of the studies had a low risk of bias. However, several studies showed a high risk of patient selection bias due to the case-control study design, and a high risk of flow and timing bias due to the use of different samples for the index and reference tests and the use of different reference standards. Regarding the index test risk of bias, the majority of studies did not mention whether the authors performed a double-blinded index test. Future studies should attempt to perform a double-blinded index test, as such an improvement in study design would reduce the index test risk of bias. CoviNAg ELISA Kit (XEMA, Moscow, Russia) and Sienna-Clarity COVID-19 Ag RTC (Salofa Oy, Salo, Finland) had the highest diagnostic sensitivity among all RATs, while MEDsan SARS-Cov-2 Ag (MEDsan GmbH, Hamburg, Germany) had the lowest diagnostic sensitivity. More studies from Middle Eastern, Southeast Asian, South American and African countries are warranted so that a comprehensive subgroup analysis based on regions can be performed. Nasal swabs showed the highest sensitivity, followed by nasopharyngeal swabs, throat swabs and saliva. The RATs showed higher sensitivity for patients with symptoms and for those with Ct value ≤25. Comparative performance data for RATs using less invasive sampling approaches are still lacking. Improvement in these key areas would help to boost the acceptability and accessibility of RATs for large-scale practical use in rapid surveillance of SARS-CoV-2, allowing immediate isolation of infected individuals and reducing the transmission of the disease.