Evaluation of canine detection of COVID‐19 infected individuals under controlled settings

Abstract
Reverse transcription polymerase chain reaction (RT-PCR) is currently the standard diagnostic method to detect symptomatic and asymptomatic individuals infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). However, RT-PCR results are not immediate and may be falsely negative before an infected individual sheds viral particles in the upper airways, where swabs are collected. Infected individuals emit volatile organic compounds in their breath and sweat that are detectable by trained dogs. Here, we evaluate the diagnostic accuracy of canine detection of SARS-CoV-2 infection. Fifteen dogs previously trained at two centres in Australia were presented with axillary sweat specimens collected from known SARS-CoV-2 human cases (n = 100) and non-cases (n = 414). The true infection status of the cases and non-cases was confirmed based on RT-PCR results as well as clinical presentation. Across dogs, the overall diagnostic sensitivity (DSe) was 95.3% (95% CI: 93.1–97.6%) and the diagnostic specificity (DSp) was 97.1% (95% CI: 90.7–100.0%). The DSp decreased significantly when non-case specimens were collected over 1 min rather than 20 min (p value = .004). The location of evaluation did not affect detection performance. The accuracy of detection varied across dogs, and experienced dogs showed a marginally better DSp (p value = .016). The potential and limitations of this alternative detection tool are discussed.

self-present for testing, leading to disease transmission and a potential outbreak.
To effectively reduce transmission of SARS-CoV-2, reliable, scalable, accurate and inexpensive testing to detect both symptomatic and asymptomatic individuals is required (Wiersinga et al., 2020). The standard diagnostic test for COVID-19 is SARS-CoV-2 reverse transcription polymerase chain reaction (RT-PCR) performed on respiratory specimens (nasopharyngeal swabs or lower respiratory tract samples) (Wiersinga et al., 2020). Detectability varies with the adequacy of specimen collection, the time from onset of symptoms and the specimen source (Sethuraman et al., 2020; Wang et al., 2020). SARS-CoV-2 RT-PCR testing is labour-intensive, time-consuming, expensive and susceptible to reagent shortages. The cost can be prohibitive for resource-poor countries, which may also be unable to access reagents.
Protracted turn-around times can hamper case and contact identification, adversely affecting public health responses. A scalable, cost-effective, non-invasive, rapid screening tool could improve targeted testing and public health responses, helping to control the spread of disease.
Volatile organic compounds (VOCs) are emitted by our body, breath and sweat, and reflect our metabolic condition (Shirasu & Touhara, 2011). The development of infectious or metabolic disease changes the VOC profile; some of these changes are disease specific and could be used as diagnostic olfactory markers (Shirasu & Touhara, 2011). Canines can detect VOCs and, if formally trained, may be able to discriminate between infected and non-infected humans for some specific diseases (Guest et al., 2019; McCulloch et al., 2006; Taylor et al., 2018). Recent pilot studies demonstrated that dogs were able to detect patients infected (symptomatic or not) with SARS-CoV-2 using respiratory secretion specimens (Jendrny et al., 2020), heat-treated urine and saliva samples (Essler et al., 2021) and sweat samples (Grandjean et al., 2020). Respiratory secretions are likely to contain viral particles; therefore, sweat specimens were investigated instead to reduce the risk of infection to the operators (Fathizadeh et al., 2020).
This study evaluated the accuracy of detector dogs in identifying SARS-CoV-2-infected individuals from axillary sweat specimens at two dog training centres in Australia. This report complies with the STARD 2015 standards to report diagnostic accuracy (Cohen et al., 2016). A 'non-case specimen' was an axillary sweat sample collected from a participant who yielded a negative RT-PCR against SARS-CoV-2 (quantification cycle (Cq) > 40) on the day of collection (specimens sourced in France and the UAE) or from a participant who resided in an area with negligible risk of infection (i.e., from a state with no case of SARS-CoV-2 community transmission for more than 30 days) and did not experience COVID-19 symptoms. Specimens from persons without COVID-19 symptoms and without suspicious history of contact but yielding a Cq value of ≥34 and <40 were not included. All samples were collected on similar swab types (Australian negative specimens were sampled using French or Australian gauzes).

Specimen screening
For detection, specimens were transferred into a clean glass jar and connected to a stainless steel presentation cone (hide) of a construction similar to those developed previously (Grandjean et al., 2020). The study dogs were trained to display a 'conditioned response' behaviour (sustained sit with focus on the target) on a hide containing the target odour through reward-based training techniques. A key training requirement was hide screening independent of handler cues to eliminate the potential 'Clever Hans bias'. Dogs were trained at two separate sites, in Adelaide (Roseworthy Veterinary School) and Melbourne (ABF Canine Detection Unit), and the same locations were used to assess their detection accuracy. The study dogs were trained following the Standard Operating Protocols provided in Supplementary Material S1.
Within an evaluation run, a total of nine hides were used with one case specimen or none per run. Dog handlers were blinded to the true status of both the hide and the run. The presence of a case specimen and the hide order of the specimens were formally randomized using a smartphone application (Random Number Generator ©2013 Nicholas Dean). The running of individual dogs was ordered so that each dog had an equitable number of first passes on a set of specimens. All specimens used for this study were new to the dogs (not used during training) and were collected from new patients or volunteers sourced from the same locations as the specimens used during training. Case specimens were used once per dog but could be used with other dogs. Non-case specimens were used up to eight times within and between dogs (specimens collected from each armpit were used two to four times). Where possible, case and non-case specimens sourced from the same location were presented in the same run to avoid possible interference from background odours. This was the case for all runs using specimens from the UAE. As all but one specimen from France were cases and all specimens from Australia were non-cases (gauzes from France were also used to collect sweat from Australian non-cases), these two locations were used conjointly within runs.
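The run layout described above can be sketched in code. This is a minimal illustration only, not the study's procedure (the authors used a smartphone random number generator). The helper name `make_run`, the parameter `p_case` (the chance that a run contains a case specimen) and the use of Python's `random` module are assumptions made for the example; the source states only that each run of nine hides held one case specimen or none.

```python
import random

def make_run(case_ids, non_case_ids, n_hides=9, p_case=0.5, rng=None):
    """Return a list of (hide_position, specimen_id, is_case) for one run.

    Hypothetical helper: p_case is an assumed value, not taken from the study.
    """
    rng = rng or random.Random()
    # Decide at random whether this run contains a case specimen at all.
    include_case = bool(case_ids) and rng.random() < p_case
    n_non_cases = n_hides - 1 if include_case else n_hides
    # Draw distinct non-case specimens to fill the remaining hides.
    specimens = [(sid, False) for sid in rng.sample(non_case_ids, n_non_cases)]
    if include_case:
        specimens.append((rng.choice(case_ids), True))
    rng.shuffle(specimens)  # randomize the hide order
    return [(pos, sid, is_case)
            for pos, (sid, is_case) in enumerate(specimens, start=1)]
```

Passing a seeded `random.Random` instance as `rng` makes a run layout reproducible, which can help when the layout has to be audited against the video record afterwards.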
A primary data recorder, who was not blinded to the true status of the hides, was located in a booth with one-way screens so that they had direct sight of the hides but could not be seen by the blinded handler or the blinded secondary (back-up) data recorder. The primary data recorder always remained silent and could not give any cues during the evaluations. The dog's handler signalled the response to the data recorder by hand gesture. The data recorder then activated a light visible to the handler to indicate whether the dog had correctly identified a case sample, allowing the handler to give appropriate positive reinforcement. Reinforcement for positive identifications was used to reduce problems with motivation in the dogs during the extended testing required for these trials.
Data recording involved recording individual specimen identifiers and hide order, whether or not a hide was searched, a dog's search behaviours and the presence (or absence) of any conditioned response (i.e., sitting in front of the hide). Each run was recorded on video for data quality control and assurance. The data from both recorders were then compared at the end of each day and video evidence was examined to resolve any conflicts.
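The end-of-day comparison of the two recorders' data can be illustrated with a short sketch. The data layout is an assumption made for the example: each recorder's sheet is represented as a dictionary keyed by a hypothetical `(run_id, hide_position)` pair, and `find_conflicts` is an illustrative name rather than a tool used in the study.

```python
def find_conflicts(primary, secondary):
    """Return the sorted keys where the two recorders disagree.

    Each argument maps (run_id, hide_position) to the recorded response:
    True = conditioned response, False = no response, None = hide not searched.
    An entry present in only one record also counts as a conflict.
    """
    missing = object()  # sentinel, distinct from a recorded None
    keys = set(primary) | set(secondary)
    return sorted(k for k in keys
                  if primary.get(k, missing) != secondary.get(k, missing))
```

The keys returned are the screenings that would need to be checked against the video recording.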

Evaluation of detection accuracy
The evaluation of detection accuracy was conducted at the individual hide level. Hides not sampled by the dog (dog did not screen the hide) were excluded from the analysis.
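As a minimal sketch of a hide-level calculation, the code below computes raw proportions using the conventional definitions: sensitivity is the share of searched case hides that received a conditioned response, and specificity is the share of searched non-case hides that received none. The record keys (`is_case`, `searched`, `response`) and the function name are hypothetical. These raw proportions are not the model-based estimates reported in this study, which account for repeated screening of specimens.

```python
def hide_level_accuracy(screenings):
    """Compute raw (DSe, DSp) from hide-level records.

    Each record is a dict with hypothetical keys:
      'is_case'  - True if the hide held a case specimen
      'searched' - True if the dog screened the hide
      'response' - True if the dog showed the conditioned response
    Returns None for a measure that has no eligible hides.
    """
    searched = [s for s in screenings if s["searched"]]  # drop unsearched hides
    cases = [s for s in searched if s["is_case"]]
    non_cases = [s for s in searched if not s["is_case"]]
    dse = sum(s["response"] for s in cases) / len(cases) if cases else None
    dsp = (sum(not s["response"] for s in non_cases) / len(non_cases)
           if non_cases else None)
    return dse, dsp
```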
The detection accuracy was measured using the conventional parameters of diagnostic test accuracy: diagnostic sensitivity (DSe) and diagnostic specificity (DSp). Here, the DSe refers to the proportion of hides containing a case specimen where the dog displayed a conditioned response.
Two separate logistic regression models were built: one for DSe using the results from hides containing only a case specimen and one for DSp using the results from hides containing a non-case specimen.
To estimate the overall DSp across all dogs, the models included 'dog' and 'specimen' as crossed random effects to account for the fact that a given specimen could be repeatedly screened within and across dogs.
For the DSe modelling, only the specimen random effect was used because case specimens were only used once per dog.
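One way to write down the two models described above is given below. This is a sketch under stated assumptions: the notation (y, β, u, v, σ) is introduced here and does not come from the source, the index i denotes a screening, j a dog and k a specimen, and any fixed-effect covariates are omitted.

```latex
% DSp model (non-case hides): y = 1 when the dog gives no conditioned response
\operatorname{logit} \Pr(y_{ijk} = 1) = \beta_0 + u_j + v_k, \qquad
u_j \sim \mathcal{N}(0, \sigma^2_{\mathrm{dog}}), \quad
v_k \sim \mathcal{N}(0, \sigma^2_{\mathrm{specimen}})

% DSe model (case hides): y = 1 when the dog gives the conditioned response
\operatorname{logit} \Pr(y_{jk} = 1) = \beta'_0 + v'_k, \qquad
v'_k \sim \mathcal{N}(0, \sigma'^2_{\mathrm{specimen}})
```

Under this parameterization, the inverse logit of the intercept gives the accuracy for a typical dog and specimen (random effects equal to zero), and the variance terms describe how much accuracy varies between dogs and between specimens.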

Evaluation runs description
A total of 514 impregnated specimens were used during the evaluation runs, of which 100 were from infected cases. The final dataset included a total of 5260 hide screenings: 803 with case specimens (703 using French and 100 using UAE specimens) and 4457 with non-case specimens (12 using French, 4109 using Australian and 336 using UAE specimens).

Dog-specific detection accuracy
Accuracy of detection varied across dogs.

DISCUSSION
This study provides evidence that detector dogs are an accurate and effective tool to identify people infected with SARS-CoV-2, using an easily implemented collection method in which a gauze swab is placed in the axillary area for a short time. The DSe and DSp of the individual dogs involved in the trial varied, with some operating at 100%, and all comparing favourably with the diagnostic accuracy of RT-PCR testing.
All results from the detector dogs are compared with RT-PCR, which is not perfect and whose accuracy depends on the viral load being shed.
(Ziv, 2017). There is a lack of scientific studies of specific training protocols for odour detection by dogs that would determine which are the most effective in terms of time to train to criterion and accuracy of detection (Hayes et al., 2018). In a study using rats trained to detect odours, an intermixed training method (including more than one odour at a time) was more effective than sequential single-odour training (Keep et al., 2021). In the present study, the sweat samples would have presented an intermixed odour, which may have helped in training the dogs to generalize across different samples. With the potential use of detector dogs not only for COVID-19 but also for other diseases such as malaria (Guest et al., 2019), further research is needed to optimize the selection and training methods used with these dogs.
Although the dogs may generalize the scent of a case of SARS-CoV-2 infection, context is also important for detector dogs (Gazit et al., 2005). In the early stages of deployment and in a new environment, it will be important to validate their sensitivity and specificity again prior to full deployment. An important facet of the training protocol used is that an axillary sweat sample can be provided easily and quickly. There is also the potential for dogs to screen people directly; for example, dogs could scent the axillary area of people while they are seated. Other protocols using respiratory or heat-inactivated urine or saliva samples (Essler et al., 2021; Jendrny et al., 2020) may not be amenable to deployment in areas such as airports due to either the risk of infection or the inability to supply the sample in a timely manner.
Although this study has verified the diagnostic accuracy of detector dogs for SARS-CoV-2 infection in humans, it is important to understand the limitations of extrapolating from a controlled to an operational setting. It is impossible to replicate an operational setting in a controlled study. For example, dogs working in the field on a daily basis may encounter zero prevalence of SARS-CoV-2 positive samples or clusters with much higher prevalence than tested in this study. It will be important to verify the diagnostic accuracy of detector dogs for SARS-CoV-2 human infections in future studies as these dogs are deployed.
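The effect of prevalence on how a dog's indication should be interpreted can be made concrete with Bayes' rule. The sketch below is an illustration rather than an analysis from this study: it plugs in the overall DSe (95.3%) and DSp (97.1%) reported in the abstract, the prevalence values are hypothetical, and the function name is chosen for the example.

```python
def predictive_values(dse, dsp, prevalence):
    """Return (PPV, NPV) for a test with the given sensitivity and specificity."""
    tp = dse * prevalence                # true positives per person screened
    fn = (1 - dse) * prevalence          # false negatives
    tn = dsp * (1 - prevalence)          # true negatives
    fp = (1 - dsp) * (1 - prevalence)    # false positives
    ppv = tp / (tp + fp) if (tp + fp) > 0 else None
    npv = tn / (tn + fn) if (tn + fn) > 0 else None
    return ppv, npv

for prev in (0.001, 0.01, 0.10):  # hypothetical prevalences
    ppv, npv = predictive_values(0.953, 0.971, prev)
    print(f"prevalence={prev:.3f}  PPV={ppv:.3f}  NPV={npv:.4f}")
```

With these inputs, a positive indication at 1% prevalence corresponds to a true case roughly one time in four, whereas a negative indication remains highly reliable. This is one reason why accuracy figures obtained in a controlled setting need to be re-examined under operational prevalence.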

CONCLUSION
This study demonstrates the diagnostic accuracy of detector dogs for screening people infected with SARS-CoV-2. Detector dogs may not replace the existing screening with RT-PCR, but could be a complementary method that could be quickly and effectively deployed to provide immediate results. Their additional value may lie in being able to detect infection in pre-symptomatic people before the virus is shed and when RT-PCR is still negative. Further research is needed to uncover which VOC is specific to SARS-CoV-2 infection and to assess VOC persistence through the course of infection. Our study shows that trained dogs can accurately detect SARS-CoV-2 infection using axillary sweat samples.
Canine screening has potential as a scalable, rapid, efficient and reliable tool.