Mobile Sensing in the COVID-19 Era: A Review

Background. During the COVID-19 pandemic, mobile sensing and data analytics techniques have demonstrated their capability to monitor the trajectory of the pandemic by collecting behavioral, physiological, and mobility data at individual, neighborhood, city, and national scales. Notably, mobile sensing has become a promising way to detect individuals' infectious status, track changes in long-term health, trace the epidemic in communities, and monitor the evolution of viruses and subspecies.
Methods. We followed the PRISMA practice and reviewed 60 eligible papers on mobile sensing for monitoring COVID-19. We proposed a taxonomy system to summarize the literature by the time duration and population scale of the mobile sensing studies.
Results. We found that the existing literature can be naturally grouped into four clusters: remote detection, long-term tracking, contact tracing, and epidemiological study. We summarized each group and analyzed representative works with regard to system design, health outcomes, and limitations in techniques and societal factors. We further discussed the implications and future directions of mobile sensing for communicable diseases from the perspectives of technology and applications.
Conclusion. Mobile sensing techniques are effective, efficient, and flexible for surveilling COVID-19 across time and population scales. In the post-COVID era, technical and societal issues in mobile sensing are expected to be addressed to improve healthcare and social outcomes.


Introduction
Since 2019, the coronavirus disease (COVID-19), caused by SARS-CoV-2, has rampaged across the world for more than two years and caused the deaths of over 5 million people (https://covid19.who.int/). With the wide adoption of consumer electronics and personal mobile devices over large populations in this century, COVID-19 has become the first-ever global pandemic surveilled digitally. Among these digital surveillance methods, mobile sensing, which leverages the embedded sensors in mobile devices (e.g., smartphones [1] and wearables [2]), has become a pervasive way to collect human physiological, behavioral, and environmental data and to trace human interactions at multiple spatial levels [3]. There are several technical reviews of mobile-enabled technology and data science for healthcare [3-6], especially for COVID-19 control [1, 7-12]; some also focus on specific COVID-19-caused issues (e.g., mental health support) [13,14]. However, while these articles cover the use of mobile devices to respond to COVID-19, there is still no review that investigates the study design, expected health outcomes, and existing limitations of such mobile-based human-subject studies to guide future practice.
In this paper, we conduct a literature review that covers scholarly works leveraging mobile sensing systems to monitor individual or population health status related to COVID-19. Specifically, we conduct a comprehensive search to retrieve related publications from multiple databases using a set of well-designed keywords, and we select eligible papers with refining strategies that focus on results obtained from human-subject studies or clinical trials. Then, we summarize the health outcomes achieved by these publications, with particular interest in how such studies are designed: how many human subjects (population scale) and how long (time duration) these mobile sensing studies have covered.
We map the selected works onto a two-dimensional taxonomy system based on time duration and population scale, where these works are naturally grouped into four clusters: (1) remote detection (N = 15), which identifies an individual's infection status just in time; (2) long-term tracking (N = 7), which continuously monitors an individual's health status from infection, incubation, and symptom onset to recovery or exacerbation; (3) contact tracing (N = 11), which helps understand how COVID-19 spreads among neighborhoods via in-person interactions within days and weeks; and (4) epidemiological study (N = 27), which reveals the dynamics of virus variants and spread globally through worldwide mobile sensing data collection. We further analyze the technical design, health outcomes, and limitations of each cluster of mobile sensing techniques and discuss the intracluster variations among these techniques due to the use of different sensory data. Finally, we conclude this review and discuss the implications and research directions for the technology (i.e., mobile sensing data and scalable systems) and its applications (i.e., for clinicians, healthcare, and policy making).

Methods
This study was designed and implemented following existing review and reporting methods [15,16]: we follow the rapid review method proposed in [15] to accelerate the review process and report our findings under the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework [16].

Design of the Study.
Mobile sensing studies involve human subjects to collect sensory data streams across individuals and time slots; these data are then used for data analysis and knowledge discovery toward potential outcomes. The design of the population scale and time duration plays a crucial role in such studies, as improper or unrealistic population and time coverage leads to unreliable analytics results, heavy resource consumption, and poor health outcomes. In this study, we aim to investigate the designs of such studies from two perspectives, population scale and time duration, together with their outcomes and limitations. In particular, we searched for eligible literature that addresses these questions.
After removing duplicates, we first adopted title/abstract screening to exclude irrelevant papers. Then, we used the eligibility criteria to evaluate the full text and relevance of the included papers, and finally, we retained the studies that utilize mobile sensing techniques to collect data over a certain sensing time duration and population scale to address COVID-19-related issues. The eligibility criteria are as follows: (1) the study must be associated with COVID-19 issues, (2) the study should involve human subjects to evaluate the effectiveness of mobile sensing techniques for COVID-19-related outcomes, (3) the study should report the time duration and population scale of mobile user involvement needed to achieve the target health outcome, and (4) the study should report subject measures on health outcomes achieved by mobile sensing techniques. Two authors (Z. Wang and M. Tang) conducted the selection process separately and reached a final consensus by fully discussing any conflicts or disagreements that occurred during this process.

Data Collection and Synthesis.
For each included study, we clearly distinguished the type(s) of mobile data sensed and the sensing time duration and population scale in the data description part. Then, we mapped all the included studies onto a taxonomy system based on the sensing duration and population scale, to classify and understand the distribution of sensed data in these studies. Specifically, according to the sensing durations and population scales commonly chosen in these practices, the sensing duration was divided into the scales of just in time (seconds, minutes, and other time slots within a day), days, months, and years; the population scale was mapped into the following levels: individual, neighborhood/city, state/nation, and multination/global. Finally, we summarize the identified categories of mobile sensing work in the COVID-19 era and introduce representative works while highlighting their health outcomes, sensor data types, and the time duration and population scale that they covered.
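The two-dimensional taxonomy described above can be sketched as a small data structure. The following is a minimal illustration only: the enum names, the function names, and in particular the numeric cut-offs (1, 30, and 365 days) are assumptions introduced here, since the review describes the scales qualitatively and does not state exact thresholds.

```python
from enum import Enum
from typing import Tuple


class Duration(Enum):
    JUST_IN_TIME = "just in time"
    DAYS = "days"
    MONTHS = "months"
    YEARS = "years"


class Population(Enum):
    INDIVIDUAL = "individual"
    NEIGHBORHOOD_CITY = "neighborhood/city"
    STATE_NATION = "state/nation"
    GLOBAL = "multination/global"


def duration_scale(days: float) -> Duration:
    """Bin a sensing duration (in days) into one of the four time scales.

    The cut-offs below are hypothetical; they are not taken from the review.
    """
    if days < 1:
        return Duration.JUST_IN_TIME
    if days < 30:
        return Duration.DAYS
    if days < 365:
        return Duration.MONTHS
    return Duration.YEARS


def taxonomy_cell(days: float, population: Population) -> Tuple[Duration, Population]:
    """Place a study into a (time duration, population scale) cell."""
    return duration_scale(days), population


# Hypothetical study: two weeks of sensing in one neighborhood.
cell = taxonomy_cell(14, Population.NEIGHBORHOOD_CITY)
print(cell[0].value, "/", cell[1].value)
```

Keeping the population level as an explicit input, rather than deriving it from a head count, reflects that the level describes the unit of analysis and not only the number of participants.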

Health Data Science

Study Selection.
Our search resulted in a total of 495 records, where 375 records were from electronic databases and 120 records were manually searched from top-tier journals and conferences. After removing 69 duplicates, 426 records were further screened. Then, 196 records were excluded after title and abstract screening. Under the proposed eligibility criteria, 170 records were further excluded, where 3 were not associated with COVID-19, 84 did not involve human subjects in the mobile sensing practices (e.g., system work with no human-in-the-loop data collection), 49 did not collect data from participants (e.g., mathematical modeling and simulation studies), and 34 did not report subject measures on health outcomes achieved by mobile sensing techniques. Finally, 60 records met the eligibility criteria (see Figure 1).
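As a sanity check, the screening flow can be tallied from the itemized counts reported above. This is an illustrative sketch only; the variable and function names are hypothetical and are not part of the review protocol.

```python
# Counts taken from the study selection paragraph above.
identified = {"electronic databases": 375, "manual search": 120}
duplicates = 69
excluded_title_abstract = 196
excluded_full_text = {
    "not associated with COVID-19": 3,
    "no human subjects involved": 84,
    "no data collected from participants": 49,
    "no subject measures on health outcomes": 34,
}


def screening_flow():
    """Walk through the selection stages and return the count at each one."""
    total = sum(identified.values())
    screened = total - duplicates
    full_text_assessed = screened - excluded_title_abstract
    included = full_text_assessed - sum(excluded_full_text.values())
    return {
        "identified": total,
        "screened": screened,
        "full_text_assessed": full_text_assessed,
        "included": included,
    }


for stage, count in screening_flow().items():
    print(f"{stage}: {count}")
```

Summing the itemized full-text exclusion reasons, rather than hard-coding their total, keeps the tally tied to the individual reasons listed in the text.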

Study Characteristics.
The 60 eligible studies, as shown in Table 2, deploy mobile sensing techniques to monitor mobile users at varying population scales (22 at the individual level, 25 for neighborhoods/cities, 9 at the state/national level, and 4 globally) and over multiple levels of time duration (12 just in time, 17 at the level of days, 30 at the level of months, and 1 at the level of years), primarily to obtain various types of sensory data (9 physiological, 8 audio, 32 GPS records or call detail records (CDRs), 5 Bluetooth proximity, 5 self-reported survey, and 3 others).
Results of Studies Clustered by the Taxonomy System.
As every study reports the population scale and time duration covered by its mobile sensing techniques, we map these studies onto a taxonomy system based on the time duration and population scale of mobile sensing. As shown in Figure 2, the selected papers are naturally grouped into four clusters. After taking a closer look at every cluster, we summarize the aims of the mobile sensing techniques in each cluster as follows:
(1) Remote detection (N = 15, 25.0% of studies) leverages microphones or wearable sensors to collect acoustic signals and physiological data from individuals and identify the just-in-time infection status of COVID-19
(2) Long-term tracking (N = 7, 11.7% of studies) collects users' self-reported symptoms and physiological data to continuously monitor changes in the individual's health status from infection, asymptomatic status, and symptom onset to recovery or progression
(3) Contact tracing (N = 11, 18.3% of studies) helps understand how COVID-19 spreads among neighborhoods via in-person interactions within days and weeks
(4) Epidemiological study (N = 27, 45.0% of studies) reveals the dynamics of virus variants and spread globally through worldwide mobile sensing data collection
Table 3 summarizes the sensing duration, population scale, human data collected, and achieved health outcomes of the four clusters. In the next step, we perform a synthesis analysis of mobile sensing techniques in the four clusters.
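The cluster shares follow directly from the cluster sizes reported in this review (15, 7, 11, and 27 of 60 studies). The sketch below recomputes them and also checks that the population-scale and time-duration breakdowns given under Study Characteristics each cover all 60 studies; the dictionary and function names are hypothetical.

```python
clusters = {
    "remote detection": 15,
    "long-term tracking": 7,
    "contact tracing": 11,
    "epidemiological study": 27,
}
population_scale = {"individual": 22, "neighborhood/city": 25, "state/nation": 9, "global": 4}
time_duration = {"just in time": 12, "days": 17, "months": 30, "years": 1}


def shares(counts):
    """Return each group's share of the total, in percent, to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


for name, pct in shares(clusters).items():
    print(f"{name}: {pct}%")

# Each breakdown should account for every eligible study exactly once.
assert sum(clusters.values()) == sum(population_scale.values()) == sum(time_duration.values())
```

The sensory data types are deliberately left out of the consistency check, because a single study may collect more than one type of data.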

Results of Synthesis Analysis.
We present our analysis on every cluster of mobile sensing techniques for COVID-19 from the perspectives of system design, health outcomes, and their positive/negative impacts on social factors.
(2) Outcomes. In terms of health outcomes, existing studies evaluate the effectiveness of their proposals through binary classification between positive and negative infectious statuses. For example, Gadaleta et al. [30] reported an AUC of 0.83 with a predictive model leveraging both physiological data and self-reported symptoms, evaluated on a dataset collected from 1118 positive and 7032 negative samples. On the other hand, the methods using acoustic signals [20,28] achieve AUCs ranging from 0.71 to 0.97. Note that the assessment of classification accuracy in some studies might be inaccurate. For example, many asymptomatic cases might not be tested and counted in [28], resulting in an underestimate of positive samples.
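For readers less familiar with the AUC metric used in these evaluations, the sketch below computes it from scratch as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one (ties count as one half). The function name and the toy labels and scores are hypothetical; they do not come from any of the reviewed studies.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 ground-truth infection statuses
    scores: iterable of model scores, higher meaning "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 2 positive and 3 negative samples with made-up scores.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.1]
print(round(auc(toy_labels, toy_scores), 3))
```

Because the AUC depends only on how positives rank against negatives, it is insensitive to class imbalance as such, but it remains sensitive to mislabeled samples, which is why untested asymptomatic cases can distort the reported values.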
(3) Limitations. The prerequisite of owning wearable devices and the ability to self-report COVID-19-related symptoms might raise the economic and educational bars to technology adoption for remote detection [32], resulting in biases in population coverage. Furthermore, all these methods rely on large-scale data collection for training datasets, while unbalanced data collection might also cause biases in prediction results across varying demographics, languages, devices, and physiological/respiratory conditions [22,33].
Overall, the quality of data and population coverage of wearable devices depend on the technology literacy and health literacy of the devices' users. It is important to note that the morbidity and mortality of COVID-19 have disproportionately affected communities that are socioeconomically disadvantaged, where lower tech and health literacy may also be more prevalent [34,35]. Therefore, there is a risk that reliance on mobile sensing for remote detection could further worsen inequity and health disparities by underrepresenting the individuals and communities who are actually most impacted by the pandemic.
(2) Outcomes. Incorporating self-reported symptoms, several works [38,40,43] have studied ways to identify risk factors of long-COVID and crucial progressions. For instance, Sudre et al. [43] studied 4182 users, among whom 558 had symptoms lasting more than 4 weeks, 189 more than 8 weeks, and 95 more than 12 weeks. The study also indicated that long-COVID could be characterized by symptoms including fatigue, headache, dyspnea, and anosmia, and that obese older adults might be more susceptible and warrant more preventive attention. Additionally, a study in which 11829 participants completed a questionnaire on symptoms and underlying conditions identified diabetes and chronic heart disease as the most significant risk factors for exacerbations [38]. With physiological signals such as oxygen saturation, respiratory rate, heart rate, and skin temperature, some works leverage novel wearable sensors with machine learning models to automatically detect clinical deterioration [40]. Nevertheless, it is difficult to evaluate the outcomes of long-term tracking measures, as progression or recovery might also be affected by many other factors in the long term [44].
(3) Limitations. A major concern with symptom-based methods is the accuracy of the self-reported symptoms and the effectiveness of using self-reported symptoms to identify disease progression. As individuals' ability to check symptoms or self-diagnose may vary [45,46], education again becomes a key factor in ensuring the quality of health outcomes. In addition, one major limitation of self-reported methods is the added burden and the resulting low adherence, which can affect data quality and utility.
Moreover, mobile sensing for long-term tracking and contact tracing raises concerns similar to those of remote detection, regarding dependence on tech literacy and health literacy for accurate data and representative reach into populations. Data collection and prediction favoring communities of higher socioeconomic status pose a risk of worsening inequity and health disparities for those who are not included [47,48].
(2) Outcomes. The spatial coverage of the two contact tracing approaches (i.e., proximity- vs. location-based systems) may vary: the proximity-based solution straightforwardly records in-person contacts between users [50,52], while location-based tracing estimates exposure risk to COVID-19 by computing colocation events between mobile users, i.e., two mobile users appearing in the same location at the same time or within a short period [58,59]. Thus, proximity-based contact records are fine grained but sparse, as only a few people always have Bluetooth on [60]. On the other hand, location-based contact records are usually spatially coarse grained and depend on the spatiotemporal granularity used to define a "location" or a "colocation" event. In detail, when people are in close proximity, their devices communicate by exchanging encrypted tokens. Later, when a user tests positive, they can opt in to share the key of their anonymous token with the public, so that those holding the corresponding token can decode it and be notified as a contact [50,52]. In contrast, location-based methods estimate the colocation exposure risk in a neighborhood/community space by collectively analyzing the interactions between people's historical trajectories. For example, Xiao et al. [55] designed an AI predictive framework that screens human mobility at the urban neighborhood level and predicts infection risks. Berke et al. [57] proposed dividing the city into grids to assess and communicate users' exposure risk by tracing intersections of their GPS trajectories with those of the infected in each spatial grid. Furthermore, the effectiveness evaluation of contact tracing is not always trustworthy [13]. The abovementioned works either are based on well-designed experiments under laboratory conditions [52] or run simulations with mobile data under assumptions [50], and they were not verified in large-scale scenarios.
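To make the notion of a colocation event concrete, the sketch below discretizes GPS fixes into grid cells and time windows and flags users who share a cell and window with an infected user. It is a simplified illustration under stated assumptions and is not the method of any cited system: the function names, the cell size of 0.001 degrees, and the 15-minute window are all hypothetical choices.

```python
import math
from collections import defaultdict
from itertools import combinations


def cell_key(lat, lon, t, cell_deg=0.001, window_s=900):
    """Discretize a GPS fix into a (grid row, grid column, time window) key."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg), int(t // window_s))


def colocations(fixes, cell_deg=0.001, window_s=900):
    """Return user pairs that share a grid cell within the same time window.

    fixes: iterable of (user_id, latitude, longitude, unix_time_seconds)
    """
    buckets = defaultdict(set)
    for user, lat, lon, t in fixes:
        buckets[cell_key(lat, lon, t, cell_deg, window_s)].add(user)
    pairs = set()
    for users in buckets.values():
        pairs.update(combinations(sorted(users), 2))
    return pairs


def exposed_users(fixes, infected, **grid):
    """Users who were colocated with at least one infected user."""
    exposed = set()
    for a, b in colocations(fixes, **grid):
        if a in infected and b not in infected:
            exposed.add(b)
        if b in infected and a not in infected:
            exposed.add(a)
    return exposed


# Made-up trajectories: "a" and "b" overlap in space and time; "c" visits the
# same cell much later; "d" is elsewhere.
toy_fixes = [
    ("a", 42.3601, -71.0589, 1000),
    ("b", 42.3605, -71.0585, 1500),
    ("c", 42.3601, -71.0589, 5000),
    ("d", 42.4000, -71.1000, 1200),
]
print(exposed_users(toy_fixes, infected={"a"}))
```

The choice of `cell_deg` and `window_s` illustrates the granularity trade-off noted above: coarser cells and longer windows report more colocations but say less about actual physical contact, and a fixed grid also misses pairs that stand close together on opposite sides of a cell boundary.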
(3) Limitations. For contact tracing, privacy is the major concern impeding the public's willingness to participate, even though several privacy-preserving methods for data collection exist. Technically, few privacy-preserving standards have been pervasively adopted, and the magnitude of the risk of indiscriminate data collection and chronic privacy breaches depends on the capabilities and attitudes of data managers, not users [61]. The privacy issue may also be compounded in communities with a high prevalence of mistrust of the government or healthcare institutions [62]. As a result, these communities may be less willing to share data and, at the same time, to pursue preventative strategies such as masking, distancing, and vaccination, placing them at higher risk for COVID-19 infection and more severe illness.

Epidemiological Study
(1) Design. Epidemiological studies collect GPS locations or CDR data from massive numbers of users. The population scales covered by mobile sensing in these studies range from the city level to the state, national, and global levels (Table 7). Global human mobility accelerates the worldwide spread of the virus from one country to another, while the variants of COVID-19 arise naturally from mutations within large infected populations. With the pervasive adoption of mobile devices during the pandemic, large Internet companies have enabled large-scale mobility tracking and made aggregated mobility data available. Some academic research obtains nonpublic data through collaboration with industry and government; other research processes coarse-grained data (e.g., city-level statistics) available to the public. Platforms such as Google Mobility Report (https://www.google.com/covid19/) and COVID Data Tracker (https://covid.cdc.gov/covid-data-tracker/) provide these large-scale mobility data, including aggregated counts of inflows and outflows between spatial regions as time series; some also provide a population activity index (e.g., a traffic index) within a spatial area (e.g., a city). Mobile-enabled epidemiological studies generally investigate the associations between mobile-measured human mobility and pandemic spread, sometimes in combination with other external data (e.g., policy, social media, and vaccination data) [1,8].
(2) Outcomes. Population health outcomes of epidemiological studies include public health policy making and nonpharmaceutical interventions (NPIs) at various scales.
For epidemiological studies at the city scale [66,85], the main outcomes lie in assessing the efficiency of regional policies (e.g., social distancing) and identifying high-risk regions. For example, in the Boston metropolitan area, Martin-Calvo et al. [66] built colocation networks at three layers (i.e., community, households, and schools) to test the social distancing policy; the results showed that most infections occur at the community and household layers, so school closures might be ineffective and costly for overall wellbeing. Chang et al. [85] used a mobility network model to identify inequities and gaps between racial and socioeconomic groups.
At the state, national, and global levels, mobile data has been collected to monitor large-scale human mobility and evaluate higher-level policies such as lockdowns and travel bans. For example, using population migration data among cities collected by Baidu, Kraemer et al. [80] verified that the spatial distribution of COVID-19 cases in China at both city and province levels is significantly correlated with human mobility. After the implementation of control measures, such as travel bans, this correlation dropped, which indicated that the drastic control measures had paid off. Chinazzi et al. [88] leveraged Baidu data plus global travel data to map other countries' relative risk of case importation and simulated the travel and transmissibility reductions under international travel restrictions. Similarly, using CDRs in three regions of Italy, the timing and efficacy of the lockdown have been estimated to guide further restriction adjustment [77]. Pan et al. [75] constructed a social distancing index to quantify and understand the influence of policies on people's dynamic social behaviors over a 4-month period.
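The kind of association analysis described above can be illustrated with a plain Pearson correlation between per-region mobility and case counts, computed before and after a control measure. This is a minimal sketch, not the analysis pipeline of any cited study; the function name and all numbers are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


# Hypothetical per-city data: inbound mobility volume and reported cases.
inflow = [10, 20, 30, 40]
cases_before_controls = [12, 18, 33, 41]  # cases track mobility closely
cases_after_controls = [20, 22, 19, 21]   # cases no longer track mobility

r_before = pearson(inflow, cases_before_controls)
r_after = pearson(inflow, cases_after_controls)
print(f"before: {r_before:.2f}, after: {r_after:.2f}")
```

A drop in the coefficient after an intervention is consistent with, but does not by itself prove, the effectiveness of that intervention, since correlation does not control for confounders such as changes in testing.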
(3) Limitations. The use of mobile data to inform COVID-19 epidemiological studies is, after all, a secondary use compared to its primary use for location-based services. Up to now, few standardized frameworks protect users' privacy and confidentiality in such practices [8,61]. Moreover, even anonymized and aggregated data can be reidentified to recover individuals' trajectories under some circumstances [90]. Further action is still needed to let mobile users decide how, when, and for what purposes their data can be collected or released, even when the data is used only for research purposes, released to the public only after location-based aggregation, and accessible only as statistical results [91].
Along with privacy, tracking COVID-19 trends on a global scale may underestimate prevalence in low- or middle-income countries, where testing is less widely available than in countries with more resources [92]. This can lead to unequal representation of populations in the data. If the data is then used to inform policy decisions, the severity of the pandemic may not be adequately addressed for populations who lack resources for mobile tracking, in addition to lacking resources for accurate case reporting [93].

Discussion
In this section, with identified limitations shown in Table 8, we highlight recent discoveries and viewpoints on implications, benefits, and limitations of mobile sensing practices.

Technical Challenges and Opportunities.
Technical limitations hindering the feasibility of using mobile sensing for COVID-19 include data quality and system adoption issues. For data quality, imperfect data is an inherent issue for mobile sensing, as it collects individuals' data from a wide variety of mobile devices following a consent-based, opt-in standard. Such practice naturally leads to data that are sparse, out of sync, or even missing over time and that are biased and heterogeneous across populations. We suggest that advances in data analytics and machine learning methods capable of handling sparse, heterogeneous, and multimodal mobile sensing data streams could be helpful. For system adoption, in this study, we observe a significant clustering of mobile sensing applications on the time and population scales (see Figure 2). This phenomenon indicates that there is a tradeoff between the data granularity and population coverage of the currently used mobile devices, depending on the technological literacy and health literacy

Clinical and Societal Implications.
The use of mobile sensing to identify and track individuals with COVID-19 infection or at increased risk of infection has implications for clinicians. Patients can be monitored remotely for ongoing symptoms or worsening clinical status, for example, after being discharged from a hospitalization for COVID-19. Remote monitoring can also help keep patients out of the hospital, reducing demand on the limited space and staff in the emergency department and hospital wards. However, data generated by mobile sensing would have to be timely and actionable in order to be useful to clinicians, who are already overburdened in pandemic conditions. Public health professionals can also benefit from tracking individuals and communities within the populations that they serve, but with similar attention to generating actionable data.
Implications of mobile sensing for healthcare systems include the potential to monitor and predict the spread of pandemics. Taking care of patients with COVID-19 infection and its complications requires significant investment of staff, supplies, space, and other limited resources. Mobile sensing that can aid predictions of impending surges in pandemic cases could provide healthcare systems and public health departments with an early warning signal to inform the allocation of resources, which can be challenging to mobilize.
From a health policy standpoint, efforts are needed to mitigate potential threats to privacy while also harnessing the benefits of mobile sensing technology. Pandemic control strategies at the local and national scale, such as closure of schools and businesses, have far-reaching social and economic consequences. Such decisions should rely on accurate epidemiologic data and prediction models. Attention to equity and health disparities is crucial, so that communities most at risk from pandemics are not further disadvantaged by unequal access to technology or excluded from algorithms used to inform resource allocation. Policy makers would need to consider strategies to promote technological literacy and health literacy in all communities, in particular those at high risk of harm from pandemics. In addition, involvement of community stakeholders in decision-making that balances risks and benefits is needed to build trust. Lessons learned from the current COVID-19 pandemic can inform planning for future pandemics, which will continue to be a concern as emerging infectious diseases arise.

Conclusions
Mobile sensing has shown its power to pervasively and effectively monitor COVID-19 across varying population scales and time durations. Existing works have demonstrated the potential of mobile sensing techniques to identify individuals' infectious status through acoustic/wearable sensing and symptom self-reporting, to track long-term self-reported symptoms and physiological data to monitor progression of or recovery from the disease, to estimate the exposure risk to COVID-19 and trace the spread among neighborhoods by mining colocation events from mobility traces, and to surveil the pandemic at city, state, national, and global scales for public health policy making. For future research, we wish to see more works where computer scientists, clinicians, and epidemiologists design and implement studies collaboratively with experts in social science, public policy, and human factors to enable more effective, scalable, and socially equitable mobile-based sensing systems for future needs.

Disclosure
All authors have completed the ICMJE disclosure form. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.