NPJ Digit Med. 2020 Feb 4;3:16. doi: 10.1038/s41746-020-0222-x. eCollection 2020.

Lymelight: forecasting Lyme disease risk using web search data.

Author information

1Google, Mountain View, CA USA.
2Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA USA.
3Computational Epidemiology Lab, Boston Children's Hospital, Boston, MA USA.
4Department of Pediatrics, Harvard Medical School, Massachusetts, USA.
Contributed equally


Lyme disease is the most common tick-borne disease in the Northern Hemisphere. Existing estimates of Lyme disease spread are delayed by a year or more. We introduce Lymelight, a new method for monitoring the incidence of Lyme disease in real time. We use a machine-learned classifier of web search sessions to estimate, for 2014 and 2015, the number of individuals in a given geographical area who search for possible Lyme disease symptoms. We evaluate Lymelight against the official case count data from the CDC and find a 92% correlation (p < 0.001) at the county level. Importantly, using web search data allows us not only to assess the incidence of the disease, but also to examine the appropriateness of the treatments that users subsequently search for. Public health implications of our work include monitoring the spread of vector-borne diseases in a timely and scalable manner, complementing existing approaches through real-time detection that can enable earlier interventions. Our analysis of treatment searches may also help reduce misdiagnosis of the disease.
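The county-level evaluation described above can be illustrated with a minimal sketch. It assumes a Pearson correlation coefficient (the abstract does not say which coefficient was used), and the county names, counts, and the helper `pearson_r` are hypothetical placeholders, not data or code from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-county values, for illustration only.
search_estimates = {"county_a": 120, "county_b": 15, "county_c": 60, "county_d": 3}
official_cases = {"county_a": 95, "county_b": 10, "county_c": 70, "county_d": 1}

# Compare only counties present in both sources, in a fixed order.
counties = sorted(set(search_estimates) & set(official_cases))
r = pearson_r(
    [search_estimates[c] for c in counties],
    [official_cases[c] for c in counties],
)
print(f"county-level correlation: {r:.2f}")
```

Restricting the comparison to counties present in both sources keeps the two series aligned; a real evaluation would also need a significance test, which this sketch omits.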


Keywords: Computational science; Epidemiology; Infectious diseases
