Use of supervised machine learning to detect abuse of COVID-19 related domain names

Zheng Wang

doi:10.1016/j.compeleceng.2022.107864

Comput Electr Eng. 2022 May:100:107864. doi: 10.1016/j.compeleceng.2022.107864. Epub 2022 Mar 7.

Author

Zheng Wang¹

Affiliation

¹ National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, 20899, MD, USA.

Abstract

This paper presents a comprehensive evaluation of supervised machine learning models for detecting COVID-19 related domain names. One representative conventional machine learning implementation and nineteen state-of-the-art deep learning implementations are evaluated. The deep learning architectures evaluated include recurrent, convolutional, and hybrid models. Both detection rate and computing time metrics are considered in the evaluation. The results reveal that advanced deep learning models outperform conventional machine learning models in terms of detection rate. The results also show evidence of a tradeoff between detection rate and computing speed when selecting machine learning models or architectures. A high-frequency lexical analysis is provided for a better understanding of COVID-19 related domain names. The limitations, implications, and considerations of using supervised machine learning to detect abuse of COVID-19 related domain names are discussed.

Keywords: COVID-19; Classification; Deep learning; Machine learning; Malicious domain name.
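
The abstract mentions a high-frequency lexical analysis of COVID-19 related domain names but does not describe the procedure. As a rough illustration only, one way such an analysis might be sketched is to split each domain name into tokens and count how often each token occurs. The sample domains, the `tokenize` and `token_frequencies` helpers, and the tokenization rule (split on non-alphanumeric characters, drop the top-level domain) are all assumptions made for this sketch and are not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical sample domains, invented for illustration only;
# they are not drawn from the paper's dataset.
SAMPLE_DOMAINS = [
    "covid19-testing.example",
    "corona-virus-relief.example",
    "covid-vaccine-shop.example",
    "free-covid19-masks.example",
]


def tokenize(domain):
    """Split a domain name into lowercase alphanumeric tokens.

    Assumption: the last label (the top-level domain) is dropped, and
    the remaining labels are split on any non-alphanumeric character.
    """
    labels = domain.lower().split(".")[:-1]
    tokens = []
    for label in labels:
        tokens.extend(t for t in re.split(r"[^a-z0-9]+", label) if t)
    return tokens


def token_frequencies(domains):
    """Count token occurrences across all given domain names."""
    counts = Counter()
    for domain in domains:
        counts.update(tokenize(domain))
    return counts


freqs = token_frequencies(SAMPLE_DOMAINS)
# List the most frequent tokens first.
for token, count in freqs.most_common(5):
    print(token, count)
```

A real analysis would likely need a more careful tokenizer (for example, segmenting concatenated words such as "covidtest") and a much larger corpus; the counting step itself would stay the same.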