Emerging Threats and Countermeasures in Cybersecurity
by Gulshan Shrivastava, Rudra Pratap Ojha, Shashank Awasthi, Kavita Sharma, Himani Bansal
4 Class-Imbalanced Problems in Malware Analysis and Detection in Classification Algorithms
Bidyapati Thiyam¹*, Chadalavada Suptha Saranya² and Shouvik Dey¹
¹ Department of Computer Science and Engineering, National Institute of Technology Nagaland, Dimapur, India
² Capgemini, Hermeslaan 9, Machelen, Belgium
Abstract
As Internet use grows, threats are increasing enormously, making malware analysis and detection vital for protecting computer systems and networks. Today, many methodologies use existing data to predict outcomes for new data points, with varying success rates. Machine learning fundamentals suggest that such models be trained on balanced class distributions, but real-world data rarely meet this condition. Most datasets used for identifying malicious threats suffer heavily from class imbalance, which substantially challenges the effectiveness of sampling methods. The result is poor classification efficacy, which in turn hampers successful classification. In this chapter, we review different techniques for handling class imbalance with machine learning classifiers, along with various evaluation metrics, on standard datasets, namely NSL-KDD, UNSW-NB15, CIC-DDoS2019, and Edge-IIoT.
Keywords: Malware analysis, class imbalance problem, classification, UNSW-NB15 dataset, Edge-IIoT dataset, CIC-DDoS2019 dataset
4.1 Introduction
Cybersecurity heavily relies on malware analysis, which involves ...