6 Detection of Phishing URLs Using Machine Learning and Deep Learning Models Implementing a URL Feature Extractor

Abishek Mahesh, Prithvi Seshadri, Shruti Mishra* and Sandeep Kumar Satapathy

School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, Chennai, Tamil Nadu, India

Abstract

Phishing is a deceptive practice in which an attacker tries to steal sensitive information from an unsuspecting user. Such attacks are generally carried out through emails, text messages, and similar channels. Phishing URLs pose a significant threat that cybersecurity professionals and practitioners must contend with. Considerable research has addressed the problem, and several practitioners have developed Machine Learning (ML) models that can detect phishing URLs. However, applying Machine Learning and Deep Learning to this task has its own challenges. The proposed approach detects phishing URLs by analyzing URL properties, URL metrics, and certain external URL services. A URL Feature Extractor was created in Python to extract features from any URL. A dataset of 88,647 phishing and legitimate URLs is used in this study. Several Machine Learning algorithms, namely Support Vector Machines (SVM), Logistic Regression (LR), K-Nearest Neighbors (KNN), Naïve Bayes, Random Forests (RF), AdaBoost, Gradient Boosting, and Artificial Neural Networks, were used to predict phishing URLs. The results obtained indicate a reasonable accuracy rate. The Gradient Boosting model produced the best Accuracy, Precision, ...
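To make the idea of extracting features from a URL concrete, the following is a minimal sketch using only the Python standard library. It is not the chapter's URL Feature Extractor: the function name `extract_url_features` and the particular features chosen are assumptions for illustration, and it covers only lexical URL properties, not the metrics or external services the abstract mentions.

```python
import ipaddress
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a few lexical features of a URL (hypothetical feature set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # A raw IP address in place of a domain name is a commonly used signal.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "num_at_signs": url.count("@"),
        "num_digits": sum(ch.isdigit() for ch in url),
        # Rough count of labels before the last two; ignores multi-part TLDs.
        "num_subdomain_labels": 0 if host_is_ip else max(host.count(".") - 1, 0),
        "host_is_ip": int(host_is_ip),
        "uses_https": int(parsed.scheme == "https"),
        "has_query": int(bool(parsed.query)),
    }


if __name__ == "__main__":
    for example in ("https://www.example.com/",
                    "http://192.168.0.1/login-update?id=1"):
        print(example, extract_url_features(example))
```

Each dictionary can serve as one row of a feature table, which is the kind of input the classifiers listed above would consume.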
