Hands-On Artificial Intelligence for Cybersecurity

Book description

Build smart cybersecurity systems with the power of machine learning and deep learning to protect your corporate assets

Key Features

  • Identify and predict security threats using artificial intelligence
  • Develop intelligent systems that can detect unusual and suspicious patterns and attacks
  • Learn how to test the effectiveness of your AI cybersecurity algorithms and tools

Book Description

Organizations worldwide spend billions of dollars on cybersecurity. Artificial intelligence has emerged as an effective approach to building smarter, safer security systems that can predict and detect suspicious network activity, such as phishing or unauthorized intrusions.

This cybersecurity book presents and demonstrates popular and successful AI approaches and models that you can adapt to detect potential attacks and protect your corporate systems. You'll learn about the role of machine learning, neural networks, and deep learning in cybersecurity, and how to build AI capabilities into smart defensive mechanisms. As you advance, you'll be able to apply these strategies across a variety of applications, including spam filters, network intrusion detection, botnet detection, and secure authentication.

By the end of this book, you'll be ready to develop intelligent systems that detect unusual and suspicious patterns and attacks, and to build strong network security defenses using AI.

What you will learn

  • Detect email threats such as spamming and phishing using AI
  • Categorize APT, zero-days, and polymorphic malware samples
  • Overcome antivirus limits in threat detection
  • Predict network intrusions and detect anomalies with machine learning
  • Verify the strength of biometric authentication procedures with deep learning
  • Evaluate cybersecurity strategies and learn how you can improve them

Who this book is for

If you're a cybersecurity professional or ethical hacker who wants to build intelligent systems using the power of machine learning and AI, you'll find this book useful. Familiarity with cybersecurity concepts and knowledge of Python programming are essential to get the most out of this book.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Hands-On Artificial Intelligence for Cybersecurity
  3. About Packt
    1. Why subscribe?
  4. Contributors
    1. About the author
    2. About the reviewers
    3. Packt is searching for authors like you
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  6. Section 1: AI Core Concepts and Tools of the Trade
  7. Introduction to AI for Cybersecurity Professionals
    1. Applying AI in cybersecurity
    2. Evolution in AI: from expert systems to data mining
      1. A brief introduction to expert systems
      2. Reflecting the indeterministic nature of reality
      3. Going beyond statistics toward machine learning
      4. Mining data for models
    3. Types of machine learning
      1. Supervised learning
      2. Unsupervised learning
      3. Reinforcement learning
    4. Algorithm training and optimization
      1. How to find useful sources of data
      2. Quantity versus quality
    5. Getting to know Python's libraries
      1. Supervised learning example – linear regression
      2. Unsupervised learning example – clustering
      3. Simple NN example – perceptron
    6. AI in the context of cybersecurity
    7. Summary
  8. Setting Up Your AI for Cybersecurity Arsenal
    1. Getting to know Python for AI and cybersecurity
      1. Python libraries for AI
      2. NumPy as an AI building block
      3. NumPy multidimensional arrays
      4. Matrix operations with NumPy
      5. Implementing a simple predictor with NumPy
      6. Scikit-learn
      7. Matplotlib and Seaborn
      8. Pandas
    2. Python libraries for cybersecurity
      1. Pefile
      2. Volatility
      3. Installing Python libraries
    3. Enter Anaconda – the data scientist's environment of choice
      1. Anaconda Python advantages
      2. Conda utility
      3. Installing packages in Anaconda
      4. Creating custom environments
      5. Some useful Conda commands
      6. Python on steroids with parallel GPU
    4. Playing with Jupyter Notebooks
      1. Our first Jupyter Notebook
      2. Exploring the Jupyter interface
      3. What's in a cell?
      4. Useful keyboard shortcuts
      5. Choose your notebook kernel
      6. Getting your hands dirty
    5. Installing DL libraries
      1. Deep learning pros and cons for cybersecurity
      2. TensorFlow
      3. Keras
      4. PyTorch
      5. PyTorch versus TensorFlow
    6. Summary
  9. Section 2: Detecting Cybersecurity Threats with AI
  10. Ham or Spam? Detecting Email Cybersecurity Threats with AI
    1. Detecting spam with Perceptrons
      1. Meet NNs at their purest – the Perceptron
      2. It's all about finding the right weight!
      3. Spam filters in a nutshell
      4. Spam filters in action
      5. Detecting spam with linear classifiers
      6. How the Perceptron learns
      7. A simple Perceptron-based spam filter
      8. Pros and cons of Perceptrons
    2. Spam detection with SVMs
      1. SVM optimization strategy
      2. SVM spam filter example
      3. Image spam detection with SVMs
      4. How did SVM come into existence?
    3. Phishing detection with logistic regression and decision trees
      1. Regression models
      2. Introducing linear regression models
      3. Linear regression with scikit-learn
      4. Linear regression – pros and cons
      5. Logistic regression
      6. A phishing detector with logistic regression
      7. Logistic regression pros and cons
      8. Making decisions with trees
      9. Decision trees rationales
      10. Phishing detection with decision trees
      11. Decision trees – pros and cons
    4. Spam detection with Naive Bayes
      1. Advantages of Naive Bayes for spam detection
      2. Why Naive Bayes?
    5. NLP to the rescue
      1. NLP steps
      2. A Bayesian spam detector with NLTK
    6. Summary
  11. Malware Threat Detection
    1. Malware analysis at a glance
      1. Artificial intelligence for malware detection
      2. Malware goes by many names
      3. Malware analysis tools of the trade
      4. Malware detection strategies
      5. Static malware analysis
      6. Static analysis methodology
      7. Difficulties of static malware analysis
      8. How to perform static analysis
      9. Hardware requirements for static analysis
      10. Dynamic malware analysis
      11. Anti-analysis tricks
      12. Getting malware samples
      13. Hacking the PE file format
        1. The PE file format as a potential vector of infection
        2. Overview of the PE file format
        3. The DOS header and DOS stub
        4. The PE header structure
        5. The data directory
        6. Import and export tables
      14. Extracting malware artifacts in a dataset
    2. Telling different malware families apart
      1. Understanding clustering algorithms
      2. From distances to clusters
      3. Clustering algorithms
      4. Evaluating clustering with the Silhouette coefficient
      5. K-Means in depth
      6. K-Means steps
      7. K-Means pros and cons
      8. Clustering malware with K-Means
    3. Decision tree malware detectors
      1. Decision trees classification strategy
      2. Detecting malwares with decision trees
      3. Decision trees on steroids – random forests
      4. Random Forest Malware Classifier
    4. Detecting metamorphic malware with HMMs
      1. How malware circumvents detection?
      2. Polymorphic malware detection strategies
      3. HMM fundamentals
      4. HMM example
    5. Advanced malware detection with deep learning
      1. NNs in a nutshell
      2. CNNs
      3. From images to malware
      4. Why should we use images for malware detection?
      5. Detecting malware from images with CNNs
    6. Summary
  12. Network Anomaly Detection with AI
    1. Network anomaly detection techniques
      1. Anomaly detection rationales
      2. Intrusion Detection Systems
        1. Host Intrusion Detection Systems
        2. Network Intrusion Detection Systems
        3. Anomaly-driven IDS
      3. Turning service logs into datasets
      4. Advantages of integrating network data with service logs
    2. How to classify network attacks
      1. Most common network attacks
      2. Anomaly detection strategies
      3. Anomaly detection assumptions and challenges
    3. Detecting botnet topology
      1. What is a botnet?
      2. The botnet kill chain
    4. Different ML algorithms for botnet detection
      1. Gaussian anomaly detection
      2. The Gaussian distribution
      3. Anomaly detection using the Gaussian distribution
      4. Gaussian anomaly detection example
      5. False alarm management in anomaly detection
      6. Receiver operating characteristic analysis
    5. Summary
  13. Section 3: Protecting Sensitive Information and Assets
  14. Securing User Authentication
    1. Authentication abuse prevention
      1. Are passwords obsolete?
      2. Common authentication practices
      3. How to spot fake logins
      4. Fake login management – reactive versus predictive
      5. Predicting the unpredictable
      6. Choosing the right features
      7. Preventing fake account creation
    2. Account reputation scoring
      1. Classifying suspicious user activity
      2. Supervised learning pros and cons
      3. Clustering pros and cons
    3. User authentication with keystroke recognition
      1. Coursera Signature Track
      2. Keystroke dynamics
      3. Anomaly detection with keystroke dynamics
      4. Keystroke detection example code
      5. User detection with multilayer perceptrons
    4. Biometric authentication with facial recognition
      1. Facial recognition pros and cons
      2. Eigenfaces facial recognition
      3. Dimensionality reduction with principal component analysis (PCA)
      4. Principal component analysis
      5. Variance, covariance, and the covariance matrix
      6. Eigenvectors and Eigenvalues
      7. Eigenfaces example
    5. Summary
  15. Fraud Prevention with Cloud AI Solutions
    1. Introducing fraud detection algorithms
      1. Dealing with credit card fraud
      2. Machine learning for fraud detection
      3. Fraud detection and prevention systems
      4. Expert-driven predictive models
      5. Data-driven predictive models
      6. FDPS – the best of both worlds
      7. Learning from unbalanced and non-stationary data
        1. Dealing with unbalanced datasets
        2. Dealing with non-stationary datasets
    2. Predictive analytics for credit card fraud detection
      1. Embracing big data analytics in fraud detection
      2. Ensemble learning
        1. Bagging (bootstrap aggregating)
        2. Boosting algorithms
        3. Stacking
      3. Bagging example
      4. Boosting with AdaBoost
      5. Introducing the gradient
      6. Gradient boosting
      7. eXtreme Gradient Boosting (XGBoost)
      8. Sampling methods for unbalanced datasets
      9. Oversampling with SMOTE
      10. Sampling examples
    3. Getting to know IBM Watson Cloud solutions
      1. Cloud computing advantages
      2. Achieving data scalability
      3. Cloud delivery models
      4. Empowering cognitive computing
    4. Importing sample data and running Jupyter Notebook in the cloud
      1. Credit card fraud detection with IBM Watson Studio
      2. Predicting with RandomForestClassifier
      3. Predicting with GradientBoostingClassifier
      4. Predicting with XGBoost
    5. Evaluating the quality of our predictions
      1. F1 value
      2. ROC curve
      3. AUC (Area Under the ROC curve)
      4. Comparing ensemble classifiers
        1. The RandomForestClassifier report
        2. The GradientBoostingClassifier report
        3. The XGBClassifier report
      5. Improving predictions accuracy with SMOTE
    6. Summary
  16. GANs - Attacks and Defenses
    1. GANs in a nutshell
      1. A glimpse into deep learning
      2. Artificial neurons and activation functions
      3. From artificial neurons to neural networks
      4. Getting to know GANs
      5. Generative versus discriminative networks
      6. The Nash equilibrium
      7. The math behind GANs
      8. How to train a GAN
      9. An example of a GAN – emulating MNIST handwritten digits
    2. GAN Python tools and libraries
      1. Neural network vulnerabilities
      2. Deep neural network attacks
      3. Adversarial attack methodologies
      4. Adversarial attack transferability
      5. Defending against adversarial attacks
      6. CleverHans library of adversarial examples
      7. EvadeML-Zoo library of adversarial examples
    3. Network attack via model substitution
      1. Substitute model training
      2. Generating the synthetic dataset
      3. Fooling malware detectors with MalGAN
    4. IDS evasion via GAN
      1. Introducing IDSGAN
      2. Features of IDSGAN
      3. The IDSGAN training dataset
      4. Generator network
      5. Discriminator network
      6. Understanding IDSGAN's algorithm training
    5. Facial recognition attacks with GAN
      1. Facial recognition vulnerability to adversarial attacks
      2. Adversarial examples against FaceNet
      3. Launching the adversarial attack against FaceNet's CNN
    6. Summary
  17. Section 4: Evaluating and Testing Your AI Arsenal
  18. Evaluating Algorithms
    1. Best practices of feature engineering
      1. Better algorithms or more data?
      2. The very nature of raw data
      3. Feature engineering to the rescue
      4. Dealing with raw data
        1. Data binarization
        2. Data binning
        3. Logarithmic data transformation
      5. Data normalization
        1. Min–max scaling
        2. Variance scaling
      6. How to manage categorical variables
        1. Ordinal encoding
        2. One-hot encoding
        3. Dummy encoding
      7. Feature engineering examples with sklearn
        1. Min–max scaler
        2. Standard scaler
        3. Power transformation
        4. Ordinal encoding with sklearn
        5. One-hot encoding with sklearn
    2. Evaluating a detector's performance with ROC
      1. ROC curve and AUC measure
        1. Examples of ROC metrics
        2. ROC curve example
        3. AUC score example
        4. Brier score example
    3. How to split data into training and test sets
      1. Algorithm generalization error
      2. Algorithm learning curves
    4. Using cross validation for algorithms
      1. K-folds cross validation pros and cons
      2. K-folds cross validation example
    5. Summary
  19. Assessing your AI Arsenal
    1. Evading ML detectors
      1. Understanding RL
      2. RL feedback and state transition
      3. Evading malware detectors with RL
      4. Black-box attacks with RL
    2. Challenging ML anomaly detection
      1. Incident response and threat mitigation
      2. Empowering detection systems with human feedback
    3. Testing for data and model quality
      1. Assessing data quality
      2. Biased datasets
      3. Unbalanced and mislabeled datasets
      4. Missing values in datasets
      5. Missing values example
      6. Assessing model quality
      7. Fine-tuning hyperparameters
      8. Model optimization with cross validation
    4. Ensuring security and reliability
      1. Ensuring performance and scalability
      2. Ensuring resilience and availability
      3. Ensuring confidentiality and privacy
    5. Summary
  20. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: Hands-On Artificial Intelligence for Cybersecurity
  • Author(s): Alessandro Parisi
  • Release date: August 2019
  • Publisher(s): Packt Publishing
  • ISBN: 9781789804027