Machine Learning with the Elastic Stack

Book description

Leverage Elastic Stack's machine learning features to gain valuable insight from your data

Key Features

  • Combine machine learning with the analytic capabilities of the Elastic Stack
  • Analyze large volumes of search data and gain actionable insight from it
  • Use external analytical tools with your Elastic Stack to improve its performance

Book Description

Machine Learning with the Elastic Stack is a comprehensive overview of the Elastic Stack's embedded commercial features for anomaly detection and forecasting. The book starts with installing and setting up the Elastic Stack. You will then perform time series analysis on varied kinds of data, such as log files, network flows, application metrics, and financial data.

As you progress through the chapters, you will deploy machine learning within the Elastic Stack for logging, security, and metrics. In the concluding chapters, you will see how machine learning jobs can be automatically distributed and managed across the Elasticsearch cluster and made resilient to failure.

By the end of this book, you will understand the performance aspects of incorporating machine learning within the Elastic ecosystem, and you will be able to create anomaly detection jobs and view their results directly in Kibana.

What you will learn

  • Install the Elastic Stack to use machine learning features
  • Understand how Elastic machine learning is used to detect a variety of anomaly types
  • Apply effective anomaly detection to IT operations and security analytics
  • Leverage the output of Elastic machine learning in custom views, dashboards, and proactive alerting
  • Combine the jobs you create to correlate anomalies across different layers of infrastructure
  • Learn various tips and tricks to get the most out of Elastic machine learning

Who this book is for

If you are a data professional eager to gain insight into Elasticsearch data without having to rely on a machine learning specialist or custom development, Machine Learning with the Elastic Stack is for you. Those looking to integrate machine learning into their search and analytics applications will also find this book very useful. Prior experience with the Elastic Stack is needed to get the most out of this book.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Machine Learning with the Elastic Stack
  3. Dedication
  4. About Packt
    1. Why subscribe?
  5. Contributors
    1. About the authors
    2. About the reviewers
    3. Packt is searching for authors like you
  6. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  7. Machine Learning for IT
    1. Overcoming the historical challenges
      1. The plethora of data
      2. The advent of automated anomaly detection
    2. Theory of operation
      1. Defining unusual
      2. Learning normal, unsupervised
        1. Probability models
        2. Learning the models
        3. De-trending
        4. Scoring of unusualness
    3. Operationalization
      1. Jobs
      2. ML nodes
      3. Bucketization
      4. The datafeed
    4. Supporting indices
      1. .ml-state
      2. .ml-notifications
      3. .ml-anomalies-*
    5. The orchestration
    6. Summary
  8. Installing the Elastic Stack with Machine Learning
    1. Installing the Elastic Stack
      1. Downloading the software
      2. Installing Elasticsearch
      3. Installing Kibana
      4. Enabling Platinum features
    2. A guided tour of Elastic ML features
      1. Getting data for analysis
      2. ML job types in Kibana
        1. Data Visualizer
        2. The Single metric job
        3. Multi-metric job
        4. Population job
        5. Advanced job
      3. Controlling ML via the API
    3. Summary
  9. Event Change Detection
    1. How to understand the normal rate of occurrence
    2. Exploring count functions
      1. Summarized counts
      2. Splitting the counts
      3. Other counting functions
        1. Non-zero count
        2. Distinct count
    3. Counting in population analysis
    4. Detecting things that rarely occur
    5. Counting message-based logs via categorization
      1. Types of messages that can be categorized by ML
      2. The categorization process
      3. Counting the categories
      4. Putting it all together
      5. When not to use categorization
    6. Summary
  10. IT Operational Analytics and Root Cause Analysis
    1. Holistic application visibility
      1. The importance and limitations of KPIs
      2. Beyond the KPIs
    2. Data organization
      1. Effective data segmentation
        1. Custom queries for ML jobs
        2. Data enrichment on ingest
      2. Leveraging the contextual information
        1. Analysis splits
        2. Statistical influencers
    3. Bringing it all together for root cause analysis
      1. Outage background
      2. Visual correlation and shared influencers
    4. Summary
  11. Security Analytics with Elastic Machine Learning
    1. Security in the field
      1. The volume and variety of data
      2. The geometry of an attack
    2. Threat hunting architecture
      1. Layer-based ingestion
      2. Threat intelligence
    3. Investigation analytics
      1. Assessment of compromise
    4. Summary
  12. Alerting on ML Analysis
    1. Results presentation
    2. The results index
      1. Bucket results
      2. Record results
      3. Influencer results
    3. Alerts from the Machine Learning UI in Kibana
      1. Anatomy of the default watch from the ML UI in Kibana
    4. Creating ML alerts manually
    5. Summary
  13. Using Elastic ML Data in Kibana Dashboards
    1. Visualization options in Kibana
      1. Visualization examples
      2. Timelion
      3. Time series visual builder
    2. Preparing data for anomaly detection analysis
      1. The dataset
      2. Ingesting the data
      3. Creating anomaly detection jobs
        1. Global traffic analysis job
        2. A HTTP response code profiling of the host making requests
        3. Traffic per host analysis
    3. Building the visualizations
      1. Configuring the index pattern
      2. Using ML data in TSVB
      3. Creating a correlation Heat Map
      4. Using ML data in Timelion
      5. Building the dashboard
    4. Summary
  14. Using Elastic ML with Kibana Canvas
    1. Introduction to Canvas
      1. What is Canvas?
      2. The Canvas expression
    2. Building Elastic ML Canvas slides
      1. Preparing your data
      2. Anomalies in a Canvas data table
      3. Using the new SQL integration
    3. Summary
  15. Forecasting
    1. Forecasting versus prophesying
    2. Forecasting use cases
    3. Forecasting – theory of operation
    4. Single time series forecasting
      1. Dataset preparation
      2. Creating the ML job for forecasting
    5. Forecast results
    6. Multiple time series forecasting
    7. Summary
  16. ML Tips and Tricks
    1. Job groups
    2. Influencers in split versus non-split jobs
    3. Using ML on scripted fields
    4. Using one-sided ML functions to your advantage
    5. Ignoring time periods
      1. Ignoring an upcoming (known) window of time
        1. Creating a calendar event
        2. Stopping and starting a datafeed to ignore the desired timeframe
      2. Ignoring an unexpected window of time, after the fact
        1. Clone the job and re-run historical data
        2. Revert the model snapshot
    6. Don't over-engineer the use case
    7. ML job throughput considerations
    8. Top-down alerting by leveraging custom rules
    9. Sizing ML deployments
    10. Summary
  17. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: Machine Learning with the Elastic Stack
  • Author(s): Rich Collier, Bahaaldine Azarmi
  • Release date: January 2019
  • Publisher(s): Packt Publishing
  • ISBN: 9781788477543