Learning Elastic Stack 7.0 - Second Edition

Book description

A beginner's guide to storing, managing, and analyzing data with the updated features of Elastic 7.0

Key Features

  • Gain access to new features and updates introduced in Elastic Stack 7.0
  • Grasp the fundamentals of Elastic Stack including Elasticsearch, Logstash, and Kibana
  • Explore useful tips for using Elastic Cloud and deploying Elastic Stack in production environments

Book Description

The Elastic Stack is a powerful combination of tools for distributed search, analytics, logging, and data visualization. Elastic Stack 7.0 introduces new features and capabilities that help you draw insights from your data using these techniques. This book gives you a fundamental understanding of what the stack is all about and guides you in using it efficiently to build powerful real-time data processing applications.

The first few sections of the book show you how to set up the stack by installing its tools and exploring their basic configuration. You’ll then get up to speed with using Elasticsearch for distributed search and analytics, Logstash for logging, and Kibana for data visualization. As you work through the book, you will learn how to create custom plugins using Kibana and Beats. This is followed by coverage of Elastic X-Pack, a useful extension for effective security and monitoring. You’ll also find helpful tips on how to use Elastic Cloud and deploy Elastic Stack in production environments.

By the end of this book, you’ll be well versed in fundamental Elastic Stack functionality and the role each component of the stack plays in solving different data processing problems.

What you will learn

  • Install and configure an Elasticsearch architecture
  • Solve the full-text search problem with Elasticsearch
  • Discover powerful analytics capabilities through aggregations using Elasticsearch
  • Build a data pipeline to transfer data from a variety of sources into Elasticsearch for analysis
  • Create interactive dashboards for effective storytelling with your data using Kibana
  • Secure, monitor, and use Elastic Stack’s alerting and reporting capabilities
  • Take applications to an on-premises or cloud-based production environment with Elastic Stack

Who this book is for

This book is for entry-level data professionals, software engineers, e-commerce developers, and full-stack developers who want to learn about Elastic Stack and understand how its real-time processing and search engine work for business analytics and enterprise search applications. Experience with Elastic Stack is not required; however, knowledge of data warehousing and database concepts will be helpful.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Learning Elastic Stack 7.0 Second Edition
  3. About Packt
    1. Why subscribe?
    2. Packt.com
  4. Contributors
    1. About the authors
    2. About the reviewer
    3. Packt is searching for authors like you
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  6. Section 1: Introduction to Elastic Stack and Elasticsearch
  7. Introducing Elastic Stack
    1. What is Elasticsearch, and why use it?
      1. Schemaless and document-oriented
      2. Searching capability
      3. Analytics
      4. Rich client library support and the REST API
      5. Easy to operate and easy to scale
      6. Near real-time capable
      7. Lightning-fast
      8. Fault-tolerant
    2. Exploring the components of the Elastic Stack
      1. Elasticsearch
      2. Logstash
      3. Beats
      4. Kibana
      5. X-Pack
        1. Security
        2. Monitoring
        3. Reporting
        4. Alerting
        5. Graph
        6. Machine learning
      6. Elastic Cloud
    3. Use cases of Elastic Stack
      1. Log and security analytics
      2. Product search
      3. Metrics analytics
      4. Web search and website search
    4. Downloading and installing
      1. Installing Elasticsearch
      2. Installing Kibana
    5. Summary
  8. Getting Started with Elasticsearch
    1. Using the Kibana Console UI
    2. Core concepts of Elasticsearch
      1. Indexes
      2. Types
      3. Documents
      4. Nodes
      5. Clusters
      6. Shards and replicas
      7. Mappings and datatypes
        1. Datatypes
          1. Core datatypes
          2. Complex datatypes
          3. Other datatypes
        2. Mappings
          1. Creating an index with the name catalog
          2. Defining the mappings for the type of product
      8. Inverted indexes
    3. CRUD operations
      1. Index API
        1. Indexing a document by providing an ID
        2. Indexing a document without providing an ID
      2. Get API
      3. Update API
      4. Delete API
    4. Creating indexes and taking control of mapping
      1. Creating an index
      2. Creating type mapping in an existing index
      3. Updating a mapping
    5. REST API overview
      1. Common API conventions
        1. Formatting the JSON response
        2. Dealing with multiple indexes
          1. Searching all documents in one index
          2. Searching all documents in multiple indexes
          3. Searching all the documents of a particular type in all indexes
    6. Summary
  9. Section 2: Analytics and Visualizing Data
  10. Searching - What is Relevant
    1. The basics of text analysis
      1. Understanding Elasticsearch analyzers
        1. Character filters
        2. Tokenizer
          1. Standard tokenizer
        3. Token filters
      2. Using built-in analyzers
        1. Standard analyzer
      3. Implementing autocomplete with a custom analyzer
    2. Searching from structured data
      1. Range query
        1. Range query on numeric types
        2. Range query with score boosting
        3. Range query on dates
      2. Exists query
      3. Term query
    3. Searching from the full text
      1. Match query
        1. Operator
        2. Minimum should match
        3. Fuzziness
      2. Match phrase query
      3. Multi match query
        1. Querying multiple fields with defaults
        2. Boosting one or more fields
        3. With types of multi match queries
    4. Writing compound queries
      1. Constant score query
      2. Bool query
        1. Combining OR conditions
        2. Combining AND and OR conditions
        3. Adding NOT conditions
    5. Modeling relationships
      1. has_child query
      2. has_parent query
      3. parent_id query
    6. Summary
  11. Analytics with Elasticsearch
    1. The basics of aggregations
      1. Bucket aggregations
      2. Metric aggregations
      3. Matrix aggregations
      4. Pipeline aggregations
    2. Preparing data for analysis
      1. Understanding the structure of the data
      2. Loading the data using Logstash
    3. Metric aggregations
      1. Sum, average, min, and max aggregations
        1. Sum aggregation
        2. Average aggregation
        3. Min aggregation
        4. Max aggregation
      2. Stats and extended stats aggregations
        1. Stats aggregation
        2. Extended stats aggregation
      3. Cardinality aggregation
    4. Bucket aggregations
      1. Bucketing on string data
        1. Terms aggregation
      2. Bucketing on numerical data
        1. Histogram aggregation
        2. Range aggregation
      3. Aggregations on filtered data
      4. Nesting aggregations
      5. Bucketing on custom conditions
        1. Filter aggregation
        2. Filters aggregation
      6. Bucketing on date/time data
        1. Date Histogram aggregation
          1. Creating buckets across time periods
          2. Using a different time zone
          3. Computing other metrics within sliced time intervals
          4. Focusing on a specific day and changing intervals
      7. Bucketing on geospatial data
        1. Geodistance aggregation
        2. GeoHash grid aggregation
    5. Pipeline aggregations
      1. Calculating the cumulative sum of usage over time
    6. Summary
  12. Analyzing Log Data
    1. Log analysis challenges
    2. Using Logstash
      1. Installation and configuration
        1. Prerequisites
        2. Downloading and installing Logstash
          1. Installing on Windows
          2. Installing on Linux
      2. Running Logstash
    3. The Logstash architecture
    4. Overview of Logstash plugins
      1. Installing or updating plugins
        1. Input plugins
        2. Output plugins
        3. Filter plugins
        4. Codec plugins
      2. Exploring plugins
        1. Exploring input plugins
          1. File
          2. Beats
          3. JDBC
          4. IMAP
        2. Output plugins
          1. Elasticsearch
          2. CSV
          3. Kafka
          4. PagerDuty
        3. Codec plugins
          1. JSON
          2. Rubydebug
          3. Multiline
        4. Filter plugins
    5. Ingest node
      1. Defining a pipeline
      2. Ingest APIs
        1. Put pipeline API
        2. Get pipeline API
        3. Delete pipeline API
        4. Simulate pipeline API
    6. Summary
  13. Building Data Pipelines with Logstash
    1. Parsing and enriching logs using Logstash
      1. Filter plugins
        1. CSV filter
        2. Mutate filter
        3. Grok filter
        4. Date filter
        5. Geoip filter
        6. Useragent filter
    2. Introducing Beats
      1. Beats by Elastic.co
        1. Filebeat
        2. Metricbeat
        3. Packetbeat
        4. Heartbeat
        5. Winlogbeat
        6. Auditbeat
        7. Journalbeat
        8. Functionbeat
      2. Community Beats
      3. Logstash versus Beats
    3. Filebeat
      1. Downloading and installing Filebeat
        1. Installing on Windows
        2. Installing on Linux
      2. Architecture
      3. Configuring Filebeat
        1. Filebeat inputs
        2. Filebeat general/global options
        3. Output configuration
        4. Logging
        5. Filebeat modules
    4. Summary
  14. Visualizing Data with Kibana
    1. Downloading and installing Kibana
      1. Installing on Windows
      2. Installing on Linux
      3. Configuring Kibana
    2. Preparing data
    3. Kibana UI
      1. User interaction
      2. Configuring the index pattern
      3. Discover
        1. Elasticsearch query string/Lucene query
        2. Elasticsearch DSL query
        3. KQL
      4. Visualize
        1. Kibana aggregations
          1. Bucket aggregations
          2. Metric
      5. Creating a visualization
      6. Visualization types
        1. Line, area, and bar charts
        2. Data tables
        3. Markdown widgets
        4. Metrics
        5. Goals
        6. Gauges
        7. Pie charts
        8. Co-ordinate maps
        9. Region maps
        10. Tag clouds
      7. Visualizations in action
        1. Response codes over time
        2. Top 10 requested URLs
        3. Bandwidth usage of the top five countries over time
        4. Web traffic originating from different countries
        5. Most used user agent
      8. Dashboards
        1. Creating a dashboard
        2. Saving the dashboard
        3. Cloning the dashboard
        4. Sharing the dashboard
    4. Timelion
      1. Timelion
      2. Timelion expressions
    5. Using plugins
      1. Installing plugins
      2. Removing plugins
    6. Summary
  15. Section 3: Elastic Stack Extensions
  16. Elastic X-Pack
    1. Installing Elasticsearch and Kibana with X-Pack
      1. Installation
      2. Activating X-Pack trial account
        1. Generating passwords for default users
    2. Configuring X-Pack
    3. Securing Elasticsearch and Kibana
      1. User authentication
      2. User authorization
      3. Security in action
        1. Creating a new user
          1. Deleting a user
          2. Changing the password
        2. Creating a new role
          1. Deleting or editing a role
        3. Document-level security or field-level security
        4. X-Pack security APIs
          1. User Management APIs
          2. Role Management APIs
    4. Monitoring Elasticsearch
      1. Monitoring UI
        1. Elasticsearch metrics
          1. Overview tab
          2. Nodes tab
          3. The Indices tab
    5. Alerting
      1. Anatomy of a watch
      2. Alerting in action
        1. Creating a new alert
          1. Threshold Alert
          2. Advanced Watch
        2. Deleting/deactivating/editing a watch
    6. Summary
  17. Section 4: Production and Server Infrastructure
  18. Running Elastic Stack in Production
    1. Hosting Elastic Stack on a managed cloud
      1. Getting up and running on Elastic Cloud
      2. Using Kibana
      3. Overriding configuration
      4. Recovering from a snapshot
    2. Hosting Elastic Stack on your own
      1. Selecting hardware
      2. Selecting an operating system
      3. Configuring Elasticsearch nodes
        1. JVM heap size
        2. Disable swapping
        3. File descriptors
        4. Thread pools and garbage collector
      4. Managing and monitoring Elasticsearch
      5. Running in Docker containers
      6. Special considerations while deploying to a cloud
        1. Choosing instance type
        2. Changing default ports; do not expose ports!
        3. Proxy requests
        4. Binding HTTP to local addresses
        5. Installing EC2 discovery plugin
        6. Installing the S3 repository plugin
        7. Setting up periodic snapshots
    3. Backing up and restoring
      1. Setting up a repository for snapshots
        1. Shared filesystem
      2. Cloud or distributed filesystems
      3. Taking snapshots
      4. Restoring a specific snapshot
    4. Setting up index aliases
      1. Understanding index aliases
      2. How index aliases can help
    5. Setting up index templates
      1. Defining an index template
      2. Creating indexes on the fly
    6. Modeling time series data
      1. Scaling the index with unpredictable volume over time
        1. Unit of parallelism in Elasticsearch
          1. The effect of the number of shards on the relevance score
          2. The effect of the number of shards on the accuracy of aggregations
      2. Changing the mapping over time
        1. New fields get added
        2. Existing fields get removed
      3. Automatically deleting older documents
      4. How index-per-timeframe solves these issues
        1. Scaling with index-per-timeframe
        2. Changing the mapping over time
        3. Automatically deleting older documents
    7. Summary
  19. Building a Sensor Data Analytics Application
    1. Introduction to the application
      1. Understanding the sensor-generated data
      2. Understanding the sensor metadata
      3. Understanding the final stored data
    2. Modeling data in Elasticsearch
      1. Defining an index template
      2. Understanding the mapping
    3. Setting up the metadata database
    4. Building the Logstash data pipeline
      1. Accepting JSON requests over the web
      2. Enriching the JSON with the metadata we have in the MySQL database
        1. The jdbc_streaming plugin
        2. The mutate plugin
          1. Moving the looked-up fields that are under lookupResult directly in JSON
          2. Combining the latitude and longitude fields under lookupResult as a location field
          3. Removing the unnecessary fields
      3. Store the resulting documents in Elasticsearch
    5. Sending data to Logstash over HTTP
    6. Visualizing the data in Kibana
      1. Setting up an index pattern in Kibana
      2. Building visualizations
        1. How does the average temperature change over time?
        2. How does the average humidity change over time?
        3. How do temperature and humidity change at each location over time?
        4. Can I visualize temperature and humidity over a map?
        5. How are the sensors distributed across departments?
      3. Creating a dashboard
    7. Summary
  20. Monitoring Server Infrastructure
    1. Metricbeat
      1. Downloading and installing Metricbeat
        1. Installing on Windows
        2. Installing on Linux
      2. Architecture
        1. Event structure
    2. Configuring Metricbeat
      1. Module configuration
        1. Enabling module configs in the modules.d directory
        2. Enabling module configs in the metricbeat.yml file
      2. General settings
      3. Output configuration
      4. Logging
    3. Capturing system metrics
      1. Running Metricbeat with the system module
      2. Specifying aliases
      3. Visualizing system metrics using Kibana
    4. Deployment architecture
    5. Summary
  21. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: Learning Elastic Stack 7.0 - Second Edition
  • Author(s): Pranav Shukla, Sharath Kumar M N
  • Release date: May 2019
  • Publisher(s): Packt Publishing
  • ISBN: 9781789954395