Learning Path: Real-Time Data Applications

Video Description

Real-time data has many useful applications: quickly identifying general patterns and trends, performing sentiment analysis, crafting responses in real time, and, perhaps most important, delivering analysis immediately when doing so will change the outcome of a situation. This Learning Path provides an in-depth tour of technologies used to process and analyze real-time data.

Table of Contents

  1. Introduction To Cassandra
    1. Introducing The Course 00:04:41
    2. Understanding What Cassandra Is 00:04:58
    3. Learning What Cassandra Is Being Used For 00:04:56
    4. Understanding The System Requirements 00:06:54
    5. Opening The Main Virtual Machine 00:02:53
    6. Pop Quiz - Intro to Cassandra 00:01:24
  2. Getting Started With The Architecture
    1. Understanding That Cassandra Is A Distributed Database 00:02:23
    2. Learning What Snitch Is For 00:03:53
    3. Learning What Gossip Is For 00:01:52
    4. Learning How Data Gets Distributed 00:05:35
    5. Learning About Replication 00:02:12
    6. Learning About Virtual Nodes 00:03:01
    7. Pop Quiz - Getting Started with Architecture 00:01:25
  3. Installing Cassandra
    1. Downloading Cassandra 00:02:48
    2. Ensuring Oracle Java 7 Is Installed 00:02:02
    3. Installing Cassandra 00:03:44
    4. Viewing The Main Configuration File 00:02:46
    5. Providing Cassandra With Permission To Directories 00:01:46
    6. Starting Cassandra 00:03:41
    7. Checking Status 00:04:00
    8. Accessing The Cassandra system.log File 00:02:06
    9. Pop Quiz - Installing Cassandra 00:01:28
  4. Communicating With Cassandra
    1. Understanding Ways To Communicate With Cassandra 00:03:47
    2. Using CQLSH 00:02:29
    3. Pop Quiz - Communicating with Cassandra 00:01:08
  5. Creating A Database
    1. Understanding A Cassandra Database 00:01:54
    2. Defining A Keyspace 00:04:57
    3. Deleting A Keyspace 00:00:52
    4. Pop Quiz - Creating a Database 00:01:53
    5. Lab: Create A Second Database 00:02:39
  6. Creating A Table
    1. Creating A Table 00:01:49
    2. Defining Columns And Data Types 00:02:48
    3. Defining A Primary Key 00:01:49
    4. Recognizing A Partition Key 00:02:44
    5. Specifying A Descending Clustering Order 00:03:02
    6. Pop Quiz - Creating a Table 00:01:54
    7. Lab: Create A Second Table 00:02:33
  7. Inserting Data
    1. Understanding Ways To Write Data 00:01:28
    2. Using The INSERT INTO Command 00:04:45
    3. Using The COPY Command 00:05:53
    4. How Data Is Stored In Cassandra 00:04:21
    5. How Data Is Stored On Disk 00:05:29
    6. Pop Quiz - Inserting Data 00:02:15
    7. Lab: Insert Data 00:09:10
  8. Modeling Data
    1. Understanding Data Modeling In Cassandra 00:01:21
    2. Using A WHERE Clause 00:04:17
    3. Understanding Secondary Indexes 00:02:18
    4. Creating A Secondary Index 00:01:38
    5. Defining A Composite Partition Key 00:09:34
    6. Pop Quiz - Modeling Data 00:03:34
  9. Creating An Application
    1. Understanding Cassandra Drivers 00:02:31
    2. Exploring The DataStax Java Driver 00:03:14
    3. Setting Up A Development Environment 00:04:04
    4. Creating An Application Page 00:04:51
    5. Acquiring The DataStax Java Driver Files 00:03:24
    6. Getting The DataStax Java Driver Files Through Maven 00:02:23
    7. Providing The DataStax Java Driver Files Manually 00:02:36
    8. Connecting To A Cassandra Cluster 00:03:39
    9. Executing A Query 00:07:47
    10. Displaying Query Results - Part 1 00:05:59
    11. Displaying Query Results - Part 2 00:07:20
    12. Using An MVC Pattern 00:04:59
    13. Pop Quiz - Creating an Application 00:02:50
    14. Lab: Create A Second Application - Part 1 00:05:20
    15. Lab: Create A Second Application - Part 2 00:09:49
    16. Lab: Create A Second Application - Part 3 00:03:08
  10. Updating And Deleting Data
    1. Updating Data 00:03:39
    2. Understanding How Updating Works 00:03:55
    3. Deleting Data 00:07:10
    4. Understanding Tombstones 00:07:18
    5. Using TTLs 00:05:09
    6. Updating A TTL 00:02:38
    7. Pop Quiz - Updating and Deleting Data 00:02:38
    8. Lab: Update And Delete Data 00:07:00
  11. Selecting Hardware
    1. Understanding Hardware Choices 00:00:30
    2. Understanding RAM And CPU Recommendations 00:02:45
    3. Selecting Storage 00:04:08
    4. Deploying In The Cloud 00:04:07
    5. Pop Quiz - Selecting Hardware 00:02:06
  12. Adding Nodes To A Cluster
    1. Understanding Cassandra Nodes 00:03:39
    2. Having A Network Connection - Part 1 00:05:35
    3. Having A Network Connection - Part 2 00:05:02
    4. Having A Network Connection - Part 3 00:04:46
    5. Specifying The IP Address Of A Node In Cassandra 00:04:12
    6. Specifying Seed Nodes 00:06:30
    7. Bootstrapping A Node 00:06:18
    8. Cleaning Up A Node 00:02:59
    9. Using cassandra-stress 00:10:33
    10. Pop Quiz - Adding Nodes to a Cluster 00:01:39
    11. Lab: Add A Third Node 00:10:42
  13. Monitoring A Cluster
    1. Understanding Cassandra Monitoring Tools 00:00:46
    2. Using Nodetool 00:04:54
    3. Using JConsole 00:03:24
    4. Learning About OpsCenter 00:03:24
    5. Pop Quiz - Monitoring a Cluster 00:01:49
  14. Repairing Nodes
    1. Understanding Repair 00:05:17
    2. Repairing Nodes 00:04:17
    3. Understanding Consistency - Part 1 00:06:26
    4. Understanding Consistency - Part 2 00:04:33
    5. Understanding Hinted Handoff 00:03:30
    6. Understanding Read Repair 00:01:58
    7. Pop Quiz - Repairing Nodes 00:03:30
    8. Lab: Repair Nodes For A Keyspace 00:05:45
  15. Removing A Node
    1. Understanding Removing A Node 00:00:54
    2. Decommissioning A Node 00:04:36
    3. Putting A Node Back Into Service 00:06:38
    4. Removing A Dead Node 00:06:42
    5. Pop Quiz - Removing a Node 00:04:10
    6. Lab: Put A Node Back Into Service 00:05:00
  16. Redefining A Cluster For Multiple Data Centers
    1. Redefining For Multiple Data Centers - Part 1 00:04:50
    2. Redefining For Multiple Data Centers - Part 2 00:05:59
    3. Changing Snitch Type 00:05:25
    4. Modifying cassandra-rackdc.properties 00:07:45
    5. Changing Replication Strategy - Part 1 00:05:55
    6. Changing Replication Strategy - Part 2 00:03:58
    7. Pop Quiz - Redefining a Cluster 00:02:30
  17. Resources For Further Learning
    1. Accessing Documentation 00:02:51
    2. Reading Blogs And Books 00:04:53
    3. Watching Video Recordings 00:04:05
    4. Posting Questions 00:04:10
    5. Attending Events 00:03:00
    6. Wrap Up 00:01:03
    7. The Case for Kafka 00:11:23
    8. The Basics 00:09:10
    9. Setting up a Kafka Cluster 00:15:30
    10. Writing a Kafka Producer 00:14:33
    11. Writing a Kafka Consumer 00:16:34
    12. Using Kafka from Python 00:08:03
    13. Troubleshooting Kafka 00:29:29
    14. Integrating Kafka and Hadoop with Flafka 00:26:06
    15. Kafka Availability and Consistency 00:22:38
    16. Kafka Ecosystem 00:13:13
    17. Future of Kafka 00:08:53
    18. Pre-Flight Check 00:13:08
    19. Spark Deconstructed 00:14:31
    20. A Brief History 00:23:28
    21. Simple Spark Apps 00:25:07
    22. Spark Essentials 00:35:18
    23. Spark Examples 00:21:55
    24. Unifying the Pieces - Spark SQL 00:24:07
    25. Unifying the Pieces - Spark Streaming 00:14:48
    26. Unifying the Pieces - MLlib and GraphX 00:20:00
    27. Unified Workflows Demo 00:22:35
    28. The Full SDLC 00:04:01
    29. Developer Certification 00:06:10
    30. Resources 00:04:44
    31. Introduction - Why DataFrames? 00:02:28
    32. ETL to Prepare the Data from Capital Bikeshare 00:02:46
    33. Create a DataFrame, Explore using SQL 00:02:47
    34. Data Preparation for Machine Learning Models 00:05:33
    35. Build a Classifier Using Naive Bayes 00:04:43
    36. Build a Classifier Using Decision Trees 00:02:26
    37. Build a Classifier Using Random Forests 00:02:20
    38. Use a DataFrame to Compare Models 00:04:15
    39. Parquet as a Best Practice with DataFrames 00:00:58
    40. How to Store a DataFrame with Parquet 00:03:25
    41. How to Read a DataFrame Back in From Parquet 00:02:57
    42. Use SQL to Estimate Route Durations 00:01:41
    43. Data Preparation for GraphX - Model Route Costs 00:04:43
    44. Use PageRank to Rank Popular Stations 00:03:14
    45. Optimize Routes to Columbus Circle 00:03:43
    46. Compare Results with Google Maps 00:01:58
    47. Analyze a Popular Tourist Route 00:02:30
    48. Examples of How to Use DataFrames in Python 00:02:57
    49. Summary - The New DataFrames Features in Spark 00:01:03
    50. Introduction - Large-scale real time stream processing and analytics at Strata+Hadoop World - Ben Lorica 00:01:08
    51. Going Real-time: Data Collection and Stream Processing with Apache Kafka - Jay Kreps 00:39:29
    52. Say goodbye to batch - Tyler Akidau (Google) 00:42:35
    53. Stream Processing Everywhere - What to Use? - Jim Scott 00:39:06
    54. From Source to Solution: Building A System for Machine and Event-Oriented Data - Eric Sammer 00:41:59
    55. Spark Streaming - The State of the Union, and Beyond - Tathagata Das 00:36:46
    56. Dynamic Events in Massive Data Streams, from Astrophysics to Marketing Automation - Kirk Borne 00:40:06
    57. TSAR (the TimeSeries AggregatoR) - How to Count Tens of Billions of Daily Events in Real Time Using Open Source Technologies - Anirudh Todi 00:41:28
    58. Streaming Analytics: It’s Not The Same Game - Subutai Ahmad 00:38:46
    59. Realtime Data Analysis Patterns - Mikio Braun (streamdrill) 00:39:24
    60. The IoT P2P Backbone - Bruno Fernandez-Ruiz 00:27:05
    61. Practical Methods for Identifying Anomalies That Matter in Large Datasets - Robert Grossman 00:36:43
  18. Introduction
    1. Introduction to Time Series Problems 00:09:58
  19. Kafka
    1. Kafka Architecture and Deployment 00:11:33
    2. Kafka Usage 00:03:42
  20. Spark
    1. Introduction to Spark 00:15:43
    2. Spark Architecture 00:12:02
  21. Spark Streaming
    1. Spark Streaming: Windows & Slides 00:08:35
    2. Spark Streaming: Ingestion Sources & Using Kafka 00:08:32
    3. Spark Streaming: Operations on the Stream 00:01:30
  22. Cassandra
    1. Introduction to Cassandra 00:08:56
    2. Cassandra Basic Architecture 00:11:59
    3. Replication, High Availability and Multi Datacenter 00:14:06
    4. Cassandra Weather Website Example 00:11:46
    5. Cassandra Query Language (CQL) 00:18:00
    6. Cassandra Partitions & Clustering 00:08:22
    7. Cassandra Read and Write Path 00:12:17
    8. Working with Cassandra 00:06:32
    9. Cassandra Drivers and Access Patterns 00:10:37
  23. Spark and Cassandra
    1. Spark and Cassandra Architecture 00:12:00
    2. Analyzing Cassandra Data & Spark SQL 00:12:12
    3. Spark and Cassandra DataStax Enterprise 00:04:31
  24. Real World Use Cases
    1. Real World Use Cases: Streaming Problems 00:17:11
    2. Real World Use Cases: In-place Analytic Problems 00:10:58

Product Information

  • Title: Learning Path: Real-Time Data Applications
  • Author(s): Ben Lorica
  • Release date: November 2015
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781491957882