Data Superstream: Real-Time Analytics

Published by O'Reilly Media, Inc.

Content level: Beginner

Enabling better and faster insights at the point of decision

Quickly collecting, analyzing, and visualizing data to make data-driven business decisions is imperative to building an effective data strategy—and to maintaining a competitive edge for your business. We can now do analytics that reflect data in near real time, enabling better and faster insights at the point of decision. We can also process and query new data faster than ever before, running hundreds of thousands of queries per second with added layers of cutting-edge analytical capabilities.

So what does this mean for your organization? The ability to unlock faster, more actionable insights for all in real time.

Join us to dive deeply into the modern world of real-time analytics. Data engineers, developers, and architects will explore the latest trends in real-time analytics—including modern data applications, streaming technologies, data architecture, and tools—straight from experts and practitioners successfully putting them to use in the field.

We’re still working on finalizing the schedule for this event. Please check back closer to the event date for more information.

About the Data Superstream Series: This three-part Superstream series is designed to help your organization maximize the business impact of your data. Each day covers different topics, with unique sessions lasting no more than four hours. And they’re packed with insights from key innovators and the latest tools and technologies to help you stay ahead of it all.

What you’ll learn and how you can apply it

  • Learn how to productize real-time analytics applications that will enable your organization to make informed decisions at the speed of business
  • Build real-time analytics applications and dashboards with Python that show up-to-date orders, revenue generation, and top-selling products
  • Explore powerful, event-driven streaming platforms that provide the "glue" that enables data to flow through disparate systems in a dynamic fashion
  • See how the manufacturing sector is using real-time analytics to improve operational efficiency—from bottleneck resolution and predictive maintenance to inventory optimization
  • Explore various data sources, techniques, and tools that enable the effective implementation of real-time analytics
  • Discover how technologies including cloud, generative AI, ML, NLP, in-memory, microservices, and data timehouses are all completely reshaping industries and evolving real-time business at the frontier of innovation

This live event is for you because...

  • You’re a developer, architect, or data engineer exploring the latest trends and tools in real-time analytics.
  • You want to harness the power of real-time data processing and analysis to gain a competitive edge in the rapidly evolving world of data-driven decision-making.

Prerequisites

  • Come with your questions
  • Have a pen and paper handy to capture notes, insights, and inspiration

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Lorien Pratt: Introduction (5 minutes) - 8:00am PT | 11:00am ET | 3:00pm UTC/GMT

  • Lorien Pratt welcomes you to the Data Superstream.

Karin Wolok: Keynote–From Data to Dollars: Productizing Real-Time Analytics for Business Success (15 minutes) - 8:05am PT | 11:05am ET | 3:05pm UTC/GMT

  • In today's data-driven world, the ability to analyze and act on real-time data is critical to business success. But developing real-time analytics solutions can be challenging and time-consuming, with significant upfront costs and ongoing maintenance requirements. The key to success lies in productizing real-time analytics applications, transforming your business into a lean, data-driven machine. Karin Wolok explains the transformative power of productizing real-time analytics applications, enabling organizations to make informed decisions at the speed of business. Whether you’re a data analyst, business executive, or technology leader, you’ll gain inspiration to transform your business through the power of real-time analytics.
  • Karin Wolok is the founder of Project Elevate, a consultancy that helps startups build and grow their communities, but her past experience spans multiple industries and verticals. As a specialist in developer tools and databases, she played pivotal roles in building startups like StarTree and Neo4j, and she led marketing campaigns for renowned organizations like Eminem, Live Nation, and Novartis. Karin has presented at over 50 conferences globally, providing insights and guidance to the developer relations community.

Mark Needham and Dunith Dhanushka: The Real-Time Analytics Stack (30 minutes) - 8:20am PT | 11:20am ET | 3:20pm UTC/GMT

  • Real-time analytics is one of the new trends in the streaming space, but it can be hard to keep up with the pace of new product releases. Mark Needham and Dunith Dhanushka provide a map to help you understand where current and new tools fit into the space—and then they’ll use those tools to build a real-time analytics application for an online pizza delivery service so you can see how it’s done. They’ll start with Faker to generate users and orders, which will be loaded into MySQL and Redpanda, then publish products to a Redpanda stream and use Flink to join the orders and product streams into an enriched stream containing full order and product details. They’ll load the stream into the Apache Pinot OLAP database and use Streamlit to create a real-time dashboard that shows the latest orders, revenue generated, and top-selling products.
  • Mark Needham is an Apache Pinot advocate and developer relations engineer at StarTree who helps users learn how to harness Apache Pinot to build real-time user-facing analytics applications. He also simplifies the getting-started experience by making product tweaks and improvements to the documentation. Mark writes about his experiences working with Pinot at Markhneedham.com and tweets at @markhneedham.
  • Dunith Dhanushka is a senior developer advocate at Redpanda Data, where he directs his efforts toward expanding awareness of the Redpanda platform. A profound interest in big data led to his keen curiosity about event streaming platforms, real-time stream processing, ETL/ELT, data engineering, and event-driven architecture at scale. In his free time, Dunith enjoys studying new technologies in this field and sharing his insights on Eventdrivenutopia.com.
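
As a rough illustration of the enrichment step described for this session (joining an orders stream with a products stream so that each order carries full product details), here is a minimal pure-Python sketch. It is not the presenters' code and does not use Flink, Redpanda, or Pinot APIs; the stream names, the field names (`order_id`, `product_id`, `name`, `price`, `quantity`), and the buffering behavior are all assumptions made for the example.

```python
from collections import defaultdict


def merge(order, product):
    """Combine one order with its product into an enriched record."""
    return {
        "order_id": order["order_id"],
        "product_id": order["product_id"],
        "product_name": product["name"],
        "quantity": order["quantity"],
        "revenue": order["quantity"] * product["price"],
    }


def enrich_orders(events):
    """Join order events with the latest known product record.

    `events` is an iterable of (stream_name, record) pairs in arrival order.
    Orders that arrive before their product are buffered until it shows up.
    All field names here are hypothetical.
    """
    products = {}                # product_id -> latest product record
    pending = defaultdict(list)  # product_id -> orders waiting for a product
    for stream, record in events:
        pid = record["product_id"]
        if stream == "products":
            products[pid] = record
            # Release any orders that were waiting for this product.
            for order in pending.pop(pid, []):
                yield merge(order, record)
        elif stream == "orders":
            if pid in products:
                yield merge(record, products[pid])
            else:
                pending[pid].append(record)


def revenue_by_product(enriched):
    """Aggregate total revenue per product name, as a dashboard might."""
    totals = defaultdict(int)
    for row in enriched:
        totals[row["product_name"]] += row["revenue"]
    return dict(totals)
```

Feeding `enrich_orders` a mixed list of `("products", {...})` and `("orders", {...})` tuples and passing the result to `revenue_by_product` gives the kind of per-product totals a dashboard could display. A real stream processor would also need to handle state size, late data, and fault tolerance, which this sketch ignores.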

Sumeet Kumar Agrawal: Speed Up Business-Critical Streaming Analytics with an AI-Powered Data Stack (Sponsored by Informatica) (30 minutes) - 8:50am PT | 11:50am ET | 3:50pm UTC/GMT

  • Businesses today have an unprecedented opportunity to gain insight from a steady stream of real-time data—for example, transactions from databases, clickstreams from web servers, application and infrastructure log data, geolocation data, and data coming from remote sensors or agents. Sumeet Kumar Agrawal explains how to use Microsoft’s latest offering, Microsoft Fabric, and Informatica’s Intelligent Data Management Cloud (IDMC) to access these data sources and derive actionable business insights in real time. You’ll also discover how Claire, Informatica’s AI engine, can improve your self-service capabilities with personalized recommendations and boost productivity through intelligent automation. Join in to explore the transformative power of IDMC and Microsoft Fabric and bring your data to life.
  • Sumeet Kumar Agrawal is a VP of product management at Informatica, where he leads the cloud AI, analytics and data warehouse, and data lake product portfolio. Sumeet has 15+ years of data engineering and product management experience, driving innovative products within the cloud technology sector, as well as deep experience working with cloud ecosystem vendors like AWS, Google Cloud, Microsoft Azure, Snowflake, and Databricks. A strong communicator, he’s built strong and fruitful organizational teams.
  • This session will be followed by a 30-minute Q&A in a breakout room with Ajay Gollapalli, Informatica's ecosystem director for Microsoft Azure and Databricks. Stop by if you have more questions about Informatica.
  • Break (10 minutes)

Ben Gamble: A Match Made in Data—Matching Real-Time Demand with Analytics in Kafka Streams (30 minutes) - 9:30am PT | 12:30pm ET | 4:30pm UTC/GMT

  • People prefer not to wait for things, and that includes software. Whether they're looking for a game to play, a taxi home, or even a date, you have a finite amount of time to deliver your service before users give up on you. On the other hand, matching a user with an inappropriate service or product can actually be worse. Good matches rely on many factors: some are hard constraints, while others involve weighted scores, and all of them can vary in real time. Ben Gamble walks you through building a real-time analytics engine to drive matchmaking and product suggestions with Kafka Streams, leveraging Redis, ClickHouse, and Apache Cassandra. You’ll see how to ingest real-time data, capture user intent, aggregate past behavior, and emit actionable results in a real-time analytics pipeline, then blend those results with transactional data to drive user actions, all while handling real-world problems such as flaky users, sudden traffic jams, or lost Wi-Fi.
  • Ben Gamble has spent over 10 years leading engineering at startups and high-growth companies. As a founder, CTO, producer, and product leader, he's bridged the gap between research and product development, and as a result of his work at the cutting edge of augmented reality, scaling 3D gaming, and same-day logistics, he’s no stranger to technical challenges and the commercial realities they entail. However, now that he’s found a home in developer relations, he works to make real-time data a reality for anyone who needs it via open source tools and shared ideas.
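
The idea that good matches combine hard constraints with weighted scores can be sketched in a few lines of Python. This is only an illustrative toy, not the approach shown in the session and not tied to Kafka Streams, Redis, ClickHouse, or Cassandra; the candidate fields (`available`, `distance_km`, `rating`), the `max_distance_km` limit (assumed to be greater than zero), and the weight names are hypothetical.

```python
def best_match(user, candidates, weights):
    """Return the highest-scoring candidate that passes every hard constraint.

    Hard constraints (assumed): the candidate must be available and within
    the user's maximum distance. Soft factors are blended with weights.
    Returns None when no candidate qualifies.
    """
    best, best_score = None, float("-inf")
    for candidate in candidates:
        # Hard constraints: a failing candidate is dropped outright.
        if not candidate["available"]:
            continue
        if candidate["distance_km"] > user["max_distance_km"]:
            continue

        # Weighted score: each factor is normalized to the range 0..1.
        proximity = 1 - candidate["distance_km"] / user["max_distance_km"]
        rating = candidate["rating"] / 5
        score = weights["proximity"] * proximity + weights["rating"] * rating

        if score > best_score:
            best, best_score = candidate, score
    return best
```

Because the weights are plain inputs, they can be changed per request, which is one simple way to reflect factors that vary in real time: favoring proximity when a user is in a hurry, for example, or rating when they are not.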

Mary Grygleski: Supercharge Your Business Solutions with Event Streaming and Real-Time Analytics (30 minutes) - 10:00am PT | 1:00pm ET | 5:00pm UTC/GMT

  • Real-time data analytics relies on the super-fast ingestion of data made possible by leveraging powerful and efficient event-driven streaming platforms like Apache Pulsar, enabling data to flow through disparate systems in a dynamic fashion. Mary Grygleski demonstrates a simple data flow in which data ingested from Pulsar is streamed to another open source analytics engine, and then runs some basic queries against it using the SDK.
  • Mary Grygleski is a Java Champion and a passionate senior developer advocate at DataStax, a leading data management company that champions OSS and specializes in Big Data/NoSQL, streaming, managed data cloud platforms, and real-time AI systems. She has over 20 years of hands-on software engineering and technical architecture experience in Java and open source and is president of the 3,000-member Chicago Java Users Group.
  • Break (5 minutes)

Christopher Andrassy: Real-Time Analytics—Revolutionizing Operational Efficiency in Manufacturing (30 minutes) - 10:35am PT | 1:35pm ET | 5:35pm UTC/GMT

  • Unlocking the transformative potential of real-time analytics can greatly improve operational efficiency in the manufacturing sector. Addressing the industry's pressing challenges, Christopher Andrassy explores data sources, techniques, and tools that enable the effective implementation of real-time analytics in the industry, showcasing its practical applications and demonstrating its impact on areas such as bottleneck resolution, predictive maintenance, inventory optimization, and workforce management. You’ll learn the importance of collaboration between data professionals and the business, of ensuring data quality, and of adapting to the evolving landscape of manufacturing technology. You’ll leave understanding the need to embrace real-time analytics to drive growth and competitiveness—and how to take steps in this direction.
  • Chris Andrassy is an entrepreneur focused on transforming data into superior business outcomes on a global scale. He began his career at PwC, supporting the digital transformation of mature organizations struggling to innovate in a hyper-competitive world. After experiencing the limitations of traditional analytics practices, he began a new chapter as cofounder of Astral Insights, a Raleigh-based decision intelligence consulting firm delivering solutions to supercharge corporate decision-making. Chris is also an investor focused on innovative technologies including synthetic biology, sustainable energy, AI, and genomics. He’s an avid musician, skier, traveler, and fitness enthusiast.

Steve Wilcockson: The Prime Time-ification of Real Time Analytics—The Frontier of Innovation (30 minutes) - 11:05am PT | 2:05pm ET | 6:05pm UTC/GMT

  • Technologies including cloud, generative AI, ML, NLP, in-memory, microservices, and data timehouses are reshaping industries and evolving real-time business as the frontier of innovation. While elementary versions of these technologies have long existed, their capabilities are now accessible and simplified by vendors who have made integration and synchronicity an integral product strategy. Organizations that leverage the full scope of these technologies can better exploit the maximum potential of data to make smarter decisions, react more quickly to changes in the market, accelerate key processes, and create new data-driven strategies not possible until now. Join Steve Wilcockson to explore a unified, low-latency, and low-complexity stack that supports real-time business and data requirements by optimizing analytic workflows of streaming, vector, and matrix data, making real-time analytics achievable.
  • Steve Wilcockson enjoys model-led and data-driven technologies, particularly in financial services. At KX, Steve advocates for the world’s fastest kdb vector and time series database and analytics engine, particularly within data science and machine learning research to production workflows. Previously, Steve was a product marketer for Java runtime specialist Azul, where he worked in conjunction with popular open source streaming and big data technologies such as Kafka, Spark, and Cassandra; market development lead at “altdata” sustainable finance satellite imagery data provider Geospatial Insight; and financial services industry manager at MathWorks, a.k.a. MATLAB.

Karin Wolok and Yingjun Wu: Battle of the Stream Processing Titans—Flink Versus RisingWave (30 minutes) - 11:35am PT | 2:35pm ET | 6:35pm UTC/GMT

  • The world of real-time data processing is constantly evolving, with new technologies and platforms emerging to meet the ever-increasing demands of modern data-driven businesses. Apache Flink and RisingWave are two powerful stream processing solutions that have gained significant traction in recent years. But which platform is right for your organization? Karin Wolok and Yingjun Wu go head-to-head to compare and contrast the strengths and limitations of Flink and RisingWave. They’ll also share real-world use cases, best practices for optimizing performance and efficiency, and key considerations for selecting the right solution for your specific business needs.
  • Yingjun Wu is the founder of RisingWave Labs, a database company developing RisingWave, a distributed SQL database for stream processing. Previously, Yingjun was a software engineer on Amazon Web Services’ Redshift team and a researcher at IBM Almaden Research Center. Yingjun received his PhD in computer science from National University of Singapore and was a visiting PhD student at Carnegie Mellon University. He’s been working in the field of stream processing and database systems for over a decade.

Lorien Pratt: Closing Remarks (5 minutes) - 12:05pm PT | 3:05pm ET | 7:05pm UTC/GMT

  • Lorien Pratt closes out today’s event.

Upcoming Data Superstream events:

  • The Data Engineering Lifecycle - October 4, 2023

Your Host

  • Lorien Pratt

    Computer scientist Dr. Lorien Pratt is cofounder of Quantellia and one of the pioneers of artificial intelligence. Recognized among Women Innovators and Inventors as the inventor of transfer learning, she continues to push the boundaries of technology as a creator of and evangelist for decision intelligence. She’s the author of Link: How Decision Intelligence Connects Data, Actions, and Outcomes for a Better World and The Decision Intelligence Handbook (O’Reilly, 2023). With appearances on CSPAN and NPR, DI leadership at Europe’s Earth Systems Predictability forum, plus many keynotes and podcasts, Lorien’s work in AI and DI has broad implications for “wicked,” invisible, exponential, multi-link problems—including climate, inequality, energy, and complex problems faced by commercial organizations in a changing world—whose mitigation or solutions depend on better decisions by billions of people. Pratt blogs at www.lorienpratt.com.

Skill covered

Real-Time Analytics

Sponsored by

  • Informatica