Trill: The crown jewel of Microsoft’s streaming pipeline explained

Video description

The Trill data engine is the power behind many of Microsoft’s offerings, from products like Azure Stream Analytics to billion-dollar services like Bing Ads. It has now been open-sourced and is available to everyone. But it has been a long path to get there.

James Terwilliger, Badrish Chandramouli, and Jonathan Goldstein explore the decades-long history of streaming data processing at Microsoft: a beginning in research, a first product in StreamInsight, the transition to the cloud, and all the pain points along the way. A key result of that lineage and learning has been the Trill engine, which has three key properties: a single standalone data processing engine for all temporal data, whether the data is streamed or stored; a simple API that integrates seamlessly with the programming language; and performance without ego, a willingness to use every lesson learned to improve throughput in every way possible.

Through examples, they dive deep into why each of those properties is important: a simple application that demonstrates the basics of Trill (joins, aggregation, windowing); a more complicated application that demonstrates the power of Trill’s API (progressive windowing, regular expressions and pattern detection, data-dependent windows); and an overview of the kind of query used by Bing Ads, a query that runs a multi-billion-dollar business.

You’ll see a performance showcase: running the previous examples to demonstrate how Trill got its name—processing a trillion events per day on a single node.

Prerequisite knowledge

  • A working knowledge of streaming data systems (useful but not required)

What you'll learn

  • Learn about temporal data versus temporal queries and data-dependent and custom temporal windowing

This session is from the 2019 O'Reilly Strata Conference in New York, NY.

Product information

  • Title: Trill: The crown jewel of Microsoft’s streaming pipeline explained
  • Author(s): James Terwilliger, Badrish Chandramouli, Jonathan Goldstein
  • Release date: February 2020
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 0636920371922