Ideas | Data Tools

Ideas and resources related to data tools.

Timber stack.

Twitter's real-time data stack

In this excerpt, Karthik Ramasamy and Sijie Guo of Twitter discuss their operational experience with DistributedLog and Heron, two powerful real-time analytics tools that the company open-sourced in early 2016.

Sequence of a race horse galloping. Photos taken by Eadweard Muybridge (died 1904), first published in 1887 at Philadelphia.

Let's get real: Acting on data in real time

Companies are differentiating themselves by acting on data in real time. But what does “real time” really mean? Jack Norris discusses the challenges of coordinating data flows, analysis, and integration at scale to shape business as it happens.

An armillary sphere in a painting by Sandro Botticelli, circa 1480.

Data modeling constructs and terminology

Identification of data sources is the first step in warehouse development. In this video training segment, Michael Blaha provides a framework by reviewing data modeling constructs and terminology, including dependent and independent entity types. Using IE (information engineering) notation and the ERwin tool, Blaha walks through a sample operational data model.

La Resurrezione ("The Resurrection") by Pericle Fazzini in Vatican Museum. Photo by Michal Osmenda.

Understanding YARN’s architecture and daemons

In the new O’Reilly video training "Introduction to Hadoop YARN," David Yahalom explains everything you need to know about using this new data processing platform to extend Hadoop’s potential. In this segment, he walks through YARN’s architecture and daemons.