Cloud Analytics with Google Cloud Platform

by Sanket Thodge
April 2018
Content level: Beginner to intermediate
282 pages
9h 53m
English
Packt Publishing

The Dataflow programming model

The Cloud Dataflow runner service executes data processing jobs created with the Dataflow SDK, whose programming model simplifies large-scale data processing.

The programming model is divided into four major components:

  • Pipelines: A pipeline represents a single, repeatable job from start to finish
  • PCollections: A PCollection represents a set of data in your pipeline
  • Transforms: A transform performs processing on the elements of a PCollection
  • I/O sources and sinks: These provide data source and data sink APIs for pipeline I/O
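
To make the relationship between these four components concrete, here is a minimal toy sketch in plain Python. It is not the Dataflow SDK: every name in it (PCollection, map_elements, filter_elements, read_from_lines, write_to_list, run_pipeline) is a hypothetical stand-in chosen for illustration, and everything runs in memory on one machine. It only shows how a source feeds a PCollection, how transforms produce new PCollections, and how a sink receives the result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PCollection:
    """Toy stand-in: an immutable set of elements in the pipeline."""
    elements: tuple

    def apply(self, transform):
        # A transform never changes its input; it returns a new PCollection.
        return transform(self)


def map_elements(fn):
    """Toy transform: apply fn to every element."""
    def transform(pcoll):
        return PCollection(tuple(fn(e) for e in pcoll.elements))
    return transform


def filter_elements(predicate):
    """Toy transform: keep only the elements for which predicate is true."""
    def transform(pcoll):
        return PCollection(tuple(e for e in pcoll.elements if predicate(e)))
    return transform


def read_from_lines(lines):
    """Toy source: turn external data into the first PCollection."""
    return PCollection(tuple(lines))


def write_to_list(pcoll, sink):
    """Toy sink: write the final PCollection to an external destination."""
    sink.extend(pcoll.elements)


def run_pipeline(lines, sink):
    """Toy pipeline: one repeatable job, from reading input to writing output."""
    raw = read_from_lines(lines)
    cleaned = raw.apply(map_elements(str.strip)).apply(filter_elements(bool))
    lengths = cleaned.apply(map_elements(len))
    write_to_list(lengths, sink)


output = []
run_pipeline(["alpha ", "", " be"], output)
print(output)  # [5, 2]
```

Because each transform returns a new PCollection instead of modifying its input, the same pipeline can be run again on the same input and will produce the same output. In the real service, the runner distributes this work across machines rather than looping over a tuple in memory.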

Let's discuss them one by one in the following topics.


ISBN: 9781788839686