Cloud Analytics with Google Cloud Platform

by Sanket Thodge
April 2018
Content level: Beginner to intermediate
282 pages
9h 53m
English
Packt Publishing

PCollection (data)

PCollection is a specialized container class that can represent datasets of virtually unlimited size. A PCollection holds a pipeline's input, intermediate, and output data, and it is designed to support parallelized processing.

A PCollection must be created for any data the pipeline works on. PCollections are immutable: once a PCollection exists, its elements cannot be added, removed, or changed. They also don’t support random access to individual elements. Each PCollection belongs to the pipeline in which it is created, so it cannot be shared between Pipeline objects. A new PCollection can be generated from an existing one by applying a computation to it. We have a size factor ...


Publisher Resources

ISBN: 9781788839686