Chapter 3: Designing Data Pipelines

Google Cloud Professional Data Engineer Exam objectives covered in this chapter include the following:

  1. Designing data processing systems
    • ✔ 1.2 Designing data pipelines. Considerations include:
      • Data publishing and visualization (e.g., BigQuery)
      • Batch and streaming data (e.g., Cloud Dataflow, Cloud Dataproc, Apache Beam, Apache Spark and Hadoop ecosystem, Cloud Pub/Sub, Apache Kafka)
      • Online (interactive) vs. batch predictions
      • Job automation and orchestration (e.g., Cloud Composer)
  2. Building and operationalizing data processing systems
    • ✔ 2.2 Building and operationalizing pipelines. Considerations include:
      • Data cleansing
      • Batch and streaming
      • Transformation
      • Data acquisition and import
      • Integrating with new ...
