2

Environment Setup

In this chapter, we will outline two environments for developing data engineering pipelines.

The first environment will use cloud-based tooling and services and will require no local setup. This is beneficial for two reasons. First, this type of environment is highly portable: you can access it from any machine and any location, as long as you have an internet connection and a browser. Second, it requires the least setup to get started. The downside is cost, because another organization maintains the systems you will be using for development.

The second environment will utilize your local machine to develop your pipeline ...
