Chapter 7: Pachyderm Operations

In Chapter 6, Creating Your First Pipeline, we created our first pipeline, as well as learning how to create Pachyderm repositories, put data into a repository, create and run a pipeline, and view the results of the pipeline. We now know how to create a standard Pachyderm pipeline specification and include our scripts in it so that they can run against data in our input repository.

In this chapter, we will review all the different ways to put data inside of Pachyderm and export it to outside systems. We will learn how to update the code that runs inside of your pipeline and what the process of updating the pipeline specification is. We will learn how to build a Docker container and test it locally before uploading ...

Get Reproducible Data Science with Pachyderm now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.