Before we can create the pipeline on our deployed Pachyderm cluster, we need to create its input repositories. Remember, two data repositories, training and attributes, drive the rest of the pipeline. As we commit data into these repositories, Pachyderm will trigger the downstream stages of the pipeline and calculate the results.
We saw in Chapter 1, Gathering and Organizing Data, how to create data repositories in Pachyderm and commit data into them using pachctl, but here let's try to do this directly from a Go program. In this particular example, there is no real advantage to doing so, because we already have our training and test ...