Up to this point, we have only done the cleanup in our heads. I personally find this a much more rewarding exercise: mentally cleaning up the data before actually cleaning it up. This is not because I'm highly confident that I will have handled all the irregularities in the data. Rather, I like this process because it clarifies what needs to be done, and that in turn guides the choice of data structures required for the job.
But once the thinking is done, it's time to validate it with code.
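Before reading the function itself, it may help to picture the kinds of inputs it takes. The following is only a rough sketch under stated assumptions: `data` is assumed to hold one `[]string` per row, `buildIndices` and `guessHints` are hypothetical helpers invented here for illustration, the "few distinct values means categorical" rule is an assumed heuristic rather than the method used in this chapter, and the sample values are made up.

```go
package main

import "fmt"

// buildIndices is a hypothetical helper. For each column it maps every
// distinct value to the row numbers in which that value occurs.
// Assumption: data holds one []string per row, aligned with hdr.
func buildIndices(hdr []string, data [][]string) []map[string][]int {
	indices := make([]map[string][]int, len(hdr))
	for j := range indices {
		indices[j] = make(map[string][]int)
	}
	for i, row := range data {
		for j, val := range row {
			if j >= len(hdr) {
				break // ignore cells beyond the header width
			}
			indices[j][val] = append(indices[j][val], i)
		}
	}
	return indices
}

// guessHints is a hypothetical heuristic: a column is flagged as
// categorical when it has at most maxCard distinct values.
func guessHints(indices []map[string][]int, maxCard int) []bool {
	hints := make([]bool, len(indices))
	for j, m := range indices {
		hints[j] = len(m) <= maxCard
	}
	return hints
}

func main() {
	// Made-up sample data, for illustration only.
	hdr := []string{"colour", "size"}
	data := [][]string{
		{"red", "8450"},
		{"red", "9600"},
		{"blue", "11250"},
	}

	indices := buildIndices(hdr, data)
	hints := guessHints(indices, 2)

	fmt.Println(indices[0]["red"]) // rows in which colour == "red"
	fmt.Println(hints)             // one categorical flag per column
}
```

The point of the sketch is the shape of the values: one header name per column, one map per column from value to row numbers, and one bool per column saying whether to treat it as categorical.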
We start with the clean function:
// hints is a slice of bools indicating whether it's a categorical variable
func clean(hdr []string, data [][]string, indices []map[string][]int, hints []bool, ignored []string) (int, int, []float64, ...