Chapter 5. The Data Analytics Stack
By this point in the book, you are well versed in the key principles and methods of analytics, having learned them in Excel. This chapter serves as an interlude before the following parts of the book, where you’ll carry that existing knowledge over to R and Python.
This chapter will further delineate the disciplines of statistics, data analytics, and data science, and we’ll take a deep dive into how Excel, R, and Python all play into what I call the data analytics stack.
Statistics Versus Data Analytics Versus Data Science
The focus of this book is helping you master principles of data analytics. But as you’ve seen, statistics is so core to analytics that it’s often hard to delineate where one field ends and the other begins. To compound the confusion, you may also be interested in how data science fits into the mix. Let’s take a moment to tighten these distinctions.
Statistics is foremost concerned with the methods for collecting, analyzing, and presenting data. We’ve borrowed a lot from the field: for example, we made inferences about a population given a sample, and we depicted distributions and relationships in the data using charts like histograms and scatterplots.
Most of the tests and techniques we’ve used so far come from statistics, such as linear regression and the independent samples t-test. What distinguishes data analytics from statistics is not necessarily the means, but the ends.
With data analytics, ...