Introduction
Statistics and Data Science
As of this writing, the fields of statistics and data science are evolving rapidly to meet the changing needs of business, government, and research organizations. It is an oversimplification, but still useful, to think of two distinct communities as you proceed:
- The traditional academic and medical research communities that typically conduct extended research projects adhering to rigorous regulatory or publication standards, and
- Businesses and large organizations that use statistical methods to extract value from their data, often on the fly. Reliability and value are more important than academic rigor to this data science community.
Most users of statistical methods now fall in the second category, as those methods are a basic component of what is now called artificial intelligence (AI). However, most of the specific techniques, as well as the language of statistics, had their origin in the first group. As a result, there is a certain amount of “baggage” that is not truly relevant to the data science community. That baggage can sometimes be obscure or confusing and, in this book, we provide guidance on what is or is not important to data science. Another feature of this book is the use of resampling/simulation methods to develop the underpinnings of statistical inference (the most difficult topic in an introductory course) in a transparent and understandable fashion.
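To give a flavor of what a resampling approach can look like, here is a minimal sketch of a percentile bootstrap interval for a mean. It is an illustration under stated assumptions, not the method as the book develops it: the function name `bootstrap_mean_interval`, its parameters, and the sample values are all hypothetical and chosen only for this example.

```python
import random
import statistics


def bootstrap_mean_interval(data, n_resamples=2000, level=0.90, seed=0):
    """Percentile bootstrap interval for the mean (illustrative sketch only)."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    n = len(data)
    # Draw n values with replacement, record the mean, and repeat many times.
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Cut off an equal share of resampled means in each tail.
    tail = (1 - level) / 2
    low = means[int(tail * n_resamples)]
    high = means[int((1 - tail) * n_resamples) - 1]
    return low, high


# Made-up measurements, used only to exercise the function.
sample = [12.1, 9.8, 11.4, 10.6, 13.0, 9.5, 10.9, 11.7]
low, high = bootstrap_mean_interval(sample)
print(f"sample mean: {statistics.fmean(sample):.2f}")
print(f"approximate 90% interval: ({low:.2f}, {high:.2f})")
```

The point of the sketch is that the interval comes from repeatedly resampling the observed data rather than from a formula, which is what makes the reasoning easy to follow step by step.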
We start off with some examples of statistics in action ...