Learn the techniques and math you need to start making sense of your data
About This Book
- Enhance your knowledge of coding with data science theory for practical insight into data science and analysis
- More than just a math class, learn how to perform real-world data science tasks with R and Python
- Create actionable insights and transform raw data into tangible value
Who This Book Is For
You should be fairly well acquainted with basic algebra and should feel comfortable reading snippets of R/Python as well as pseudocode. You should have the urge to learn and apply the techniques put forth in this book on either your own data sets or those provided to you. If you have basic math skills but want to apply them in data science, or you have good programming skills but lack the math, then this book is for you.
What You Will Learn
- Get to know the five most important steps of data science
- Use your data intelligently and learn how to handle it with care
- Bridge the gap between mathematics and programming
- Learn about probability, calculus, and how to use statistical models to control and clean your data and drive actionable results
- Build and evaluate baseline machine learning models
- Explore the most effective metrics to determine the success of your machine learning models
- Create data visualizations that communicate actionable insights
- Read and apply machine learning concepts to your problems and make actual predictions
Need to turn your skills at programming into effective data science skills? Principles of Data Science is created to help you join the dots between mathematics, programming, and business analysis. With this book, you’ll feel confident about asking—and answering—complex and sophisticated questions of your data to move from abstract and raw statistics to actionable ideas.
With a unique approach that bridges the gap between mathematics and computer science, this book takes you through the entire data science pipeline. Beginning with cleaning and preparing data, and effective data mining strategies and techniques, you’ll move on to build a comprehensive picture of how every piece of the data science puzzle fits together. Learn the fundamentals of computational mathematics and statistics, as well as some pseudocode being used today by data scientists and analysts. You’ll get to grips with machine learning, discover the statistical models that help you take control and navigate even the densest datasets, and find out how to create powerful visualizations that communicate what your data means.
Style and approach
This is an easy-to-understand and accessible tutorial. It is a step-by-step guide with use cases, examples, and illustrations to get you well versed in the concepts of data science. Along with explaining the fundamentals, the book also introduces slightly more advanced concepts later on and helps you implement these techniques in the real world.
Downloading the example code for this book. You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the code files e-mailed directly to you.
Table of contents
Principles of Data Science
- Table of Contents
- Principles of Data Science
- About the Author
- About the Reviewers
1. How to Sound Like a Data Scientist
- What is data science?
- The data science Venn diagram
- Some more terminology
- Data science case studies
2. Types of Data
- Flavors of data
- Why look at these distinctions?
- Structured versus unstructured data
- Quantitative versus qualitative data
- The road thus far…
- The four levels of data
- The nominal level
- The ordinal level
- The interval level
- The ratio level
- Data is in the eye of the beholder
3. The Five Steps of Data Science
- Introduction to data science
- Overview of the five steps
- Explore the data
4. Basic Mathematics
5. Impossible or Improbable – A Gentle Introduction to Probability
6. Advanced Probability
- Collectively exhaustive events
- Bayesian ideas revisited
- Random variables
7. Basic Statistics
- What are statistics?
- How do we obtain and sample data?
- Sampling data
- How do we measure statistics?
- The Empirical rule
8. Advanced Statistics
- Point estimates
- Sampling distributions
- Confidence intervals
- Conducting a hypothesis test
- One sample t-tests
- Type I and type II errors
- Hypothesis test for categorical variables
9. Communicating Data
- Why does communication matter?
- Identifying effective and ineffective visualizations
- When graphs and statistics lie
- Verbal communication
- The why/how/what strategy of presenting
10. How to Tell If Your Toaster Is Learning – Machine Learning Essentials
- What is machine learning?
- Machine learning isn't perfect
- How does machine learning work?
- Types of machine learning
- How does statistical modeling fit into all of this?
- Linear regression
- Logistic regression
- Probability, odds, and log odds
- Dummy variables
11. Predictions Don't Grow on Trees – or Do They?
- Naïve Bayes classification
- Decision trees
- Unsupervised learning
- K-means clustering
- Choosing an optimal number for K and cluster validation
- Feature extraction and principal component analysis
12. Beyond the Essentials
- The bias variance tradeoff
- K folds cross-validation
- Grid searching
- Ensembling techniques
- Neural networks
13. Case Studies
- Case study 1 – predicting stock prices based on social media
- Case study 2 – why do some people cheat on their spouses?
- Case study 3 – using TensorFlow
- Title: Principles of Data Science
- Release date: December 2016
- Publisher(s): Packt Publishing
- ISBN: 9781785887918