Book description
Get your statistics basics right before diving into the world of data science
About This Book
 No need to take a degree in statistics: read this book and get a strong statistics base for data science and real-world programs
 Implement statistics in data science tasks such as data cleaning, mining, and analysis
 Learn all about probability, statistics, numerical computations, and more with the help of R programs
Who This Book Is For
This book is intended for developers who want to enter the field of data science and are looking for concise information on statistics, supported by insightful programs and simple explanations. Some basic hands-on experience with R will be useful.
What You Will Learn
 Analyze the transition from a data developer to a data scientist mindset
 Get acquainted with the R programs and the logic used for statistical computations
 Understand mathematical concepts such as variance, standard deviation, probability, matrix calculations, and more
 Learn to implement statistics in data science tasks such as data cleaning, mining, and analysis
 Learn the statistical techniques required to perform tasks such as linear regression, regularization, model assessment, boosting, SVMs, and working with neural networks
 Get comfortable with performing various statistical computations for data science programmatically
In Detail
Data science is an ever-evolving field that is growing rapidly in popularity. It includes techniques and theories drawn from statistics, computer science, and, most importantly, machine learning, databases, data visualization, and so on.
This book takes you through an entire journey of statistics, from knowing very little to becoming comfortable using various statistical methods for data science tasks. It starts off with simple statistics and then moves on to the statistical methods that are used in data science algorithms. The R programs for statistical computation are clearly explained along with their logic. You will come across various mathematical concepts, such as variance, standard deviation, probability, matrix calculations, and more. You will learn only what is required to implement statistics in data science tasks such as data cleaning, mining, and analysis. You will learn the statistical techniques required to perform tasks such as linear regression, regularization, model assessment, boosting, SVMs, and working with neural networks.
By the end of the book, you will be comfortable with performing various statistical computations for data science programmatically.
Style and approach
A step-by-step, comprehensive guide with real-world examples
Downloading the example code for this book: You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to receive the code files.
Table of contents
 Preface
 Transitioning from Data Developer to Data Scientist

Declaring the Objectives

Key objectives of data science
 Collecting data
 Processing data
 Exploring and visualizing data
 Analyzing the data and/or applying machine learning to the data

 Deciding (or planning) based upon acquired insight
 Thinking like a data scientist
 Bringing statistics into data science

Common terminology
 Statistical population
 Probability
 False positives
 Statistical inference
 Regression
 Fitting
 Categorical data
 Classification
 Clustering
 Statistical comparison
 Coding
 Distributions
 Data mining
 Decision trees
 Machine learning
 Munging and wrangling
 Visualization
 D3
 Regularization
 Assessment
 Cross-validation
 Neural networks
 Boosting
 Lift
 Mode
 Outlier
 Predictive modeling
 Big Data
 Confidence interval
 Writing
 Summary

 A Developer's Approach to Data Cleaning
 Data Mining and the Database Developer
 Statistical Analysis for the Database Developer
 Database Progression to Database Regression
 Regularization for Database Improvement
 Database Development and Assessment
 Databases and Neural Networks
 Boosting your Database
 Database Classification using Support Vector Machines
 Database Structures and Machine Learning
Product information
 Title: Statistics for Data Science
 Author(s):
 Release date: November 2017
 Publisher(s): Packt Publishing
 ISBN: 9781788290678