Book description
Learn data science by doing data science!
Data Science Using Python and R will get you plugged into the world’s two most widespread open-source platforms for data science: Python and R.
Data science is hot. Bloomberg called data scientist “the hottest job in America.” Python and R are the top two open-source data science tools in the world. In Data Science Using Python and R, you will learn step-by-step how to produce hands-on solutions to real-world business problems, using state-of-the-art techniques.
Data Science Using Python and R is written for the general reader with no previous analytics or programming experience. An entire chapter is dedicated to learning the basics of Python and R. Then, each chapter presents step-by-step instructions and walkthroughs for solving data science problems using Python and R.
Those with analytics experience will appreciate having a one-stop shop for learning how to do data science using Python and R. Topics covered include data preparation, exploratory data analysis, preparing to model the data, decision trees, model evaluation, misclassification costs, naïve Bayes classification, neural networks, clustering, regression modeling, dimension reduction, and association rules mining.
Further, exciting new topics such as random forests and generalized linear models are also included. The book emphasizes data-driven error costs to enhance profitability, helping readers avoid common pitfalls that may cost a company millions of dollars.
Data Science Using Python and R provides exercises at the end of every chapter, totaling over 500 exercises in the book. Readers will therefore have plenty of opportunity to test their newfound data science skills and expertise. In the Hands-on Analysis exercises, readers are challenged to solve interesting business problems using real-world data sets.
Table of contents
 COVER
 PREFACE
 ABOUT THE AUTHORS
 ACKNOWLEDGMENTS
 Chapter 1: INTRODUCTION TO DATA SCIENCE
 Chapter 2: THE BASICS OF PYTHON AND R
 Chapter 3: DATA PREPARATION
 Chapter 4: EXPLORATORY DATA ANALYSIS
 Chapter 5: PREPARING TO MODEL THE DATA
 Chapter 6: DECISION TREES

Chapter 7: MODEL EVALUATION
 7.1 INTRODUCTION TO MODEL EVALUATION
 7.2 CLASSIFICATION EVALUATION MEASURES
 7.3 SENSITIVITY AND SPECIFICITY
 7.4 PRECISION, RECALL, AND Fβ SCORES
 7.5 METHOD FOR MODEL EVALUATION
 7.6 AN APPLICATION OF MODEL EVALUATION
 7.7 ACCOUNTING FOR UNEQUAL ERROR COSTS
 7.8 COMPARING MODELS WITH AND WITHOUT UNEQUAL ERROR COSTS
 7.9 DATA‐DRIVEN ERROR COSTS
 EXERCISES
 Chapter 8: NAÏVE BAYES CLASSIFICATION

Chapter 9: NEURAL NETWORKS
 9.1 INTRODUCTION TO NEURAL NETWORKS
 9.2 THE NEURAL NETWORK STRUCTURE
 9.3 CONNECTION WEIGHTS AND THE COMBINATION FUNCTION
 9.4 THE SIGMOID ACTIVATION FUNCTION
 9.5 BACKPROPAGATION
 9.6 AN APPLICATION OF A NEURAL NETWORK MODEL
 9.7 INTERPRETING THE WEIGHTS IN A NEURAL NETWORK MODEL
 9.8 HOW TO USE NEURAL NETWORKS IN R
 REFERENCES
 EXERCISES
 Chapter 10: CLUSTERING

Chapter 11: REGRESSION MODELING
 11.1 THE ESTIMATION TASK
 11.2 DESCRIPTIVE REGRESSION MODELING
 11.3 AN APPLICATION OF MULTIPLE REGRESSION MODELING
 11.4 HOW TO PERFORM MULTIPLE REGRESSION MODELING USING PYTHON
 11.5 HOW TO PERFORM MULTIPLE REGRESSION MODELING USING R
 11.6 MODEL EVALUATION FOR ESTIMATION
 11.7 STEPWISE REGRESSION
 11.8 BASELINE MODELS FOR REGRESSION
 REFERENCES
 EXERCISES

Chapter 12: DIMENSION REDUCTION
 12.1 THE NEED FOR DIMENSION REDUCTION
 12.2 MULTICOLLINEARITY
 12.3 IDENTIFYING MULTICOLLINEARITY USING VARIANCE INFLATION FACTORS
 12.4 PRINCIPAL COMPONENTS ANALYSIS
 12.5 AN APPLICATION OF PRINCIPAL COMPONENTS ANALYSIS
 12.6 HOW MANY COMPONENTS SHOULD WE EXTRACT?
 12.7 PERFORMING PCA WITH k = 4
 12.8 VALIDATION OF THE PRINCIPAL COMPONENTS
 12.9 HOW TO PERFORM PRINCIPAL COMPONENTS ANALYSIS USING PYTHON
 12.10 HOW TO PERFORM PRINCIPAL COMPONENTS ANALYSIS USING R
 12.11 WHEN IS MULTICOLLINEARITY NOT A PROBLEM?
 REFERENCES
 EXERCISES
 Chapter 13: GENERALIZED LINEAR MODELS
 Chapter 14: ASSOCIATION RULES
 APPENDIX DATA SUMMARIZATION AND VISUALIZATION
 INDEX
 END USER LICENSE AGREEMENT
Product information
 Title: Data Science Using Python and R
 Author(s):
 Release date: April 2019
 Publisher(s): Wiley
 ISBN: 9781119526810