Practical Data Science with R video edition

Video description

"A unique and important addition to any data scientist’s library."
Jim Porzak, Cofounder, Bay Area R Users Group

Practical Data Science with R lives up to its name. It explains basic principles without the theoretical mumbo-jumbo and jumps right to the real use cases you'll face as you collect, curate, and analyze the data crucial to the success of your business. It shows you how to design experiments (such as A/B tests), build predictive models, and present results to audiences of all levels. You'll apply the R programming language and statistical analysis techniques to carefully explained examples based in marketing, business intelligence, and decision support.

Business analysts and developers are increasingly collecting, curating, analyzing, and reporting on crucial business data. The R language and its associated tools provide a straightforward way to tackle day-to-day data science tasks without a lot of academic theory or advanced mathematics.

Inside:

  • Data science for the business professional
  • Statistical analysis using the R language
  • Project lifecycle, from planning to delivery
  • Numerous instantly familiar use cases
  • Keys to effective data presentations

This book is accessible to readers without a background in data science. Some familiarity with basic statistics, R, or another scripting language is assumed.

Nina Zumel and John Mount are cofounders of a San Francisco-based data science consulting firm. Both hold PhDs from Carnegie Mellon and blog on statistics, probability, and computer science at win-vector.com.

"Covers the process end-to-end, from data exploration to modeling to delivering the results."
Nezih Yigitbasi, Intel

"Full of useful gems for both aspiring and experienced data scientists."
Fred Rahmanian, Siemens Healthcare

"Hands-on data analysis with real-world examples. Highly recommended."
Dr. Kostas Passadis, IPTO

NARRATED BY JOSEF GAGNIER

Table of contents

  1. Chapter 1. The data science process
  2. Chapter 1. Stages of a data science project
  3. Chapter 1. Modeling
  4. Chapter 1. Setting expectations
  5. Chapter 2. Loading data into R
  6. Chapter 2. Using R on less-structured data
  7. Chapter 2. Working with relational databases
  8. Chapter 2. Loading data from a database into R
  9. Chapter 3. Exploring data
  10. Chapter 3. Typical problems revealed by data summaries
  11. Chapter 3. Spotting problems using graphics and visualization
  12. Chapter 3. Visually checking distributions for a single variable
  13. Chapter 3. Visually checking relationships between two variables
  14. Chapter 4. Managing data
  15. Chapter 4. Data transformations
  16. Chapter 4. Sampling for modeling and validation
  17. Chapter 5. Choosing and evaluating models
  18. Chapter 5. Solving scoring problems
  19. Chapter 5. Evaluating models
  20. Chapter 5. Evaluating scoring models
  21. Chapter 5. Evaluating probability models
  22. Chapter 5. Evaluating ranking models
  23. Chapter 5. Validating models
  24. Chapter 5. Ensuring model quality
  25. Chapter 6. Memorization methods
  26. Chapter 6. Building single-variable models
  27. Chapter 6. Using cross-validation to estimate effects of overfitting
  28. Chapter 6. Building models using many variables
  29. Chapter 6. Using nearest neighbor methods
  30. Chapter 6. Using Naive Bayes
  31. Chapter 6. Summary
  32. Chapter 7. Linear and logistic regression
  33. Chapter 7. Building a linear regression model
  34. Chapter 7. Finding relations and extracting advice
  35. Chapter 7. Reading the model summary and characterizing coefficient quality
  36. Chapter 7. Statistics as an attempt to correct bad experimental design
  37. Chapter 7. Using logistic regression
  38. Chapter 7. Building a logistic regression model
  39. Chapter 7. Finding relations and extracting advice from logistic models
  40. Chapter 7. Reading the model summary and characterizing coefficients
  41. Chapter 7. Null and residual deviances
  42. Chapter 7. Logistic regression takeaways
  43. Chapter 8. Unsupervised methods
  44. Chapter 8. Hierarchical clustering with hclust()
  45. Chapter 8. Picking the number of clusters
  46. Chapter 8. The k-means algorithm
  47. Chapter 8. Association rules
  48. Chapter 8. Mining association rules with the arules package
  49. Chapter 8. Association rule takeaways
  50. Chapter 9. Exploring advanced methods
  51. Chapter 9. Using bagging to improve prediction
  52. Chapter 9. Using random forests to further improve prediction
  53. Chapter 9. Using generalized additive models (GAMs) to learn non-monotone relationships
  54. Chapter 9. Extracting the nonlinear relationships
  55. Chapter 9. Using kernel methods to increase data separation
  56. Chapter 9. Using an explicit kernel on a problem
  57. Chapter 9. Using SVMs to model complicated decision boundaries
  58. Chapter 9. Trying an SVM on artificial example data
  59. Chapter 9. Support vector machine takeaways
  60. Chapter 10. Documentation and deployment
  61. Chapter 10. Using knitr to produce milestone documentation
  62. Chapter 10. Using knitr to document the buzz data
  63. Chapter 10. Using comments and version control for running documentation
  64. Chapter 10. Using version control to record history
  65. Chapter 10. Using version control to explore your project
  66. Chapter 10. Using version control to share work
  67. Chapter 10. Deploying models
  68. Chapter 11. Producing effective presentations
  69. Chapter 11. Summarizing the project’s goals
  70. Chapter 11. Presenting your model to end users
  71. Chapter 11. Presenting your work to other data scientists

Product information

  • Title: Practical Data Science with R video edition
  • Author(s): Nina Zumel, John Mount
  • Release date: March 2014
  • Publisher(s): Manning Publications