## Book description

Statistical Computation for Programmers, Scientists, Quants, Excel Users, and Other Professionals

Using the open source R language, you can build powerful statistical models to answer many of your most challenging questions. R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone, Second Edition, is the solution.

Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you'll need to accomplish 80 percent of modern data tasks.

Lander's self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You'll download and install R; navigate and use the R environment; master basic program control, data import, manipulation, and visualization; and walk through several essential tests. Then, building on this foundation, you'll construct several complete models, both linear and nonlinear, and use some data mining techniques. After all this, you'll make your code reproducible with LaTeX, RMarkdown, and Shiny.

By the time you're done, you won't just know how to write R programs; you'll be ready to tackle the statistical problems you care about most.

Coverage includes

• Explore R, RStudio, and R packages

• Use R for math: variable types, vectors, calling functions, and more

• Exploit data structures, including data.frames, matrices, and lists

• Read many different types of data

• Create attractive, intuitive statistical graphics

• Write user-defined functions

• Control program flow with if, ifelse, and complex checks

• Improve program efficiency with group manipulations

• Combine and reshape multiple datasets

• Manipulate strings using R's facilities and regular expressions

• Create normal, binomial, and Poisson probability distributions

• Build linear, generalized linear, and nonlinear models

• Program basic statistics: mean, standard deviation, and t-tests

• Train machine learning models

• Assess the quality of models and variable selection

• Prevent overfitting and perform variable selection, using the Elastic Net and Bayesian methods

• Analyze univariate and multivariate time series data

• Group data via K-means and hierarchical clustering

• Prepare reports, slideshows, and web pages with knitr

• Display interactive data with RMarkdown and htmlwidgets

• Implement dashboards with Shiny

• Build reusable R packages with devtools and Rcpp

## Table of contents

• Title Page
• Dedication Page
• Contents
• Foreword
• Preface
• Acknowledgments
• 1. Getting R
  • 1.2 R Version
  • 1.3 32-bit vs. 64-bit
  • 1.4 Installing
  • 1.5 Microsoft R Open
  • 1.6 Conclusion
• 2. The R Environment
  • 2.1 Command Line Interface
  • 2.2 RStudio
  • 2.3 Microsoft Visual Studio
  • 2.4 Conclusion
• 3. R Packages
  • 3.1 Installing Packages
  • 3.3 Building a Package
  • 3.4 Conclusion
• 4. Basics of R
  • 4.1 Basic Math
  • 4.2 Variables
  • 4.3 Data Types
  • 4.4 Vectors
  • 4.5 Calling Functions
  • 4.6 Function Documentation
  • 4.7 Missing Data
  • 4.8 Pipes
  • 4.9 Conclusion
• 6. Reading Data into R
  • 6.2 Excel Data
  • 6.4 Data from Other Statistical Tools
  • 6.5 R Binary Files
  • 6.6 Data Included with R
  • 6.7 Extract Data from Web Sites
  • 6.9 Conclusion
• 7. Statistical Graphics
  • 7.1 Base Graphics
  • 7.2 ggplot2
  • 7.3 Conclusion
• 8. Writing R functions
  • 8.1 Hello, World!
  • 8.2 Function Arguments
  • 8.3 Return Values
  • 8.4 do.call
  • 8.5 Conclusion
• 9. Control Statements
• 10. Loops, the Un-R Way to Iterate
• 11. Group Manipulation
  • 11.1 Apply Family
  • 11.2 aggregate
  • 11.3 plyr
  • 11.4 data.table
  • 11.5 Conclusion
• 12. Faster Group Manipulation with dplyr
• 13. Iterating with purrr
  • 13.1 map
  • 13.2 map with Specified Types
  • 13.3 Iterating over a data.frame
  • 13.4 map with Multiple Inputs
  • 13.5 Conclusion
• 14. Data Reshaping
  • 14.1 cbind and rbind
  • 14.2 Joins
  • 14.3 reshape2
  • 14.4 Conclusion
• 15. Reshaping Data in the Tidyverse
• 16. Manipulating Strings
• 17. Probability Distributions
• 18. Basic Statistics
  • 18.1 Summary Statistics
  • 18.2 Correlation and Covariance
  • 18.3 T-Tests
  • 18.4 ANOVA
  • 18.5 Conclusion
• 19. Linear Models
  • 19.1 Simple Linear Regression
  • 19.2 Multiple Regression
  • 19.3 Conclusion
• 20. Generalized Linear Models
• 21. Model Diagnostics
• 22. Regularization and Shrinkage
• 23. Nonlinear Models
• 24. Time Series and Autocorrelation
• 25. Clustering
• 26. Model Fitting with Caret
  • 26.1 Caret Basics
  • 26.2 Caret Options
  • 26.3 Tuning a Boosted Tree
  • 26.4 Conclusion
• 27. Reproducibility and Reports with knitr
• 28. Rich Documents with RMarkdown
  • 28.1 Document Compilation
  • 28.3 Markdown Primer
  • 28.4 Markdown Code Chunks
  • 28.5 htmlwidgets
  • 28.6 RMarkdown Slideshows
  • 28.7 Conclusion
• 29. Interactive Dashboards with Shiny
• 30. Building R Packages
  • 30.1 Folder Structure
  • 30.2 Package Files
  • 30.3 Package Documentation
  • 30.4 Tests
  • 30.5 Checking, Building and Installing
  • 30.6 Submitting to CRAN
  • 30.7 C++ Code
  • 30.8 Conclusion
• A. Real-Life Resources
• B. Glossary
• List of Figures
• List of Tables
• General Index
• Index of Functions
• Index of Packages
• Index of People
• Data Index
• Code Snippets

## Product information

• Title: R for Everyone: Advanced Analytics and Graphics, 2nd Edition
• Author(s): Jared P. Lander
• Release date: June 2017