Beyond Spreadsheets with R

Book description

With Beyond Spreadsheets with R, you’ll learn how to go from raw data to meaningful insights using R and RStudio. Each carefully crafted chapter covers a unique way to wrangle data, from understanding individual values to interacting with complex collections of data, including data you scrape from the web. You’ll build on simple programming techniques like loops and conditionals to create your own custom functions. You’ll come away with a toolkit of strategies for analyzing and visualizing data of all sorts.

Table of contents

  1. Titlepage
  2. Copyright
  3. Preface
  4. Acknowledgments
  5. About this book
    1. Who needs this book?
    2. How to read this book
      1. Formatting
      2. Structure
    3. Getting started
    4. Where to find more help
    5. More about this book
    6. Book forum
  6. About the author
  7. About the cover illustration
  8. Chapter 1: Introducing data and the R language
    1. 1.1 Data: What, where, how?
      1. 1.1.1 What is data?
      2. 1.1.2 Seeing the world as data sources
      3. 1.1.3 Data munging
      4. 1.1.4 What you can do with well-handled data
      5. 1.1.5 Data as an asset
      6. 1.1.6 Reproducible research and version control
    2. 1.2 Introducing R
      1. 1.2.1 The origins of R
      2. 1.2.2 What R is and what it isn’t
    3. 1.3 How R works
    4. 1.4 Introducing RStudio
      1. 1.4.1 Working with R within RStudio
      2. 1.4.2 Built-in packages (data and functions)
      3. 1.4.3 Built-in documentation
      4. 1.4.4 Vignettes
    5. 1.5 Try it yourself
    6. Terminology
    7. Summary
  9. Chapter 2: Getting to know R data types
    1. 2.1 Types of data
      1. 2.1.1 Numbers
      2. 2.1.2 Text (strings)
      3. 2.1.3 Categories (factors)
      4. 2.1.4 Dates and times
      5. 2.1.5 Logicals
      6. 2.1.6 Missing values
    2. 2.2 Storing values (assigning)
      1. 2.2.1 Naming data (variables)
      2. 2.2.2 Unchanging data
      3. 2.2.3 The assignment operators (<- vs. =)
    3. 2.3 Specifying the data type
    4. 2.4 Telling R to ignore something
    5. 2.5 Try it yourself
    6. Terminology
    7. Summary
  10. Chapter 3: Making new data values
    1. 3.1 Basic mathematics
    2. 3.2 Operator precedence
    3. 3.3 String concatenation (joining)
    4. 3.4 Comparisons
    5. 3.5 Automatic conversion (coercion)
    6. 3.6 Try it yourself
    7. Terminology
    8. Summary
  11. Chapter 4: Understanding the tools you’ll use: Functions
    1. 4.1 Functions
      1. 4.1.1 Under the hood
      2. 4.1.2 Function template
      3. 4.1.3 Arguments
      4. 4.1.4 Multiple arguments
      5. 4.1.5 Default arguments
      6. 4.1.6 Argument name matching
      7. 4.1.7 Partial matching
      8. 4.1.8 Scope
    2. 4.2 Packages
      1. 4.2.1 Installing packages
      2. 4.2.2 How does R (not) know about this function?
      3. 4.2.3 Namespaces
    3. 4.3 Messages, warnings, and errors, oh my!
      1. 4.3.1 Creating messages, warnings, and errors
      2. 4.3.2 Diagnosing messages, warnings, and errors
    4. 4.4 Testing
    5. 4.5 Project: Generalizing a function
    6. 4.6 Try it yourself
    7. Terminology
    8. Summary
  12. Chapter 5: Combining data values
    1. 5.1 Simple collections
      1. 5.1.1 Coercion
      2. 5.1.2 Missing values
      3. 5.1.3 Attributes
      4. 5.1.4 Names
    2. 5.2 Sequences
      1. 5.2.1 Vector functions
      2. 5.2.2 Vector math operations
    3. 5.3 Matrices
      1. 5.3.1 Naming dimensions
    4. 5.4 Lists
    5. 5.5 data.frames
    6. 5.6 Classes
      1. 5.6.1 The tibble class
      2. 5.6.2 Structures as function arguments
    7. 5.7 Try it yourself
    8. Terminology
    9. Summary
  13. Chapter 6: Selecting data values
    1. 6.1 Text processing
      1. 6.1.1 Text matching
      2. 6.1.2 Substrings
      3. 6.1.3 Text substitutions
      4. 6.1.4 Regular expressions
    2. 6.2 Selecting components from structures
      1. 6.2.1 Vectors
      2. 6.2.2 Lists
      3. 6.2.3 Matrices
    3. 6.3 Replacing values
    4. 6.4 data.frames and dplyr
      1. 6.4.1 dplyr verbs
      2. 6.4.2 Non-standard evaluation
      3. 6.4.3 Pipes
      4. 6.4.4 Subsetting data.frame the hard way
    5. 6.5 Replacing NA
    6. 6.6 Selecting conditionally
    7. 6.7 Summarizing values
    8. 6.8 A worked example: Excel vs. R
    9. 6.9 Try it yourself
      1. 6.9.1 Solutions — no peeking
    10. Terminology
    11. Summary
  14. Chapter 7: Doing things with lots of data
    1. 7.1 Tidy data principles
      1. 7.1.1 The working directory
      2. 7.1.2 Stored data formats
      3. 7.1.3 Reading data into R
      4. 7.1.4 Scraping data
      5. 7.1.5 Inspecting data
      6. 7.1.6 Dealing with odd values in data (sentinel values)
      7. 7.1.7 Converting to tidy data
    2. 7.2 Merging data
    3. 7.3 Writing data from R
    4. 7.4 Try it yourself
    5. Terminology
    6. Summary
  15. Chapter 8: Doing things conditionally: Control structures
    1. 8.1 Looping
      1. 8.1.1 Vectorization
      2. 8.1.2 Tidy repetition: Looping with purrr
      3. 8.1.3 for loops
    2. 8.2 Wider and narrower loop scope
      1. 8.2.1 while loops
    3. 8.3 Conditional evaluation
      1. 8.3.1 if conditions
      2. 8.3.2 ifelse conditions
    4. 8.4 Try it yourself
    5. Terminology
    6. Summary
  16. Chapter 9: Visualizing data: Plotting
    1. 9.1 Data preparation
      1. 9.1.1 Tidy data, revisited
      2. 9.1.2 Importance of data types
    2. 9.2 ggplot2
      1. 9.2.1 General construction
      2. 9.2.2 Adding points
      3. 9.2.3 Style aesthetics
      4. 9.2.4 Adding lines
      5. 9.2.5 Adding bars
      6. 9.2.6 Other types of plots
      7. 9.2.7 Scales
      8. 9.2.8 Facetting
      9. 9.2.9 Additional options
    3. 9.3 Plots as objects
    4. 9.4 Saving plots
    5. 9.5 Try it yourself
    6. Terminology
    7. Summary
  17. Chapter 10: Doing more with your data with extensions
    1. 10.1 Writing your own packages
      1. 10.1.1 Creating a minimal package
      2. 10.1.2 Documentation
    2. 10.2 Analyzing your package
      1. 10.2.1 Unit testing
      2. 10.2.2 Profiling
    3. 10.3 What to do next?
      1. 10.3.1 Regression
      2. 10.3.2 Clustering
      3. 10.3.3 Working with maps
      4. 10.3.4 Interacting with APIs
      5. 10.3.5 Sharing your package
    4. 10.4 More resources
    5. Terminology
    6. Summary
  18. Appendix A: Installing R
    1. Windows
    2. Mac
    3. Linux
    4. From source
  19. Appendix B: Installing RStudio
    1. Installing RStudio
    2. Packages used in this book
  20. Appendix C: Graphics in base R
  21. Index
  22. List of Figures
  23. List of Tables
  24. List of Listings

Product information

  • Title: Beyond Spreadsheets with R
  • Author(s): Jonathan Carroll
  • Release date: January 2019
  • Publisher(s): Manning Publications
  • ISBN: 9781617294594