Learning R Programming

Book Description

Become an efficient data scientist with R

About This Book

  • Explore the R language from basic types and data structures to advanced topics
  • Learn how to tackle programming problems and explore both functional and object-oriented programming techniques
  • Learn how to address the core problems of programming in R and leverage the most popular packages for common tasks

Who This Book Is For

This is the perfect tutorial for anyone who is new to statistical programming and modeling. Anyone with basic programming and data processing skills can pick this book up to systematically learn the R programming language and crucial techniques.

What You Will Learn

  • Explore the basic functions in R and familiarize yourself with common data structures
  • Work with data in R using basic functions of statistics, data mining, data visualization, root solving, and optimization
  • Get acquainted with R’s evaluation model with environments and meta-programming techniques with symbol, call, formula, and expression
  • Get to grips with object-oriented programming in R, including the S3, S4, RC, and R6 systems
  • Access relational databases such as SQLite and non-relational databases such as MongoDB and Redis
  • Get to know high performance computing techniques such as parallel computing and Rcpp
  • Use web scraping techniques to extract information
  • Create R Markdown documents, interactive apps with Shiny, diagrams with DiagrammeR, interactive charts with ggvis, and more

In Detail

R is a high-level functional language and one of the must-know tools for data science and statistics. Powerful but complex, R can be challenging for beginners and those unfamiliar with its unique behaviors. Learning R Programming is the solution: an easy and practical way to learn R and develop a broad and consistent understanding of the language. Through hands-on examples, you'll discover powerful R tools and R best practices that will give you a deeper understanding of working with data. You'll get to grips with R's data structures and data processing techniques, as well as the most popular R packages to boost your productivity from the outset.

Start with the basics of R, then dive deep into the programming techniques and paradigms to make your R code excel. Advance quickly to a deeper understanding of R's behavior as you learn common tasks including data analysis, databases, web scraping, high performance computing, and writing documents. By the end of the book, you'll be a confident R programmer adept at solving problems with the right techniques.

Style and approach

Developed to make learning easy and intuitive, this book comes packed with a wide variety of statistical and graphical techniques and a wealth of practical information for anyone looking to get started with this exciting and powerful language.

Downloading the example code for this book: You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to receive the code files.

Table of Contents

  1. Learning R Programming
    1. Learning R Programming
    2. Credits
    3. About the Author
    4. About the Reviewer
    5. www.PacktPub.com
      1. Why subscribe?
    6. Preface
      1. What this book covers
      2. What you need for this book
      3. Who this book is for
      4. Conventions
      5. Reader feedback
      6. Customer support
        1. Downloading the example code
        2. Errata
        3. Piracy
        4. Questions
    7. 1. Quick Start
      1. Introducing R
        1. R as a programming language
        2. R as a computing environment
        3. R as a community
        4. R as an ecosystem
      2. The need for R
      3. Installing R
      4. RStudio
        1. RStudio's user interface
          1. The console
          2. The editor
          3. The Environment pane
          4. The History pane
          5. The File pane
          6. The Plots pane
          7. The Packages pane
          8. The Help pane
          9. The Viewer pane
        2. RStudio Server
      5. A quick example
      6. Summary
    8. 2. Basic Objects
      1. Vector
        1. Numeric vector
        2. Logical vector
        3. Character vector
        4. Subsetting vectors
        5. Named vectors
        6. Extracting an element
        7. Telling the class of vectors
        8. Converting vectors
        9. Arithmetic operators for numeric vectors
      2. Matrix
        1. Creating a matrix
        2. Naming rows and columns
        3. Subsetting a matrix
        4. Using matrix operators
      3. Array
        1. Creating an array
        2. Subsetting an array
      4. Lists
        1. Creating a list
        2. Extracting an element from a list
        3. Subsetting a list
        4. Named lists
        5. Setting values
        6. Other functions
      5. Data frames
        1. Creating a data frame
        2. Naming rows and columns
        3. Subsetting a data frame
          1. Subsetting a data frame as a list
          2. Subsetting a data frame as a matrix
          3. Filtering data
        4. Setting values
          1. Setting values as a list
          2. Setting values as a matrix
        5. Factors
        6. Useful functions for data frames
        7. Loading and writing data on disk
      6. Functions
        1. Creating a function
        2. Calling a function
        3. Dynamic typing
        4. Generalizing a function
        5. Default value for function arguments
      7. Summary
    9. 3. Managing Your Workspace
      1. R's working directory
        1. Creating an R project in RStudio
        2. Comparing absolute and relative paths
        3. Managing project files
      2. Inspecting the environment
        1. Inspecting existing symbols
        2. Viewing the structure of an object
        3. Removing symbols
      3. Modifying global options
        1. Modifying the number of digits to print
        2. Modifying the warning level
      4. Managing the library of packages
        1. Getting to know a package
        2. Installing packages from CRAN
        3. Updating packages from CRAN
        4. Installing packages from online repositories
        5. Using package functions
        6. Masking and name conflicts
        7. Checking whether a package is installed
      5. Summary
    10. 4. Basic Expressions
      1. Assignment expressions
        1. Alternative assignment operators
        2. Using backticks with non-standard names
      2. Conditional expressions
        1. Using if as a statement
        2. Using if as an expression
        3. Using if with vectors
        4. Using vectorized if: ifelse
        5. Using switch to branch values
      3. Loop expressions
        1. Using the for loop
          1. Managing the flow of a for loop
          2. Creating nested for loops
        2. Using the while loop
      4. Summary
    11. 5. Working with Basic Objects
      1. Using object functions
        1. Testing object types
          1. Accessing object classes and types
        2. Accessing data dimensions
          1. Getting data dimensions
          2. Reshaping data structures
          3. Iterating over one dimension
      2. Using logical functions
        1. Logical operators
        2. Logical functions
          1. Aggregating logical vectors
          2. Asking which elements are TRUE
        3. Dealing with missing values
        4. Logical coercion
      3. Using math functions
        1. Basic functions
        2. Number rounding functions
        3. Trigonometric functions
        4. Hyperbolic functions
        5. Extreme functions
      4. Applying numeric methods
        1. Root finding
        2. Calculus
          1. Derivatives
          2. Integration
      5. Using statistical functions
        1. Sampling from a vector
        2. Working with random distributions
        3. Computing summary statistics
          1. Computing covariance and correlation matrix
      6. Using apply-family functions
        1. lapply
        2. sapply
        3. vapply
        4. mapply
        5. apply
      7. Summary
    12. 6. Working with Strings
      1. Getting started with strings
        1. Printing texts
        2. Concatenating strings
        3. Transforming texts
          1. Changing cases
          2. Counting characters
          3. Trimming leading and trailing whitespaces
          4. Substring
          5. Splitting texts
        4. Formatting texts
          1. Using Python string functions in R
      2. Formatting date/time
        1. Parsing text as date/time
        2. Formatting date/time to strings
      3. Using regular expressions
        1. Finding a string pattern
        2. Using groups to extract the data
        3. Reading data in customizable ways
      4. Summary
    13. 7. Working with Data
      1. Reading and writing data
        1. Reading and writing text-format data in a file
          1. Importing data via RStudio IDE
          2. Importing data using built-in functions
          3. Importing data using the readr package
          4. Writing a data frame to a file
        2. Reading and writing Excel worksheets
        3. Reading and writing native data files
          1. Reading and writing a single object in native format
          2. Saving and restoring the working environment
        4. Loading built-in datasets
      2. Visualizing data
        1. Creating scatter plots
          1. Customizing chart elements
          2. Customizing point styles
          3. Customizing point colors
        2. Creating line plots
          1. Customizing line type and width
          2. Plotting lines in multiple periods
          3. Plotting lines with points
          4. Plotting a multi-series chart with a legend
        3. Creating bar charts
        4. Creating pie charts
        5. Creating histogram and density plots
        6. Creating box plots
      3. Analyzing data
        1. Fitting a linear model
        2. Fitting a regression tree
      4. Summary
    14. 8. Inside R
      1. Understanding lazy evaluation
      2. Understanding the copy-on-modify mechanism
        1. Modifying objects outside a function
      3. Understanding lexical scoping
      4. Understanding how an environment works
        1. Knowing the environment object
        2. Creating and chaining environments
          1. Accessing an environment
        3. Chaining environments
          1. Using environments for reference semantics
          2. Knowing the built-in environments
        4. Understanding environments associated with a function
      5. Summary
    15. 9. Metaprogramming
      1. Understanding functional programming
        1. Creating and using closures
          1. Creating a simple closure
          2. Making specialized functions
          3. Fitting normal distribution with maximal likelihood estimation
        2. Using higher-order functions
          1. Creating aliases for functions
          2. Using functions as variables
          3. Passing functions as arguments
      2. Computing on language
        1. Capturing and modifying expressions
          1. Capturing expressions as language objects
          2. Modifying expressions
          3. Capturing expressions of function arguments
          4. Constructing function calls
        2. Evaluating expressions
        3. Understanding non-standard evaluation
          1. Implementing quick subsetting using non-standard evaluation
          2. Understanding dynamic scoping
          3. Using formulas to capture expression and environment
          4. Implementing subset with metaprogramming
      3. Summary
    16. 10. Object-Oriented Programming
      1. Introducing object-oriented programming
        1. Understanding classes and methods
        2. Understanding inheritance
      2. Working with the S3 object system
        1. Understanding generic functions and method dispatch
        2. Working with built-in classes and methods
        3. Defining generic functions for existing classes
        4. Creating objects of new classes
          1. Using list as the underlying data structure
          2. Using an atomic vector as the underlying data structure
          3. Understanding S3 inheritance
      3. Working with S4
        1. Defining S4 classes
        2. Understanding S4 inheritance
        3. Defining S4 generic functions
        4. Understanding multiple dispatch
      4. Working with the reference class
      5. Working with R6
      6. Summary
    17. 11. Working with Databases
      1. Working with relational databases
        1. Creating a SQLite database
          1. Writing multiple tables to a database
          2. Appending data to a table
        2. Accessing tables and table fields
        3. Learning SQL to query relational databases
        4. Fetching query results chunk by chunk
        5. Using transactions for consistency
        6. Storing data in files to a database
      2. Working with NoSQL databases
        1. Working with MongoDB
          1. Querying data from MongoDB
          2. Creating and removing indexes
        2. Using Redis
          1. Accessing Redis from R
          2. Setting and getting values from the Redis server
      3. Summary
    18. 12. Data Manipulation
      1. Using built-in functions to manipulate data frames
        1. Using built-in functions to manipulate data frames
        2. Reshaping data frames using reshape2
      2. Using SQL to query data frames via the sqldf package
      3. Using data.table to manipulate data
        1. Using key to access rows
        2. Summarizing data by groups
        3. Reshaping data.table
        4. Using in-place set functions
        5. Understanding dynamic scoping of data.table
      4. Using dplyr pipelines to manipulate data frames
      5. Using rlist to work with nested data structures
      6. Summary
    19. 13. High-Performance Computing
      1. Understanding code performance issues
        1. Measuring code performance
      2. Profiling code
        1. Profiling code with Rprof
        2. Profiling code with profvis
        3. Understanding why code can be slow
      3. Boosting code performance
        1. Using built-in functions
        2. Using vectorization
        3. Using byte-code compiler
        4. Using Intel MKL-powered R distribution
        5. Using parallel computing
          1. Using parallel computing on Windows
          2. Using parallel computing on Linux and MacOS
        6. Using Rcpp
          1. OpenMP
          2. RcppParallel
      4. Summary
    20. 14. Web Scraping
      1. Looking inside web pages
      2. Extracting data from web pages using CSS selectors
      3. Learning XPath selectors
      4. Analysing HTML code and extracting data
      5. Summary
    21. 15. Boosting Productivity
      1. Writing R Markdown documents
        1. Getting to know markdown
        2. Integrating R into Markdown
        3. Embedding tables and charts
          1. Embedding tables
          2. Embedding charts and diagrams
          3. Embedding interactive plots
      2. Creating interactive apps
        1. Creating a shiny app
        2. Using shinydashboard
      3. Summary