Mastering Scientific Computing with R

Book Description

Employ professional quantitative methods to answer scientific questions with a powerful open source data analysis environment

In Detail

With this book, you will learn not just about R, but how to use R to answer conceptual, scientific, and experimental questions.

Beginning with an overview of fundamental R concepts, you'll learn how R can be used to accomplish the most commonly needed scientific data analysis tasks: testing for statistically significant differences between groups and modeling relationships in data. You will delve into linear algebra and matrix operations, with an emphasis not on R syntax but on how these operations can address common computational and analytical needs. This book also covers the application of matrix operations to finding structure in high-dimensional data, using principal component analysis, exploratory factor analysis, and confirmatory factor analysis, in addition to structural equation modeling. You will also master methods for simulation and learn about an advanced analytical method.

What You Will Learn

  • Master data management in R
  • Perform hypothesis tests using both parametric and nonparametric methods
  • Understand how to perform statistical modeling using linear methods
  • Model nonlinear relationships in data with kernel density methods
  • Use matrix operations to improve coding productivity
  • Utilize the observed data to model unobserved variables
  • Deal with missing data using multiple imputation
  • Simplify high-dimensional data using principal components, singular value decomposition, and factor analysis

Downloading the example code for this book: you can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

Table of Contents

  1. Mastering Scientific Computing with R
    1. Table of Contents
    2. Mastering Scientific Computing with R
    3. Credits
    4. About the Authors
    5. About the Reviewers
    6. www.PacktPub.com
      1. Support files, eBooks, discount offers, and more
        1. Why subscribe?
        2. Free access for Packt account holders
    7. Preface
      1. What this book covers
      2. What you need for this book
      3. Who this book is for
      4. Conventions
      5. Reader feedback
      6. Customer support
        1. Downloading the example code
        2. Downloading the color images of this book
        3. Errata
        4. Piracy
        5. Questions
    8. 1. Programming with R
      1. Data structures in R
        1. Atomic vectors
          1. Operations on vectors
        2. Lists
        3. Attributes
        4. Factors
        5. Multidimensional arrays
          1. Matrices
        6. Data frames
      2. Loading data into R
        1. Saving data frames
      3. Basic plots and the ggplot2 package
      4. Flow control
        1. The for() loop
          1. The apply() function
        2. The if() statement
        3. The while() loop
        4. The repeat{} and break statements
      5. Functions
      6. General programming and debugging tools
      7. Summary
    9. 2. Statistical Methods with R
      1. Descriptive statistics
        1. Data variability
          1. Confidence intervals
      2. Probability distributions
      3. Fitting distributions
        1. Higher order moments of a distribution
        2. Other statistical tests to fit distributions
          1. The propagate package
      4. Hypothesis testing
        1. Proportion tests
        2. Two sample hypothesis tests
        3. Unit root tests
      5. Summary
    10. 3. Linear Models
      1. An overview of statistical modeling
        1. Model formulas
        2. Explanatory variables interactions
        3. Error terms
        4. The intercept as parameter 1
          1. Updating a model
      2. Linear regression
        1. Plotting a slope
      3. Analysis of variance
      4. Generalized linear models
      5. Generalized additive models
      6. Linear discriminant analysis
      7. Principal component analysis
      8. Clustering
      9. Summary
    11. 4. Nonlinear Methods
      1. Nonparametric and parametric models
      2. The adsorption and body measures datasets
      3. Theory-driven nonlinear regression
      4. Visually exploring nonlinear relationships
      5. Extending the linear framework
        1. Polynomial regression
        2. Performing a polynomial regression in R
        3. Spline regression
      6. Nonparametric nonlinear methods
        1. Kernel regression
        2. Kernel weighted local polynomial fitting
          1. Optimal bandwidth selection
          2. A practical scientific application of kernel regression
        3. Locally weighted polynomial regression and the loess function
      7. Nonparametric methods with the np package
        1. Nonlinear quantile regression
      8. Summary
    12. 5. Linear Algebra
      1. Matrices and linear algebra
        1. Matrices in R
        2. Vectors in R
        3. Matrix notation
      2. The physical functioning dataset
      3. Basic matrix operations
        1. Element-wise matrix operations
          1. Matrix subtraction
          2. Matrix addition
          3. Matrix sweep
        2. Basic matrixwise operations
          1. Transposition
          2. Matrix multiplication
            1. Multiplying square matrices for social networks
            2. Outer products
            3. Using sparse matrices in matrix multiplication
          3. Matrix inversion
            1. Solving systems of linear equations
          4. Determinants
      4. Triangular matrices
      5. Matrix decomposition
        1. QR decomposition
        2. Eigenvalue decomposition
        3. Lower upper decomposition
        4. Cholesky decomposition
        5. Singular value decomposition
      6. Applications
        1. Rasch analysis using linear algebra and a paired comparisons matrix
        2. Calculating Cronbach's alpha
        3. Image compression using discrete cosine transform
          1. Importing an image into R
          2. The compression technique
            1. Creating the transformation and quantization matrices
            2. Putting the matrices together for image compression
          3. DCT in R
      7. Summary
    13. 6. Principal Component Analysis and the Common Factor Model
      1. A primer on correlation and covariance structures
      2. Datasets used in this chapter
      3. Principal component analysis and total variance
        1. Understanding the basics of PCA
          1. How does PCA relate to SVD?
        2. Scaled versus unscaled PCA
        3. PCA for dimension reduction
          1. PCA to summarize wine properties
        4. Choosing the number of principal components to retain
      4. Formative constructs using PCA
      5. Exploratory factor analysis and reflective constructs
        1. Familiarizing yourself with the basic terms
        2. Matrices of interest
          1. Expressing factor analysis in a matrix model
        3. Basic EFA and concepts of covariance algebra
          1. Concepts of EFA estimation
            1. The centroid method
            2. Multiple factors
          2. Direct factor extraction by principal axis factoring
          3. Performing principal axis factoring in R
          4. Other factor extraction methods
          5. Factor rotation
            1. Orthogonal factor rotation methods
            2. Quartimax rotation
            3. Varimax rotation
            4. Oblique rotations
            5. Oblimin rotation
            6. Promax rotation
            7. Factor rotation in R
        4. Advanced EFA with the psych package
      6. Summary
    14. 7. Structural Equation Modeling and Confirmatory Factor Analysis
      1. Datasets
        1. Political democracy
        2. Physical functioning dataset
        3. Holzinger-Swineford 1939 dataset
      2. The basic ideas of SEM
        1. Components of an SEM model
        2. Path diagram
      3. Matrix representation of SEM
        1. The reticular action model (RAM)
          1. An example of SEM specification
        2. An example in R
      4. SEM model fitting and estimation methods
        1. Assessing SEM model fit
        2. Using OpenMx and matrix specification of an SEM
        3. Summarizing the OpenMx approach
        4. Explaining an entire example
        5. Specifying the model matrices
          1. Fitting the model
        6. Fitting SEM models using lavaan
        7. The lavaan syntax
      5. Comparing OpenMx to lavaan
        1. Explaining an example in lavaan
        2. Explaining an example in OpenMx
      6. Summary
    15. 8. Simulations
      1. Basic sample simulations in R
      2. Pseudorandom numbers
        1. The runif() function
        2. Bernoulli random variables
        3. Binomial random variables
        4. Poisson random variables
        5. Exponential random variables
      3. Monte Carlo simulations
        1. Central limit theorem
        2. Using the mc2d package
          1. One-dimensional Monte Carlo simulation
          2. Two-dimensional Monte Carlo simulation
          3. Additional mc2d functions
            1. The mcprobtree() function
            2. The cornode() function
            3. The mcmodel() function
            4. The evalmcmod() function
            5. Data visualization
          4. Multivariate nodes
      4. Monte Carlo integration
        1. Multiple integration
        2. Other density functions
      5. Rejection sampling
      6. Importance sampling
      7. Simulating physical systems
      8. Summary
    16. 9. Optimization
      1. One-dimensional optimization
        1. The golden section search method
        2. The optimize() function
        3. The Newton-Raphson method
        4. The Nelder-Mead simplex method
        5. More optim() features
      2. Linear programming
        1. Integer-restricted optimization
        2. Unrestricted variables
      3. Quadratic programming
      4. General non-linear optimization
      5. Other optimization packages
      6. Summary
    17. 10. Advanced Data Management
      1. Cleaning datasets in R
      2. String processing and pattern matching
        1. Regular expressions
      3. Floating point operations and numerical data types
      4. Memory management in R
        1. Basic R memory commands
        2. Handling R objects in memory
      5. Missing data
        1. Computational aspects of missing data in R
        2. Statistical considerations of missing data
        3. Deletion methods
          1. Listwise deletion or complete case analysis
          2. Pairwise deletion
        4. Visualizing missing data
        5. An overview of multiple imputation
          1. Imputation basic principles
          2. Approaches to imputation
      6. The Amelia package
        1. Getting estimates from multiply imputed datasets
          1. Extracting the mean
          2. Extracting the standard error of the mean
      7. The mice package
        1. Imputation functions in mice
      8. Summary
    18. Index