R in a Nutshell

by Joseph Adler

January 2010

Beginner

634 pages

19h 50m

English

O'Reilly Media, Inc.

Content preview from R in a Nutshell

Discrete Data

There is a different set of tests for assessing the statistical significance of discrete random variables (such as counts or proportions), and so R provides a different set of functions for performing those tests.

Proportion Tests

If you have a data set with several different groups of observations and are measuring the probability of success in each group (or the fraction of observations with some other characteristic), you can use the function prop.test to test whether the difference between groups is statistically significant. Specifically, prop.test tests the null hypothesis that the proportions (probabilities of success) in several groups are the same, or that they equal certain given values:

prop.test(x, n, p = NULL,
          alternative = c("two.sided", "less", "greater"),
          conf.level = 0.95, correct = TRUE)
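
To make the two forms of the null hypothesis concrete, here is a minimal sketch of both calls. The two-group counts are invented for illustration and do not come from the field goal data; the one-sample call reuses the 787 good and 163 no-good attempts quoted in the text, and the target value of 0.8 is an arbitrary assumption.

```r
# Two-sample form: H0 is that both groups share one success probability.
# These counts are hypothetical, chosen only to illustrate the call.
successes <- c(45, 56)   # successes in group A and group B
trials    <- c(60, 80)   # attempts in group A and group B

two.sample <- prop.test(successes, trials)
two.sample$estimate      # sample proportion in each group
two.sample$p.value       # p-value for H0: equal proportions
two.sample$conf.int      # 95% interval for the difference in proportions

# One-sample form: H0 is that the success probability equals p.
# 787 good and 163 no-good attempts are the totals quoted in the text;
# p = 0.8 is an arbitrary value picked for this sketch.
one.sample <- prop.test(787, 787 + 163, p = 0.8)
one.sample$estimate      # observed proportion of good attempts
one.sample$p.value
```

With correct = TRUE (the default), prop.test applies Yates' continuity correction where possible, so the p-value can differ slightly from an uncorrected chi-squared test on the same counts.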

As an example, let’s revisit the field goal data. Above, we considered the question “is there a difference in the length of attempts indoors and outdoors?” Now, we’ll ask the question “is the probability of success the same indoors as it is outdoors?”

First, let’s create a new data set containing only good and bad field goals. (We’ll eliminate blocked and aborted attempts; there were only 8 aborted attempts and 24 blocked attempts in 2005, but 787 good attempts and 163 bad (no good) attempts.)

> field.goals.goodbad <- field.goals[field.goals$play.type=="FG good" |
                                     field.goals$play.type=="FG no", ]
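
The same kind of filter can be tried on a small stand-in data frame. This is only a sketch: the data frame toy, its yards column, and the labels "FG blocked" and "FG aborted" are assumptions made for illustration; only "FG good" and "FG no" appear in the expression above.

```r
# Toy stand-in for field.goals; every value here is invented.
toy <- data.frame(
  play.type = factor(c("FG good", "FG no", "FG blocked",
                       "FG good", "FG aborted")),
  yards     = c(32, 48, 41, 27, 39)
)

# Equivalent to chaining == comparisons with |, written with %in%.
keep <- toy$play.type %in% c("FG good", "FG no")
toy.goodbad <- toy[keep, ]

# Subsetting a factor keeps its unused levels; droplevels() removes
# them so that later tables do not contain empty categories.
toy.goodbad <- droplevels(toy.goodbad)
levels(toy.goodbad$play.type)
```

Dropping unused levels matters if play.type is stored as a factor, because table() would otherwise report zero counts for the excluded play types.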

Now, let’s create a table of successes and failures by stadium type:

> field.goals.table ...
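
The expression above is truncated in this preview, so the following is a separate sketch on invented data rather than the author's code. It shows one way a table of successes and failures by group can be built with table() and passed to prop.test. The column name stadium.type, the group labels "In" and "Out", and all counts are assumptions.

```r
# Invented data: 50 indoor attempts (40 good) and 100 outdoor (70 good).
toy <- data.frame(
  stadium.type = rep(c("In", "Out"), times = c(50, 100)),
  play.type    = c(rep(c("FG good", "FG no"), times = c(40, 10)),
                   rep(c("FG good", "FG no"), times = c(70, 30)))
)

# Rows are groups; columns are ordered alphabetically, so "FG good"
# (successes) comes first and "FG no" (failures) second.
toy.table <- table(toy$stadium.type, toy$play.type)
toy.table

# prop.test accepts a two-column table of successes and failures.
toy.test <- prop.test(toy.table)
toy.test$estimate   # success proportion per stadium type
toy.test$p.value    # p-value for H0: equal success probabilities
```

The column order matters: prop.test treats the first column as successes, so a table whose columns are ordered differently would need to be rearranged before the call.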

ISBN: 9781449377502