Regression Models with Spatially Correlated Errors: Generalized Least Squares
In Chapter 19 we looked at the use of linear mixed-effects models for dealing with random effects and temporal pseudoreplication. Here we illustrate the use of generalized least squares (GLS) for regression modelling where we would expect neighbouring values of the response variable to be correlated. The great advantage of the gls function is that the errors are allowed to be correlated and/or to have unequal variances. The gls function is part of the nlme package:
library(nlme)
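Before turning to the example, a minimal sketch of how these two options are passed to gls may help. It uses simulated data, not the wheat trial described below, and the names sim, east, north, group, m0, m1 and m2 are all hypothetical. The correlation argument takes a correlation-structure object (here corExp, an exponential decline in correlation with distance) and the weights argument takes a variance-function object (here varIdent, a separate residual variance for each level of a factor).

```r
library(nlme)

# Simulated data (an assumption for illustration, not the book's data set)
set.seed(42)
n <- 80
sim <- data.frame(east  = runif(n, 0, 10),
                  north = runif(n, 0, 10),
                  group = factor(rep(c("a", "b", "c", "d"), length.out = n)))

# Build errors whose correlation decays exponentially with distance
d <- as.matrix(dist(sim[, c("east", "north")]))
Sigma <- exp(-d / 2)
sim$y <- 10 + as.numeric(sim$group) + drop(t(chol(Sigma)) %*% rnorm(n))

# Independent errors with constant variance
m0 <- gls(y ~ group, data = sim)

# Errors correlated as a function of distance between locations
m1 <- gls(y ~ group, data = sim,
          correlation = corExp(form = ~ east + north))

# Independent errors, but a different variance for each group
m2 <- gls(y ~ group, data = sim,
          weights = varIdent(form = ~ 1 | group))

# The fixed effects are identical, so the REML fits can be compared
AIC(m0, m1, m2)
```

The three models share the same fixed effects and differ only in what they assume about the errors, which is the comparison that gls is designed to make.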
The following example is a geographic-scale trial to compare the yields of 56 different varieties of wheat. What makes the analysis more challenging is that the farms carrying out the trial were spread out over a wide range of latitudes and longitudes.
spatialdata<-read.table("c:\\temp\\spatialdata.txt",header=T)
attach(spatialdata)
names(spatialdata)
[1] "Block"     "variety"   "yield"     "latitude"  "longitude"
We begin with graphical data inspection to see the effect of location on yield:
par(mfrow=c(1,2))
plot(latitude,yield)
plot(longitude,yield)
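The same relationships can also be checked numerically by summarizing the response within bands of a location variable. The sketch below is one way to do this; band_summary is a hypothetical helper, not part of nlme or of the original analysis, and the default of four equal-width bands is an arbitrary choice.

```r
# Hypothetical helper: mean and variance of y within equal-width bands of x
band_summary <- function(x, y, breaks = 4) {
  band <- cut(x, breaks = breaks)
  data.frame(band     = levels(band),
             n        = as.vector(table(band)),
             mean     = as.vector(tapply(y, band, mean)),
             variance = as.vector(tapply(y, band, var)))
}

# Toy illustration with made-up numbers
toy <- band_summary(x = 1:8, y = c(1, 2, 3, 4, 10, 20, 30, 40), breaks = 2)
toy
```

With the data attached as above, the call would be band_summary(latitude, yield) or band_summary(longitude, yield). Means and variances that both change from band to band would point to the same pattern as the scatterplots.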
There are clearly big effects of latitude and longitude on both the mean yield and the variance in yield. The latitude effect looks like a threshold effect, with little impact for latitudes less than 30. The longitude effect looks more continuous but there is a hint of non-linearity (perhaps even a hump). The varieties differ substantially in their mean yields:
par(mfrow=c(1,1))
barplot(sort(tapply(yield,variety,mean)))
...