Piecewise Regression
This kind of regression fits different functions over different ranges of the explanatory variable. For example, it might fit different linear regressions to the left- and right-hand halves of a scatterplot. Two important questions arise in piecewise regression:
- how many segments to divide the line into;
- where to position the break points on the x axis.
Suppose we want to do the simplest piecewise regression, using just two linear segments. Where do we break up the x values? A simple, pragmatic view is to divide the x values at the point where the piecewise regression best fits the response variable. Let's take an example using a linear model where the response is the log of a count (the number of species recorded) and the explanatory variable is the log of the size of the area searched for the species:
data<-read.table("c:\\temp\\sasilwood.txt",header=T)
attach(data)
names(data)
[1] "Species" "Area"
A quick scatterplot suggests that the relationship between log(Species) and log(Area) is not linear:
plot(log(Species)~log(Area),pch=16)
The slope appears to be shallower at small scales than at large. The overall regression highlights this at the model-checking stage:
model1<-lm(log(Species)~log(Area))
plot(log(Area),resid(model1))
The residuals are very strongly U-shaped, whereas a plot of well-behaved residuals should show no pattern at all (it should look like the sky at night).
If we are to use piecewise regression, then we need to work out how many straight-line segments to use and where to put the breaks. Visual ...
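The pragmatic rule described earlier, breaking the x values at the point where the two-segment regression fits best, can be sketched as a search over candidate break points. The sketch below is illustrative only and rests on several assumptions: it uses simulated data rather than the species-area data, it takes "best fit" to mean the smallest residual sum of squares, and it gives each segment its own intercept and slope, so the fitted lines need not join at the break. The names true_break, rss_for_break, candidates, best_break and model2 are hypothetical.

```r
# Simulated data standing in for log(Area) and log(Species); not the book's data.
set.seed(1)
x <- seq(0, 10, length.out = 100)
true_break <- 6   # hypothetical break point used only to generate the data
y <- 1 + 0.2 * x + 0.8 * pmax(x - true_break, 0) + rnorm(100, sd = 0.1)

# Residual sum of squares of a two-segment fit broken at b.
# The interaction right * x gives each side its own intercept and slope.
rss_for_break <- function(b, x, y) {
  right <- x >= b
  fit <- lm(y ~ right * x)
  sum(resid(fit)^2)
}

# Candidate breaks: interior x values, leaving at least three points per segment.
xs <- sort(unique(x))
candidates <- xs[4:(length(xs) - 3)]

# Evaluate every candidate and keep the one with the smallest RSS.
rss <- vapply(candidates, function(b) rss_for_break(b, x, y), numeric(1))
best_break <- candidates[which.min(rss)]

# Refit the two-segment model at the chosen break.
right <- x >= best_break
model2 <- lm(y ~ right * x)
coef(model2)
```

Plotting rss against candidates is a useful check: a sharp minimum suggests a well-defined break, while a flat trough means the data do not pin the break point down precisely.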