Looking for runs of numbers within vectors
The function for finding runs is called rle, which stands for ‘run length encoding’, and it is most easily understood with an example. Here is a vector of 150 random numbers from a Poisson distribution with mean 0.7:
(poisson<-rpois(150,0.7))
[1] 1 1 0 0 2 1 0 1 0 1 0 0 0 0 2 1 0 0 3 1
0 0 1 0 2 0 1 1 0 0 0 1 0 0 0 2 1
[38] 0 0 0 1 0 0 0 2 0 0 0 1 1 0 2 1 0 0 0 2
0 0 2 3 2 1 0 2 0 0 0 0 0 1 1 0 0
[75] 0 0 0 1 1 1 0 0 1 0 1 2 2 0 0 2 0 0 0 0
[112] 0 0 2 0 0 1 0 1 0 4 0 0 1 0 2 1 0 1 1 0
0 1 3 3 0 0 1 1 0 1 0 0 0 0 0 1 0
[149] 2 0
We can do our own run length encoding on the vector by eye: there is a run of two 1s, then a run of two 0s, then a single 2, then a single 1, then a single 0, and so on. So the run lengths are 2, 2, 1, 1, 1, 1, . . . . The values associated with these runs were 1, 0, 2, 1, 0, 1, . . . . Here is the output from rle:
rle(poisson)
Run Length Encoding
  lengths: int [1:93] 2 2 1 1 1 1 1 1 4 1 ...
  values : num [1:93] 1 0 2 1 0 1 0 1 0 2 ...
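Because the random vector will differ every time it is generated, it can help to check the idea on a fixed input. The sketch below types in the first 15 values printed above (the name x is ours, chosen for illustration) and confirms the run lengths and values we worked out by eye. It also uses inverse.rle, the base R companion to rle, to show that the encoding loses no information.

```r
# First 15 values of the vector printed above, typed in by hand
x <- c(1, 1, 0, 0, 2, 1, 0, 1, 0, 1, 0, 0, 0, 0, 2)

r <- rle(x)

r$lengths   # 2 2 1 1 1 1 1 1 4 1
r$values    # 1 0 2 1 0 1 0 1 0 2

# inverse.rle() expands the encoding back into the original vector
identical(inverse.rle(r), x)   # TRUE
```

The two components can be extracted by name (r$lengths, r$values) as well as by position ([[1]], [[2]]); the names are often easier to read.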
The object produced by rle is a list of two vectors: the lengths and the values. To find the longest run, and the value associated with that longest run, we index the list like this:
max(rle(poisson)[[1]])
[1] 7
So the longest run in this vector of numbers was 7. But 7 of what? We use which to find the location of the 7 in lengths, then apply this index to values to find the answer:
which(rle(poisson)[[1]]==7)
[1] 55
rle(poisson)[[2]][55]
[1] 0
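The same steps can be written without hard-coding the 7 or the 55, which matters because a fresh random vector will generally have a different longest run. The following is a minimal sketch rather than the book's own code: the seed and the names counts, runs, longest, where and starts are our assumptions. It stores the result of rle once, and also recovers where the longest run begins in the original vector.

```r
set.seed(1)                   # arbitrary seed, only so the sketch is repeatable
counts <- rpois(150, 0.7)     # a fresh vector; it will not match the one above

runs <- rle(counts)           # encode once and reuse the result

longest <- max(runs$lengths)              # length of the longest run
where <- which(runs$lengths == longest)   # index (or indices, if tied) in lengths
runs$values[where]                        # the value(s) making up the longest run

# Starting position of every run in the original vector:
# the first run starts at 1, each later run starts after the previous ones end
starts <- cumsum(c(1, head(runs$lengths, -1)))
starts[where]                             # where the longest run(s) begin
```

Note that which can return several indices when two or more runs share the maximum length, so runs$values[where] may contain more than one value.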
So, not surprisingly given that the mean was just 0.7, ...