Subscripts and Indices
The key thing about working effectively with dataframes is to become completely at ease with using subscripts (or indices, as some people call them). In R, subscripts appear in square brackets []. A dataframe is a two-dimensional object, comprising rows and columns. The rows are referred to by the first (left-hand) subscript, the columns by the second (right-hand) subscript. Thus
worms[3,5]
[1] 4.3
is the value of Soil.pH (the variable in column 5) in row 3. To extract a range of values (say the 14th to 19th rows) from worm density (the variable in the seventh column) we use the colon operator : to generate a series of subscripts (14, 15, 16, 17, 18 and 19):
worms[14:19,7]
[1] 0 6 8 4 5 1
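To experiment with these two forms without the full worms dataframe, here is a minimal sketch. The dataframe toy is a hypothetical stand-in, not part of the original data set: it holds only the first five Area and Slope values printed in this section, so its column numbers differ from those of worms.

```r
# Hypothetical stand-in for worms: two columns, five rows
toy <- data.frame(Area  = c(3.6, 5.1, 2.8, 2.4, 3.8),
                  Slope = c(11, 2, 3, 5, 0))

# The colon operator generates a run of consecutive integers
rows <- 2:4          # 2 3 4

# Row 3, column 1: a single value
cell <- toy[3, 1]    # 2.8

# Rows 2 to 4 of column 2: a vector of three values
run <- toy[rows, 2]  # 2 3 5

print(cell)
print(run)
```

The row subscript always comes first, so toy[3, 1] and toy[1, 3] are not interchangeable; the second asks for a third column that toy does not have.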
To extract a group of rows and a group of columns, you need to generate a series of subscripts for both the rows and the columns. Suppose we want Area and Slope (columns 2 and 3) from rows 1 to 5:
worms[1:5,2:3]
  Area Slope
1  3.6    11
2  5.1     2
3  2.8     3
4  2.4     5
5  3.8     0
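The same kind of block selection can be tried on a small example. This is only a sketch: toy is a hypothetical dataframe, the Field labels F1 to F5 are invented for illustration, and only the Area and Slope values come from the output above.

```r
# Hypothetical three-column dataframe; Field labels are invented
toy <- data.frame(Field = c("F1", "F2", "F3", "F4", "F5"),
                  Area  = c(3.6, 5.1, 2.8, 2.4, 3.8),
                  Slope = c(11, 2, 3, 5, 0))

# Rows 1 to 3, columns 2 and 3: the result is a smaller dataframe
block <- toy[1:3, 2:3]

print(block)
print(dim(block))    # 3 rows, 2 columns
print(names(block))  # "Area" "Slope"
```

Because both subscripts select more than one entry, the result keeps its row numbers and column names, just as in the worms output.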
This next point is very important, and is hard to grasp without practice. To select all the entries in a row the syntax is ‘number comma blank’. Similarly, to select all the entries in a column the syntax is ‘blank comma number’. Thus, to select all the columns in row 3 we enter
worms[3,]
     Field.Name Area Slope Vegetation Soil.pH  Damp Worm.density
3 Nursery.Field  2.8     3  Grassland     4.3 FALSE            2
whereas to select all of the rows in column number 3 we enter
worms[,3]
[1] 11 2 3 5 0 2 3 0 0 4 10 1 2 6 0 0 8 2 1 1 0 ...
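The difference between ‘number comma blank’ and ‘blank comma number’ shows up in the kind of object that comes back, which is why the row above prints with column names while the column prints as a plain run of numbers. A minimal sketch, again using a hypothetical dataframe toy built from the Area and Slope values shown earlier:

```r
# Hypothetical stand-in for worms: two columns, five rows
toy <- data.frame(Area  = c(3.6, 5.1, 2.8, 2.4, 3.8),
                  Slope = c(11, 2, 3, 5, 0))

# Number comma blank: every column of row 3, returned as a one-row dataframe
row3 <- toy[3, ]

# Blank comma number: every row of column 2, returned as a plain vector
col2 <- toy[, 2]

print(row3)
print(col2)
print(is.data.frame(row3))  # TRUE
print(is.data.frame(col2))  # FALSE
```

A row has to stay a dataframe because its entries can be of different types, whereas a single column holds values of one type and so can be handed back as a vector.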