Transformations
Sometimes the variables in your source data aren’t quite right: a column may have the wrong type, or you may need a value derived from other columns. This section explains how to change a variable in a data frame.
Reassigning Variables
One of the most convenient ways to redefine a variable in a data frame is to use the assignment operator. For example, suppose that you wanted to change the type of a variable in the dow30 data frame that we created above. When read.csv imported this data, it interpreted the “Date” field as a character string and converted it to a factor:
> class(dow30$Date)
[1] "factor"
Factors are fine for some things, but we could better represent the date field as a Date object. (That would create a proper ordering on dates and allow us to extract information from them.) Luckily, Yahoo! Finance prints dates in the default date format for R, so we can just transform these values into Date objects using as.Date (see the help file for as.Date for more information). So let’s change this variable within the data frame to use Date objects:
> dow30$Date <- as.Date(dow30$Date)
> class(dow30$Date)
[1] "Date"
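The same conversion can be tried without the dow30 data. The sketch below is only an illustration: the `prices` data frame and its values are made up, and the explicit `format` argument is an optional addition that spells out the layout as.Date expects by default. Note that in R 4.0.0 and later, read.csv leaves strings as character vectors unless `stringsAsFactors = TRUE` is set, so the column may arrive as character rather than factor; the reassignment works the same way in either case.

```r
# Hypothetical stand-in for dow30; the values are invented for illustration.
prices <- data.frame(
  Date  = factor(c("2009-09-21", "2009-09-18", "2009-09-17")),
  Close = c(46.95, 47.20, 46.71)
)

class(prices$Date)  # a factor before the conversion

# Convert the factor labels to text first, then parse them as dates.
# "%Y-%m-%d" is the ISO layout that as.Date() tries by default.
prices$Date <- as.Date(as.character(prices$Date), format = "%Y-%m-%d")

class(prices$Date)  # now a Date

# Date objects sort chronologically and expose their components.
sorted <- prices[order(prices$Date), ]
years  <- format(prices$Date, "%Y")
```

Once the column holds Date objects, `order` sorts rows chronologically and `format` can pull out pieces such as the year.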
It’s also possible to make other changes to data frames. For example, suppose that we wanted to define a new midpoint variable that is the mean of the high and low price. We can add this variable with the same notation:
> dow30$mid <- (dow30$High + dow30$Low) / 2
> names(dow30)
[1] "symbol"    "Date"      "Open"      "High"      "Low"
[6] "Close"     "Volume"    "Adj.Close" "mid"
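As a small self-contained sketch of the same idea, the example below adds a derived column to a toy data frame. The `quotes` data frame and its numbers are hypothetical; only the `$<-` assignment pattern comes from the text above.

```r
# Hypothetical two-row data frame; the values are invented for illustration.
quotes <- data.frame(
  symbol = c("MMM", "MMM"),
  High   = c(75.0, 74.0),
  Low    = c(73.0, 73.0)
)

# Assigning to a name that does not exist yet appends a new column.
# The arithmetic is vectorized, so each row gets its own midpoint.
quotes$mid <- (quotes$High + quotes$Low) / 2

names(quotes)
quotes$mid
```

Because the new column is computed from whole vectors, no explicit loop over rows is needed.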
The Transform Function
A convenient function for changing variables ...