Transformations
Sometimes variables in your source data aren’t quite right: a column may have the wrong type, or you may need a value derived from other columns. This section explains how to change a variable in a data frame.
Reassigning Variables
One of the most convenient ways to redefine a variable in a data frame is to use the assignment operator. For example, suppose that you wanted to change the type of a variable in the dow30 data frame that we created above. When read.csv imported this data, it interpreted the “Date” field as a character string and converted it to a factor:
> class(dow30$Date)
[1] "factor"
Factors are fine for some things, but we could better represent the date field as a Date object. (That would create a proper ordering on dates and allow us to extract information from them.) Luckily, Yahoo! Finance prints dates in the default date format for R, so we can just transform these values into Date objects using as.Date (see the help file for as.Date for more information). So, let’s change this variable within the data frame to use Date objects:
> dow30$Date <- as.Date(dow30$Date)
> class(dow30$Date)
[1] "Date"
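To see what the conversion buys you, here is a minimal sketch on a small made-up data frame. The name prices, the symbol "AAA", and the three dates are hypothetical stand-ins, not the real dow30 data. The Date column is built as a factor on purpose, to mimic what read.csv produced here; in R 4.0.0 and later, read.csv leaves strings as character vectors by default, and as.Date handles those the same way.

```r
# Hypothetical stand-in for a few rows of price data (not the real dow30 values).
# Date is created as a factor to mimic the read.csv behavior described above.
prices <- data.frame(
  symbol = c("AAA", "AAA", "AAA"),
  Date   = factor(c("2009-09-21", "2009-09-18", "2009-09-17")),
  stringsAsFactors = FALSE
)

# as.Date accepts factors as well as character vectors, as long as the
# values use the default "YYYY-MM-DD" format.
prices$Date <- as.Date(prices$Date)

# With Date objects, sorting is chronological and date arithmetic works.
prices <- prices[order(prices$Date), ]

# Number of days between the earliest and latest row.
span_days <- as.numeric(diff(range(prices$Date)))

# Extract part of each date, here the year and month.
year_month <- format(prices$Date, "%Y-%m")
```

If the dates were in some other layout, as.Date would need a format argument describing it; see the help file for as.Date.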
It’s also possible to make other changes to data frames. For example, suppose that we wanted to define a new midpoint variable that is the mean of the high and low prices. We can add this variable with the same notation:
> dow30$mid <- (dow30$High + dow30$Low) / 2
> names(dow30)
[1] "symbol"    "Date"      "Open"      "High"      "Low"
[6] "Close"     "Volume"    "Adj.Close" "mid"
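The same pattern can be tried on a tiny example where the arithmetic is easy to check by hand. This is only a sketch: the data frame quotes and its values are invented for illustration, and the rowMeans cross-check is an alternative way to compute the same number, not something the text above relies on.

```r
# Hypothetical high/low prices (invented values, not taken from dow30).
quotes <- data.frame(
  High = c(10.5, 11.0, 9.75),
  Low  = c(9.5, 10.0, 9.25)
)

# Assigning to a name that does not exist yet appends a new column.
# The arithmetic is vectorized, so each row gets its own midpoint.
quotes$mid <- (quotes$High + quotes$Low) / 2

# Cross-check: the midpoint is the row-wise mean of the two columns.
mid_check <- unname(rowMeans(quotes[, c("High", "Low")]))
```

Assigning to an existing name (as with Date earlier) overwrites the column in place, while assigning to a new name adds the column at the end of the data frame.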
The Transform Function
A convenient function for changing variables ...