Loading data

In Rattle, you have to explicitly declare the role of each variable. A variable can have one of five roles:

  • Input: The prediction process will use input variables to predict the value of the target variable.
  • Target: The target variable is the output of our model, that is, the value the model predicts.
  • Risk: The risk variable measures the magnitude of the risk associated with the target variable.
  • Ident or Identifier: An identifier is a variable that identifies a unique occurrence of an object. In our preceding example, the variable Person is an identifier that identifies a unique person.
  • Ignore: A variable marked Ignore will be ignored by the model. We'll come back to this role later, because some variables can create noise and decrease the performance of your predictive model.
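
To make the five roles concrete, here is a minimal sketch in plain Python. It is an analogy only and does not use Rattle or its API: roles are modeled as a simple mapping from variable name to role, and only the input and target variables are passed on for modeling. Apart from Person, which comes from the earlier example, every column name and value below is hypothetical.

```python
# Illustration only: this is NOT Rattle's API, just a plain-Python analogy.
# All column names except "Person" are hypothetical.
roles = {
    "Person": "ident",       # identifies a unique person, not used for prediction
    "Age": "input",          # used to predict the target
    "Income": "input",       # used to predict the target
    "Comment": "ignore",     # excluded from the model
    "LoanAmount": "risk",    # magnitude of the risk tied to the target
    "Default": "target",     # the value the model should predict
}

# Made-up sample rows, one dictionary per observation.
rows = [
    {"Person": "P1", "Age": 34, "Income": 42000, "Comment": "n/a",
     "LoanAmount": 5000, "Default": 0},
    {"Person": "P2", "Age": 51, "Income": 38000, "Comment": "call back",
     "LoanAmount": 12000, "Default": 1},
]


def columns_with_role(roles, role):
    """Return the variable names assigned to the given role, in declared order."""
    return [name for name, assigned in roles.items() if assigned == role]


def split_for_modeling(rows, roles):
    """Keep only input columns as features and the single target column as labels."""
    inputs = columns_with_role(roles, "input")
    targets = columns_with_role(roles, "target")
    if len(targets) != 1:
        raise ValueError("expected exactly one target variable")
    target = targets[0]
    features = [[row[name] for name in inputs] for row in rows]
    labels = [row[target] for row in rows]
    return inputs, features, labels


inputs, features, labels = split_for_modeling(rows, roles)
print(inputs)    # names of the input variables
print(features)  # one list of input values per row
print(labels)    # target value per row
```

In this sketch, the identifier, risk, and ignored columns never reach the feature matrix, which mirrors the idea that only input variables are used to predict the target.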

Rattle can load data from many data sources. ...
