Predicting chemical biodegradation

In this section, we use R's e1071 package to try out the models we've discussed on a real-world data set. Our first example is the QSAR biodegradation data set, which can be found at https://archive.ics.uci.edu/ml/datasets/QSAR+biodegradation#. It contains 41 numerical variables that describe the molecular composition and properties of 1055 chemicals. The modeling task is to predict whether a particular chemical is biodegradable based on these properties. Example properties include the percentages of carbon, nitrogen, and oxygen atoms, as well as the number of heavy atoms in the molecule. These features are highly specialized and sufficiently numerous, so a full listing ...
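
As a rough illustration of the workflow described above, the following sketch loads the data and fits a first model. It rests on several assumptions that are not stated in the text: that the data has been downloaded to a local file called biodeg.csv, that the file is semicolon-separated with no header row and the class label in the last column, and that the model of interest is a support vector machine, which e1071 provides through svm(). The column names, the 80/20 split, and the cost value are purely illustrative.

```r
# Sketch: load the QSAR biodegradation data and fit a first SVM with e1071.
# Assumptions (not from the text): the data sits in a local file named
# "biodeg.csv", is semicolon-separated with no header row, and stores the
# class label in the last column.

load_biodeg <- function(path) {
  biodeg <- read.table(path, sep = ";", header = FALSE)
  # Hypothetical column names: V1, V2, ... for the features, "class" for the label.
  n_features <- ncol(biodeg) - 1
  names(biodeg) <- c(paste0("V", seq_len(n_features)), "class")
  biodeg$class <- factor(biodeg$class)
  biodeg
}

biodeg_path <- "biodeg.csv"  # assumed local file name
if (file.exists(biodeg_path) && requireNamespace("e1071", quietly = TRUE)) {
  biodeg <- load_biodeg(biodeg_path)
  print(dim(biodeg))  # rows are chemicals; columns are features plus the label

  # Hold out 20% of the rows for testing (an illustrative split).
  set.seed(1)
  train_idx <- sample(nrow(biodeg), size = floor(0.8 * nrow(biodeg)))
  train <- biodeg[train_idx, ]
  test <- biodeg[-train_idx, ]

  # Linear-kernel SVM; cost = 1 is an arbitrary starting value.
  model <- e1071::svm(class ~ ., data = train, kernel = "linear", cost = 1)
  predictions <- predict(model, test)
  print(mean(predictions == test$class))  # test-set accuracy
}
```

The guard around the modeling step means the script does nothing if the file or the package is missing, so it can be sourced safely before the data has been downloaded.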
