Practical Data Science Cookbook - Second Edition by Abhijit Dasgupta, Benjamin Bengfort, Sean Patrick Murphy, Tony Ojeda, Prabhanjan Tattar

How to do it...

We start by building an initial understanding of the iris dataset with a few simple functions. The dataset has five variables: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, and Species. Species takes three values (setosa, versicolor, and virginica), and the task is to identify the species from the length and width of the sepals and petals:

  1. Load the iris object from the datasets package and get initial insight using the str, summary, and pairs functions:
     data(iris)
     str(iris)
     ## 'data.frame': 150 obs. of 5 variables:
     ##  $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
     ##  $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
     ##  $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
     ##  $ Petal.Width : num 0.2 0.2 ...
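
The step above names summary and pairs alongside str. The following is a minimal sketch of how those two calls might look on the same data, not necessarily the book's exact code. The species count via table and the coloring of points by species are illustrative assumptions added here.

```r
# iris ships with base R's datasets package, so nothing needs installing
data(iris)

# Per-column summary: quartiles and mean for the four numeric
# measurements, and per-level counts for the Species factor
summary(iris)

# The three species labels and the number of rows for each
species_counts <- table(iris$Species)
print(species_counts)

# Scatterplot matrix of the four measurements.
# Coloring by species is an assumption made for illustration:
# as.integer() maps the factor levels to color codes 1, 2, 3.
pairs(iris[, 1:4], col = as.integer(iris$Species))
```

The scatterplot matrix is useful here because it shows, pair by pair, which measurements separate the species most clearly.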
