Kernel density estimation

To explain kernel density estimation (KDE), let us generate some one-dimensional data and build a histogram from it. A histogram gives a quick, coarse picture of the probability distribution underlying the data.

The following R session generates the data and draws the histogram; the second hist() call, with plot = FALSE, returns the histogram's components instead of drawing it:

> data <- rnorm(1000, mean=25, sd=5)
> data.1 <- rnorm(1000, mean=10, sd=2)
> data <- c(data, data.1)
> hist(data)
> hist(data, plot = FALSE)
$breaks
 [1]  0  5 10 15 20 25 30 35 40 45

$counts
[1]   8 489 531 130 361 324 134  22   1

$density
[1] 0.0008 0.0489 0.0531 0.0130 0.0361 0.0324 0.0134 0.0022 0.0001

$mids
[1]  2.5  7.5 12.5 17.5 22.5 27.5 32.5 37.5 42.5

$xname
[1] "data"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"

This code creates two artificial datasets, each of 1,000 draws from a normal distribution, and combines them into a single vector of 2,000 values. ...
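As a minimal sketch of where this is heading, the same mixture can be smoothed with base R's density() function and drawn over the histogram. The fixed seed, the plot title, and the use of density()'s default settings (Gaussian kernel, rule-of-thumb bandwidth) are assumptions made for illustration; they are not taken from the original session.

```r
# Assumption: a fixed seed so the sketch is reproducible (the original sets none).
set.seed(42)

# Same two-component mixture as in the session above.
data <- c(rnorm(1000, mean = 25, sd = 5),
          rnorm(1000, mean = 10, sd = 2))

# Histogram on the density scale: each bar height is count / (n * bin width),
# so the bar areas sum to 1.
h <- hist(data, freq = FALSE, main = "Histogram with KDE overlay")

# Kernel density estimate. With no extra arguments, density() uses a
# Gaussian kernel and the bw.nrd0 rule-of-thumb bandwidth.
kde <- density(data)

# Overlay the smooth estimate on the histogram.
lines(kde, lwd = 2)

# Check that the histogram bars integrate to 1.
total_area <- sum(h$density * diff(h$breaks))
```

Plotting with freq = FALSE puts the histogram and the density curve on the same vertical scale, which is why the overlay lines up. The two modes of the mixture, near 10 and 25, should appear as two bumps in the smooth curve.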
