The logistic model applied to the iris dataset

We begin with the simplest possible classification problem: two classes, setosa and versicolor, and a single independent variable, or feature, sepal_length. As is usually done, we encode the categorical values setosa and versicolor as the numbers 0 and 1. Using pandas, we can do the following:

df = iris.query("species == ('setosa', 'versicolor')")
y_0 = pd.Categorical(df['species']).codes
x_n = 'sepal_length'
x_0 = df[x_n].values
x_c = x_0 - x_0.mean()
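To make the effect of this preprocessing concrete, here is a minimal, self-contained sketch on a small made-up table. The `toy` data frame, its values, and the `*_toy` names are hypothetical stand-ins introduced only for illustration; they are not the real iris measurements.

```python
import pandas as pd

# Hypothetical stand-in for the iris data frame (values are made up).
toy = pd.DataFrame({
    'species': ['setosa', 'setosa', 'versicolor', 'versicolor', 'virginica'],
    'sepal_length': [5.0, 4.8, 6.4, 5.8, 7.0],
})

# Keep only the two classes of interest; the virginica row is dropped.
df_toy = toy.query("species == ('setosa', 'versicolor')")

# Categories are sorted alphabetically, so setosa -> 0 and versicolor -> 1.
y_toy = pd.Categorical(df_toy['species']).codes

# Center the feature by subtracting its mean.
x_toy = df_toy['sepal_length'].values
x_toy_c = x_toy - x_toy.mean()

print(y_toy)
print(x_toy_c)
```

The codes follow the alphabetical order of the category labels, which is why setosa maps to 0 and versicolor to 1 without any explicit mapping.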

As with other linear models, centering the data can help with the sampling. Now that we have the data in the proper format, we can finally build the model with PyMC3.
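As a reminder of what a logistic model computes, the following is a rough NumPy sketch of its deterministic part only. It is not the PyMC3 model: the parameter values `alpha` and `beta` and the array `x_c_demo` are arbitrary, hypothetical choices, whereas in a Bayesian model these parameters would be given priors and inferred from the data.

```python
import numpy as np

def logistic(z):
    """Map a real-valued linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameter values, chosen only for illustration.
alpha, beta = 0.5, 4.0

# Hypothetical centered sepal lengths, in increasing order.
x_c_demo = np.array([-0.7, -0.5, 0.0, 0.3, 0.9])

# Probability of class 1 (versicolor) for each observation.
theta = logistic(alpha + beta * x_c_demo)

# Value of the centered feature at which the probability equals 0.5.
boundary = -alpha / beta

print(theta)
print(boundary)
```

Because the feature is centered, `alpha` is the log-odds of class 1 at the mean sepal length, and the boundary is expressed on the centered scale.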

Notice how the first part of model_0 resembles ...
