Evaluating the model

Since we now have labeled data for January and February 2019, we can evaluate how the model performs each month. First, we read in the 2019 data from the database:

>>> import sqlite3
>>> import pandas as pd
>>> with sqlite3.connect('logs/logs.db') as conn:
...     # all log entries from 2019, indexed by timestamp
...     logs_2019 = pd.read_sql(
...         """
...         SELECT * 
...         FROM logs 
...         WHERE datetime BETWEEN "2019-01-01" AND "2020-01-01";
...         """, conn, parse_dates=['datetime'], index_col='datetime'
...     )
...     # labeled attacks from 2019, with start and end rounded
...     # outward to whole minutes
...     hackers_2019 = pd.read_sql(
...         """
...         SELECT * 
...         FROM attacks 
...         WHERE start BETWEEN "2019-01-01" AND "2020-01-01";
...         """, conn, parse_dates=['start', 'end']
...     ).assign(
...         start_floor=lambda x: x.start.dt.floor('min'),
...         end_ceil=lambda x: x.end.dt.ceil('min')
...     )
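To see what the start_floor and end_ceil columns contain, here is a minimal sketch that runs the same read-and-assign pattern against an in-memory SQLite database. The table contents are made up for illustration and the in-memory database is only a stand-in for logs/logs.db; the real attacks table may have more columns.

```python
import sqlite3

import pandas as pd

# Hypothetical stand-in for logs/logs.db: an in-memory database with a
# tiny attacks table. The two rows below are invented for illustration.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE attacks (start TEXT, "end" TEXT)')
conn.executemany(
    'INSERT INTO attacks VALUES (?, ?)',
    [
        ('2019-01-05 10:15:30', '2019-01-05 10:17:02'),
        ('2019-02-10 23:59:10', '2019-02-11 00:01:00'),
    ]
)

# Same pattern as above: parse the timestamp columns, then round the
# start down and the end up to the nearest whole minute.
attacks = pd.read_sql(
    'SELECT * FROM attacks', conn, parse_dates=['start', 'end']
).assign(
    start_floor=lambda x: x.start.dt.floor('min'),
    end_ceil=lambda x: x.end.dt.ceil('min')
)
conn.close()

print(attacks[['start_floor', 'end_ceil']])
```

Rounding the start down and the end up widens each attack window to whole minutes, so a window always covers every minute the attack touched. A timestamp that already falls exactly on a minute is left unchanged.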

Next, we isolate the January 2019 data:

>>> X_jan, y_jan ...
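As a general illustration of pulling one month out of time-indexed data, pandas supports partial string indexing on a DatetimeIndex. The sketch below uses a made-up DataFrame (toy_logs and its attempts column are hypothetical names) and is not necessarily how X_jan and y_jan are built here.

```python
import pandas as pd

# Made-up stand-in for logs_2019: one row per day across Jan-Feb 2019.
idx = pd.date_range('2019-01-01', '2019-02-28', freq='D', name='datetime')
toy_logs = pd.DataFrame({'attempts': range(len(idx))}, index=idx)

# Partial string indexing: a 'YYYY-MM' string selects every row whose
# timestamp falls within that month.
jan = toy_logs.loc['2019-01']
feb = toy_logs.loc['2019-02']

print(len(jan), len(feb))
```

Because the selection is driven by the index, this only works when the timestamps are the index, which is what index_col='datetime' arranges when the logs are read in.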
