Text feature extraction

In this section, we start manually creating features that quantify our textual passwords. First, let's add a new column called length to the data DataFrame; it will hold the number of characters in each password:

# 1. the length of the password

# on the left of the equal sign, note we are defining a new column called 'length'.
# We want this column to hold the length of the password.

# on the right of the equal sign, we use the apply method of pandas Series/DFs.
# We will apply a function (len in this case) to every element in the column 'text'
data['length'] = data['text'].apply(len)

# see our changes take effect
data.head()

Here is the output:

   Text          Length
0  7606374520    10
1  piontekendre  12
2
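
To try the same step without the full password dataset, here is a minimal, self-contained sketch. It assumes a tiny hand-made DataFrame: the first two passwords are taken from the output above, and the third is a made-up value added only for illustration. It also shows the pandas string accessor .str.len() as one possible alternative to apply(len); that alternative is not used in the original walkthrough.

```python
import pandas as pd

# Hypothetical sample data: the first two values appear in the output above,
# the third is invented purely for this illustration.
data = pd.DataFrame({"text": ["7606374520", "piontekendre", "abc123"]})

# Same approach as in the text: apply Python's built-in len to every password.
data["length"] = data["text"].apply(len)

# Alternative (an assumption, not part of the original): the vectorized
# string accessor computes the same lengths.
data["length_str"] = data["text"].str.len()

print(data.head())
```

On a column that contains only strings, both versions give the same lengths. They differ on missing values: apply(len) raises a TypeError on NaN, while .str.len() returns NaN for that row, so the choice may depend on how clean the text column is.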
