In this section, we will start manually creating features that quantify our textual passwords. Let's first add a new column called length to the data DataFrame, which will hold the number of characters in each password:
    # 1. the length of the password
    # on the left of the equal sign, note we are defining a new column called 'length'.
    # We want this column to hold the length of the password.
    # on the right of the equal sign, we use the apply method of pandas Series/DFs.
    # We will apply a function (len in this case) to every element in the column 'text'
    data['length'] = data['text'].apply(len)

    # see our changes take effect
    data.head()
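As a side note, apply(len) assumes that every entry in the text column is a string. The following is a minimal sketch, not part of the original walkthrough, of how the same feature could be computed with the pandas .str.len() accessor, which tolerates missing values. The toy DataFrame, the None entry, and the names toy and apply_failed are assumptions made purely for illustration.

```python
import pandas as pd

# Toy stand-in for the password data: two passwords from the text plus
# one missing entry (the missing entry is an assumption for illustration).
toy = pd.DataFrame({"text": ["7606374520", "piontekendre", None]})

# The .str accessor computes string lengths element-wise and
# leaves missing values as missing instead of raising an error.
toy["length"] = toy["text"].str.len()

# apply(len) calls Python's len on every element, so it raises a
# TypeError as soon as it reaches the missing entry.
try:
    toy["text"].apply(len)
    apply_failed = False
except TypeError:
    apply_failed = True

print(toy)
print("apply(len) failed on missing value:", apply_failed)
```

If the text column is known to contain only strings, both approaches give the same lengths, so the choice is mostly a matter of how missing values should be handled.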
The first rows of data, as returned by data.head(), now include the new column:
| | text | length |
|---|---|---|
| 0 | 7606374520 | 10 |
| 1 | piontekendre | 12 |
| 2 |