December 2018
Intermediate to advanced
318 pages
8h 28m
English
To make all of our hard work easier to use, we need to pack it all up into a single, neat function, as shown:
# remake a simple two char CV
two_cv = CountVectorizer(ngram_range=(1, 2), analyzer='char', lowercase=False)
two_char = two_cv.fit_transform(text)
two_char
# there are 7,528 unique 2-in-a-row-chars (number of columns)
<1048485x7528 sparse matrix of type '<type 'numpy.int64'>' with 14350326 stored elements in Compressed Sparse Row format>

# make a simple function using the two_char CV and matrix
def get_closest_word_similarity(password):
    raw_vectorization = cosine_similarity(two_cv.transform([password]), two_char)
    return raw_vectorization[:,np.argsort(raw_vectorization)[0,-20:]].mean()
This function makes it easier ...
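To try the idea without the full password corpus, here is a minimal, self-contained sketch of the same technique. The ten-item `corpus`, the names `vectorizer`, `corpus_matrix`, and `closest_similarity`, and the `top_k` parameter are all assumptions made for illustration; the function above averages the 20 closest matches against the full `text` dataset.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical stand-in for the `text` corpus of passwords (assumption).
corpus = ["password", "password1", "passw0rd", "qwerty", "qwerty123",
          "letmein", "dragon", "123456", "iloveyou", "monkey"]

# Same vectorizer settings as above: case-sensitive 1- and 2-character n-grams.
vectorizer = CountVectorizer(ngram_range=(1, 2), analyzer="char", lowercase=False)
corpus_matrix = vectorizer.fit_transform(corpus)

def closest_similarity(password, top_k=3):
    """Mean cosine similarity between `password` and its top_k closest corpus entries."""
    # One row of similarities: the candidate password against every corpus entry.
    sims = cosine_similarity(vectorizer.transform([password]), corpus_matrix)[0]
    # Keep only the top_k largest similarities and average them.
    return float(np.sort(sims)[-top_k:].mean())

common = closest_similarity("password123")
unusual = closest_similarity("Zx$9!vK#")
print(common, unusual)
```

A password that shares many character n-grams with the corpus should score closer to 1, while one built from rarely seen characters should score closer to 0. The value of `top_k` is a tuning choice: a larger value smooths out a single lucky match, at the cost of diluting the score when the corpus is small.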