August 2019
Intermediate to advanced
342 pages
9h 35m
English
In the following example, we will see the K-Means clustering algorithm applied to our previously created dataset of artifacts.
Remember that our dataset of artifacts contains the fields extracted from the PE file format of the individual samples, consisting of the .exe files previously stored, including both the legitimate and the suspect files.
The number of clusters that we will assign to the k parameter in the algorithm initialization phase will therefore be 2, while the features that we will select as distinctive criteria of the possible malware correspond to the MajorLinkerVersion, MajorImageVersion, MajorOperatingSystemVersion, and DllCharacteristics fields:
import numpy as np import pandas as pd import ...
Read now
Unlock full access