November 2019
Intermediate to advanced
346 pages
9h 36m
English
The code for this recipe is available at https://github.com/PacktPublishing/Machine-Learning-for-Cybersecurity-Cookbook/blob/master/Chapter02/Classifying%20Files%20by%20Type/File%20Type%20Classifier.ipynb. Using this data, we build a classifier that predicts whether a file is JavaScript, Python, or PowerShell:
import os
from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.pipeline import Pipeline

javascript_path = "/path/to/JavascriptSamples/"
...
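The imports above suggest a pipeline that hashes text into features, applies TF-IDF weighting, and trains a random forest. The following is a minimal sketch under that assumption, not the recipe's own code: the in-memory `samples` dictionary is a hypothetical stand-in for files read from the sample directories, and the values chosen for `n_features`, `ngram_range`, `n_estimators`, and `test_size` are illustrative assumptions.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Hypothetical toy snippets; a real run would read file contents from disk.
samples = {
    "JavaScript": [
        "function add(a, b) { return a + b; }",
        "var items = document.getElementById('list');",
        "console.log('hello'); var x = 10;",
        "const total = values.map(function (v) { return v * 2; });",
    ],
    "Python": [
        "def add(a, b):\n    return a + b",
        "import os\nfor name in os.listdir('.'):\n    print(name)",
        "class Point:\n    def __init__(self, x):\n        self.x = x",
        "with open('data.txt') as handle:\n    lines = handle.readlines()",
    ],
    "PowerShell": [
        "Get-ChildItem -Path C:\\Temp | Where-Object { $_.Length -gt 100 }",
        "$items = Get-Process | Select-Object -First 5",
        "Write-Host 'hello'; $x = 10",
        "foreach ($file in Get-ChildItem) { Write-Output $file.Name }",
    ],
}

# Flatten the samples into parallel lists of texts and labels.
corpus, labels = [], []
for label, snippets in samples.items():
    corpus.extend(snippets)
    labels.extend([label] * len(snippets))

# Hold out a stratified test set so every class appears in it.
X_train, X_test, y_train, y_test = train_test_split(
    corpus, labels, test_size=0.25, stratify=labels, random_state=0
)

# Hash word n-grams into a fixed-size vector, reweight with TF-IDF,
# then classify with a random forest (all parameter values are assumptions).
text_clf = Pipeline(
    [
        ("vect", HashingVectorizer(input="content", ngram_range=(1, 3), n_features=2**12)),
        ("tfidf", TfidfTransformer(use_idf=True)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ]
)
text_clf.fit(X_train, y_train)

# Evaluate on the held-out snippets.
y_pred = text_clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred, labels=list(samples))
print(accuracy)
print(matrix)
```

With only a dozen toy snippets the reported accuracy is not meaningful; the sketch only shows how the pieces connect. Wrapping the steps in a `Pipeline` means the same transformations are applied at fit and predict time without any manual bookkeeping.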