How to do it...

In the following steps, we demonstrate how a classifier can detect malware based on an observed sequence of API calls.

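  1. Our logs are in JSON format, so we begin by importing Python's json module, together with numpy and os, and listing the log directories alongside their labels (0 for benign, 1 for malware).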
import numpy as np
import os
import json

directories_with_labels = [("DA Logs Benign", 0), ("DA Logs Malware", 1)]
  2. Write a function to parse the JSON logs:
def get_API_class_method_type_from_log(log):
    """Parses out API calls from behavioral logs."""
    API_data_sequence = []
    with open(log) as log_file:
        json_log = json.load(log_file)
        api_calls_array = "[" + json_log["api_calls"] + "]"
  3. We choose to extract the class, method, and type of the API call:
        api_calls = json.loads(api_calls_array)
        for api_call in api_calls:
            data = api_call["class"] ...
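
The steps above can be sketched end to end on made-up data. This is only an illustration under stated assumptions, not the recipe's actual code: it assumes each log stores its calls as a single comma-separated string of JSON objects under "api_calls" (which is why the recipe wraps that string in brackets), and that each call carries "method" and "type" keys next to "class". The helper names parse_api_calls, collect_sequences, and write_hypothetical_log, the "class:method:type" token format, and all values in the sample logs are hypothetical.

```python
import json
import os
import tempfile


def parse_api_calls(log_path):
    """Return one 'class:method:type' token per API call in a log."""
    with open(log_path) as log_file:
        json_log = json.load(log_file)
    # Assumption: "api_calls" is a string of comma-separated JSON objects,
    # so wrapping it in brackets turns it into a valid JSON array.
    api_calls = json.loads("[" + json_log["api_calls"] + "]")
    # Assumption: the "method" and "type" key names mirror the step text.
    return [
        ":".join((call["class"], call["method"], call["type"]))
        for call in api_calls
    ]


def collect_sequences(directories_with_labels):
    """Parse every log in each directory and pair it with that directory's label."""
    sequences, labels = [], []
    for directory, label in directories_with_labels:
        for file_name in sorted(os.listdir(directory)):
            sequences.append(parse_api_calls(os.path.join(directory, file_name)))
            labels.append(label)
    return sequences, labels


def write_hypothetical_log(directory, file_name, calls):
    """Write a made-up log in the assumed layout (illustration only)."""
    api_calls_field = ",".join(json.dumps(call) for call in calls)
    with open(os.path.join(directory, file_name), "w") as log_file:
        json.dump({"api_calls": api_calls_field}, log_file)


# Build two tiny throwaway directories instead of the real log folders.
with tempfile.TemporaryDirectory() as root:
    benign_dir = os.path.join(root, "benign")
    malware_dir = os.path.join(root, "malware")
    os.makedirs(benign_dir)
    os.makedirs(malware_dir)
    write_hypothetical_log(benign_dir, "a.json", [
        {"class": "ClassA", "method": "methodA", "type": "typeA"},
    ])
    write_hypothetical_log(malware_dir, "b.json", [
        {"class": "ClassB", "method": "methodB", "type": "typeB"},
        {"class": "ClassC", "method": "methodC", "type": "typeC"},
    ])
    sequences, labels = collect_sequences([(benign_dir, 0), (malware_dir, 1)])

print(sequences)
print(labels)
```

Joining the three fields into one token per call keeps each log as an ordered sequence of strings, which is a convenient input for a sequence-based featurizer. Other encodings would work equally well.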
