November 2019
Intermediate to advanced
346 pages
9h 36m
English
For machine learning to work for vulnerability detection, you need representations of software programs that are amenable to learning. For this purpose, we use code gadgets, which are then transformed into vectors. A code gadget is a selection of lines of code that are semantically related to each other. In step 1, we collect such code gadgets for training. The image for this step shows three code gadgets along with their labels: a label of 1 indicates a vulnerability, while a label of 0 indicates no vulnerability. To extract gadgets from a program of your own, the advised approach is to use the commercial product Checkmarx to extract program slices, and then assemble those slices into code gadgets. Another dataset is available. That dataset, ...
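To make the idea of labeled code gadgets and their vector form more concrete, here is a minimal sketch. It is not the recipe's actual pipeline: the two gadgets, the names `GADGETS`, `tokenize`, `build_vocabulary`, and `vectorize`, and the simple token-count encoding are all assumptions chosen for illustration, and a real system would likely use a learned embedding instead.

```python
import re
from collections import Counter

# Hypothetical labeled gadgets: (lines of code, label).
# A label of 1 marks a vulnerability, 0 marks no vulnerability.
GADGETS = [
    (["char buf[10];", "strcpy(buf, user_input);"], 1),
    (["char buf[10];", "strncpy(buf, user_input, sizeof(buf) - 1);"], 0),
]

# Identifiers/keywords, integer literals, or single punctuation characters.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def tokenize(gadget_lines):
    """Split a gadget's lines into a flat list of tokens."""
    return [tok for line in gadget_lines for tok in TOKEN_RE.findall(line)]


def build_vocabulary(gadgets):
    """Map each distinct token to an index, in order of first appearance."""
    vocab = {}
    for lines, _label in gadgets:
        for tok in tokenize(lines):
            vocab.setdefault(tok, len(vocab))
    return vocab


def vectorize(gadget_lines, vocab):
    """Return a token-count vector with one entry per vocabulary token.

    Tokens that are not in the vocabulary are ignored.
    """
    counts = Counter(tokenize(gadget_lines))
    # dicts preserve insertion order, so position i corresponds to index i.
    return [counts.get(tok, 0) for tok in vocab]


if __name__ == "__main__":
    vocab = build_vocabulary(GADGETS)
    for lines, label in GADGETS:
        print(label, vectorize(lines, vocab))
```

Each gadget becomes a fixed-length list of integers paired with its 0/1 label, which is the general shape a classifier expects. The count-based encoding discards token order, so it only illustrates the gadget-to-vector step rather than a representation suited to sequence models.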