6 A Comparison of Machine Learning and Deep Learning Models with Advanced Word Embeddings: The Case of Internal Audit Reports

Gustavo FLEURY SOARES¹ and Induraj PUDHUPATTU RAMAMURTHY²

¹ Brazilian Office of the Comptroller General (CGU), Brazil

² CYTech, Cergy, France, and University of India, India

When conducting an audit, using all the available information about the audit universe or subject could improve the quality of the results. Classifying the audit's text documents (unstructured data) could make additional information available to enrich existing structured data, leading to better knowledge to support the audit process. Natural language processing (NLP) could provide better automated support for this kind of knowledge discovery. This chapter compares the results of classical machine learning and deep learning algorithms, each combined with advanced word embeddings, for classifying the findings of internal audit reports.

6.1. Introduction

Internal auditing, as defined by the IIA (2012), is “an independent, objective assurance and consulting activity designed to add value and improve an organization’s operations”. To achieve this, the internal auditor endeavors to collect all available information relating to the organization, whether internal or external. This information can be structured or unstructured data, which the auditor analyzes in order to provide useful insights. Considering the volume of data created nowadays, it is crucial for ...
