January 2024
Intermediate to advanced
544 pages
15h 21m
English
When I get my hands on a new dataset, the first thing I do is search it for any juicy, easy-to-find revelations. Depending on the dataset, I might look for politicians, organizations, or the city where I live. In the previous chapter, you learned to search text files like CSV or JSON files using grep, but grep won’t work on binary files like PDFs or Office documents. In this chapter, you’ll expand your search capabilities with Aleph, an open source investigation tool.
Aleph is developed by the Organized Crime and Corruption Reporting Project, a group of investigative journalists largely based in eastern Europe and central Asia. The tool allows you to index datasets, extracting all the text ...
Read now
Unlock full access