Learning Python for Forensics - Second Edition

Design, develop, and deploy innovative forensic solutions using Python

Key Features

  • Discover how to develop Python scripts for effective digital forensic analysis
  • Master the skills of parsing complex data structures with Python libraries
  • Solve forensic challenges through the development of practical Python scripts

Book Description

Digital forensics plays an integral role in solving complex cybercrimes and helping organizations make sense of cybersecurity incidents. This second edition of Learning Python for Forensics illustrates how Python can support these digital investigations and lets examiners automate the parsing of forensic artifacts so they can spend more time examining actionable data.

The book illustrates how to develop Python scripts using an iterative design and demonstrates how to leverage the built-in and community-sourced forensics scripts and libraries available for Python today. It will help strengthen your analysis skills and efficiency as you creatively solve real-world problems through instruction-based tutorials.

By the end of this book, you will have built a collection of Python scripts capable of investigating an array of forensic artifacts and mastered the skills of extracting metadata and parsing complex data structures into actionable reports. Most importantly, you will have developed a foundation upon which to build as you continue to learn Python and enhance your efficacy as an investigator.

What you will learn

  • Learn how to develop Python scripts to solve complex forensic problems
  • Build scripts using an iterative design
  • Design code to accommodate present and future hurdles
  • Leverage built-in and community-sourced libraries
  • Understand the best practices in forensic programming
  • Learn how to transform raw data into customized reports and visualizations
  • Create forensic frameworks to automate analysis of multiple forensic artifacts
  • Conduct effective and efficient investigations through programmatic processing

Who this book is for

If you are a forensics student, hobbyist, or professional seeking to increase your understanding of forensics through the use of a programming language, then Learning Python for Forensics is for you. You do not need previous programming experience to learn and master the content within this book. This material, created by forensic professionals, was written with a unique perspective and an understanding of examiners who wish to learn programming.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Learning Python for Forensics Second Edition
  3. About Packt
    1. Why subscribe?
    2. Packt.com
  4. Contributors
    1. About the authors
    2. About the reviewer
    3. Packt is searching for authors like you
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  6. Now for Something Completely Different
    1. When to use Python
      1. Development life cycle
    2. Getting started
    3. The omnipresent print() function
    4. Standard data types
      1. Strings and Unicode
      2. Integers and floats
      3. Boolean and none
      4. Structured data types
        1. Lists
        2. Dictionaries
        3. Sets and tuples
    5. Data type conversions
    6. Files
    7. Variables
    8. Understanding scripting flow logic
      1. Conditionals
      2. Loops
        1. The for loop
        2. The while loop
    9. Functions
    10. Summary
  7. Python Fundamentals
    1. Advanced data types and functions
      1. Iterators
      2. datetime objects
    2. Libraries
      1. Installing third-party libraries
      2. Libraries in this book
      3. Python packages
    3. Classes and object-oriented programming
    4. Try and except
      1. The raise function
    5. Creating our first script – unix_converter.py
    6. User input
      1. Using the raw input method and the system module – user_input.py
      2. Understanding Argparse – argument_parser.py
    7. Forensic scripting best practices
    8. Developing our first forensic script – usb_lookup.py
      1. Understanding the main() function
      2. Interpreting the search_key() function
      3. Running our first forensic script
    9. Troubleshooting
    10. Challenge
    11. Summary
  8. Parsing Text Files
    1. Setup API
    2. Introducing our script
      1. Overview
    3. Our first iteration – setupapi_parser_v1.py
      1. Designing the main() function
      2. Crafting the parse_setupapi() function
      3. Developing the print_output() function
      4. Running the script
    4. Our second iteration – setupapi_parser_v2.py
      1. Improving the main() function
      2. Tuning the parse_setupapi() function
      3. Modifying the print_output() function
      4. Running the script
    5. Our final iteration – setupapi_parser.py
      1. Extending the main() function
      2. Adding to the parse_setupapi() function
      3. Creating the parse_device_info() function
      4. Forming the prep_usb_lookup() function
      5. Constructing the get_device_names() function
      6. Enhancing the print_output() function
      7. Running the script
    6. Challenge
    7. Summary
  9. Working with Serialized Data Structures
    1. Serialized data structures
    2. A simple Bitcoin web API
    3. Our first iteration – bitcoin_address_lookup.v1.py
      1. Exploring the main() function
      2. Understanding the get_address() function
      3. Working with the print_transactions() function
      4. The print_header() helper function
      5. The get_inputs() helper function
      6. Running the script
    4. Our second iteration – bitcoin_address_lookup.v2.py
      1. Modifying the main() function
      2. Improving the get_address() function
      3. Elaborating on the print_transactions() function
      4. Running the script
    5. Mastering our final iteration – bitcoin_address_lookup.py
      1. Enhancing the parse_transactions() function
      2. Developing the csv_writer() function
      3. Running the script
      4. Challenge
    6. Summary
  10. Databases in Python
    1. An overview of databases
      1. Using SQLite3
      2. Using SQL
    2. Designing our script
    3. Manually manipulating databases with Python – file_lister.py
      1. Building the main() function
      2. Initializing the database with the init_db() function
      3. Checking for custodians with the get_or_add_custodian() function
      4. Retrieving custodians with the get_custodian() function
      5. Understanding the ingest_directory() function
        1. Exploring the os.stat() method
      6. Developing the format_timestamp() helper function
      7. Configuring the write_output() function
      8. Designing the write_csv() function
      9. Composing the write_html() function
      10. Running the script
    4. Automating databases further – file_lister_peewee.py
      1. Peewee setup
      2. Jinja2 setup
      3. Updating the main() function
      4. Adjusting the init_db() function
      5. Modifying the get_or_add_custodian() function
      6. Improving the ingest_directory() function
      7. A closer look at the format_timestamp() function
      8. Converting the write_output() function
      9. Simplifying the write_csv() function
      10. Condensing the write_html() function
      11. Running our new and improved script
    5. Challenge
    6. Summary
  11. Extracting Artifacts from Binary Files
    1. UserAssist
      1. Understanding the ROT-13 substitution cipher – rot13.py
      2. Evaluating code with timeit
    2. Working with the yarp library
    3. Introducing the struct module
    4. Creating spreadsheets with the xlsxwriter module
      1. Adding data to a spreadsheet
      2. Building a table
      3. Creating charts with Python
    5. The UserAssist framework
      1. Developing our UserAssist logic processor – userassist_parser.py
        1. Evaluating the main() function
        2. Defining the create_dictionary() function
        3. Extracting data with the parse_values() function
        4. Processing strings with the get_name() function
      2. Writing Excel spreadsheets – xlsx_writer.py
        1. Controlling output with the excel_writer() function
        2. Summarizing data with the dashboard_writer() function
        3. Writing artifacts in the userassist_writer() function
        4. Defining the file_time() function
        5. Processing integers with the sort_by_count() function
        6. Processing datetime objects with the sort_by_date() function
      3. Writing generic spreadsheets – csv_writer.py
        1. Understanding the csv_writer() function
    6. Running the UserAssist framework
    7. Challenge
    8. Summary
  12. Fuzzy Hashing
    1. Background on hashing
      1. Hashing files in Python
      2. Hashing large files – hashing_example.py
    2. Creating fuzzy hashes
      1. Context Triggered Piecewise Hashing (CTPH)
      2. Implementing fuzzy_hasher.py
      3. Starting with the main() function
      4. Creating our fuzzy hashes
        1. Generating our rolling hash
        2. Preparing signature generation
      5. Providing the output
      6. Running fuzzy_hasher.py
    3. Using ssdeep in Python – ssdeep_python.py
      1. Revisiting the main() function
      2. Redesigning our output() function
      3. Running ssdeep_python.py
    4. Additional challenges
    5. References
    6. Summary
  13. The Media Age
    1. Creating frameworks in Python
    2. Introduction to EXIF metadata
      1. Introducing the Pillow module
    3. Introduction to ID3 metadata
      1. Introducing the Mutagen module
    4. Introduction to Office metadata
      1. Introducing the lxml module
    5. The Metadata_Parser framework overview
      1. Our main framework controller – metadata_parser.py
      2. Controlling our framework with the main() function
    6. Parsing EXIF metadata – exif_parser.py
      1. Understanding the exif_parser() function
      2. Developing the get_tags() function
      3. Adding the dms_to_decimal() function
    7. Parsing ID3 metadata – id3_parser.py
      1. Understanding the id3_parser() function
      2. Revisiting the get_tags() function
    8. Parsing Office metadata – office_parser.py
      1. Evaluating the office_parser() function
      2. The get_tags() function for the last time
    9. Moving on to our writers
      1. Writing spreadsheets – csv_writer.py
      2. Plotting GPS data with Google Earth – kml_writer.py
      3. Supporting our framework with processors
        1. Creating framework-wide utility functions – utility.py
    10. Framework summary
    11. Additional challenges
    12. Summary
  14. Uncovering Time
    1. About timestamps
      1. What's an epoch?
    2. Using a GUI
      1. Basics of TkInter objects
        1. Implementing the TkInter GUI
        2. Using frame objects
      2. Using classes in TkInter
    3. Developing the date decoder GUI – date_decoder.py
      1. The DateDecoder class setup and __init__() method
      2. Executing the run() method
      3. Implementing the build_input_frame() method
      4. Creating the build_output_frame() method
      5. Building the convert() method
      6. Defining the convert_unix_seconds() method
      7. Conversion using the convert_win_filetime_64() method
      8. Converting with the convert_chrome_time() method
      9. Designing the output method
      10. Running the script
    4. Additional challenges
    5. Summary
  15. Rapidly Triaging Systems
    1. Understanding the value of system information
      1. Querying OS-agnostic process information with psutil
      2. Using WMI
        1. What does the pywin32 module do?
    2. Rapidly triaging systems – pysysinfo.py
      1. Understanding the get_process_info() function
      2. Learning about the get_pid_details() function
      3. Extracting process connection properties with the read_proc_connections() function
      4. Obtaining more process information with the read_proc_files() function
      5. Extracting Windows system information with the wmi_info() function
      6. Writing our results with the csv_writer() function
    3. Executing pysysinfo.py
    4. Challenges
    5. Summary
  16. Parsing Outlook PST Containers
    1. The PST file format
    2. An introduction to libpff
      1. How to install libpff and pypff
    3. Exploring PSTs – pst_indexer.py
      1. An overview
      2. Developing the main() function
      3. Evaluating the make_path() helper function
      4. Iteration with the folder_traverse() function
      5. Identifying messages with the check_for_msgs() function
      6. Processing messages in the process_msg() function
      7. Summarizing data in the folder_report() function
      8. Understanding the word_stats() function
      9. Creating the word_report() function
      10. Building the sender_report() function
      11. Refining the heat map with the date_report() function
      12. Writing the html_report() function
      13. The HTML template
    4. Running the script
    5. Additional challenges
    6. Summary
  17. Recovering Transient Database Records
    1. SQLite WAL files
      1. WAL format and technical specifications
        1. The WAL header
        2. The WAL frame
      2. The WAL cell and varints
      3. Manipulating large objects in Python
    2. Regular expressions in Python
    3. TQDM – a simpler progress bar
    4. Parsing WAL files – wal_crawler.py
      1. Understanding the main() function
      2. Developing the frame_parser() function
      3. Processing cells with the cell_parser() function
      4. Writing the dict_helper() function
        1. The Python debugger – pdb
      5. Processing varints with the single_varint() function
      6. Processing varints with the multi_varint() function
      7. Converting serial types with the type_helper() function
      8. Writing output with the csv_writer() function
      9. Using regular expressions in the regular_search() function
    5. Executing wal_crawler.py
    6. Challenge
    7. Summary
  18. Coming Full Circle
    1. Frameworks
      1. Building a framework to last
      2. Data standardization
      3. Forensic frameworks
    2. Colorama
    3. FIGlet
    4. Exploring the framework – framework.py
      1. Exploring the Framework object
        1. Understanding the Framework __init__() constructor
        2. Creating the Framework run() method
        3. Iterating through files with the Framework _list_files() method
        4. Developing the Framework _run_plugins() method
      2. Exploring the Plugin object
        1. Understanding the Plugin __init__() constructor
        2. Working with the Plugin run() method
        3. Handling output with the Plugin write() method
      3. Exploring the Writer object
        1. Understanding the Writer __init__() constructor
        2. Understanding the Writer run() method
      4. Our final CSV writer – csv_writer.py
      5. The writer – xlsx_writer.py
      6. Changes made to plugins
      7. Executing the framework
    5. Additional challenges
      1. Summary
  19. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: Learning Python for Forensics - Second Edition
  • Author(s): Preston Miller, Chapin Bryce
  • Release date: January 2019
  • Publisher(s): Packt Publishing
  • ISBN: 9781789341690