Python for Excel

Book description

While Excel remains ubiquitous in the business world, recent Microsoft feedback forums are full of requests to include Python as an Excel scripting language-in fact, it's the top feature requested. What makes this combination so compelling? In this hands-on guide, Felix Zumstein- creator of xlwings, a popular open source package that allows the automation of Excel with Python-shows experienced Excel users how to integrate these two worlds efficiently.

Excel has added quite a few new capabilities over the past couple of years, but its automation language, VBA, stopped evolving a long time ago. Many Excel power users have already adopted Python for their daily automation tasks. This guide gets you started.

  • Use Python without previous programming knowledge
  • Get started with modern tools, including Jupyter notebooks and Visual Studio Code
  • Use pandas to acquire, clean, and analyze data and replace typical Excel calculations
  • Automate tedious tasks like consolidation of Excel workbooks and production of Excel reports
  • Use xlwings to build interactive Excel tools that use Python as a calculation engine
  • Connect Excel to databases and CSV files and fetch data from the internet using Python code
  • Write unit tests in Python to build solid spreadsheets

Publisher resources

View/Submit Errata

Table of contents

  1. Preface
    1. Why I Wrote This Book
    2. Who This Book Is For
    3. How This Book Is Organized
      1. Manipulating Workbooks vs. Programming Excel
    4. Conventions Used in This Book
    5. Using Code Examples
    6. O’Reilly Online Learning
    7. How to Contact Us
  2. 1. Why Python for Excel?
    1. Introduction
      1. How it All Began
      2. Excel Is a Programming Language
      3. Excel in the News
    2. Programming Best Practices
      1. Separation of Concerns
      2. Readability
      3. DRY Principle
      4. Testing
      5. Version Control
    3. Modern Excel
      1. Power Query and Power Pivot
      2. Power BI
    4. Python for Excel
      1. Readability
      2. Standard Library
      3. Package Manager
      4. Scientific Computing
      5. Modern Language Features
      6. Cross-Platform Compatibility
    5. Conclusion
  3. 2. Development Environment
    1. Command Line
      1. File Extensions
      2. Running Commands
    2. The Anaconda Python Distribution
      1. Installation
      2. Anaconda Prompt
      3. Python REPL: An Interactive Python Session
      4. Package Managers: Conda and pip
      5. Conda Environments
    3. Jupyter Notebooks
      1. A First Notebook
      2. Notebook Cells
      3. Edit vs. Command Mode
      4. Magic Commands
      5. Run Order Matters
      6. Shutting Down a Notebook
      7. Jupyter CLI
      8. Cloud Solutions
    4. Visual Studio Code
      1. Installation
      2. Configuration
      3. A First Script
      4. Run the Script
      5. Debugging
      6. Jupyter Notebooks in VS Code
    5. Conclusion
  4. 3. Introduction to Python
    1. Python Preliminaries
      1. Object-Oriented Programming (OOP)
      2. Comments
      3. Variables
      4. Line Continuation
      5. None
      6. Functions (Preview)
    2. Data Types
      1. Numeric Types
      2. Booleans
      3. Strings
    3. Indexing and Slicing
      1. Indexing
      2. Slicing
      3. Chaining
    4. Data Structures
      1. Lists
      2. Dictionaries
      3. Tuples
      4. Sets
      5. Mutable vs. Immutable Objects
    5. Control Flow
      1. Code Blocks and the pass Statement
      2. If Statement and Conditional Expressions
      3. For Loop
      4. While Loop
      5. List, Dictionary and Set Comprehensions
    6. Code Organization
      1. Functions
      2. Modules and the Import Statement
      3. Classes and Objects
      4. Date and Time
    7. Error Handling and File Manipulation
      1. Exceptions
      2. Files and the with Statement
    8. PEP 8: Style Guide for Python Code
      1. PEP 8 and VS Code
      2. Type Hints
    9. Conclusion
  5. 4. Data Analysis with pandas
    1. Foundations: NumPy
      1. NumPy Array
      2. Vectorization and Broadcasting
      3. Universal Functions (ufunc)
      4. Getting and Setting Array Elements
      5. Useful Array Constructors
      6. View vs. Copy
      7. Limitations with NumPy
    2. DataFrame and Series
      1. Index
      2. Columns
    3. Data Manipulation
      1. Selecting Data
      2. Setting Data
      3. Missing Data
      4. Duplicate Data
      5. Arithmetic Operations
      6. Working with Text Columns
      7. Applying a Function
      8. View vs. Copy
    4. Combining DataFrames
      1. Concatenating
      2. Joining and Merging
    5. Descriptive Statistics and Data Aggregation
      1. Descriptive Statistics
      2. Grouping
      3. Pivoting and Melting
    6. Plotting
      1. Matplotlib
      2. Plotly
      3. Other Plotting Libraries
    7. Data Import and Export
      1. Importing CSV files
      2. Exporting CSV files
    8. Time Series
      1. DatetimeIndex
      2. Working with Time Zones
      3. Shifting and Percentage Changes
      4. Rebasing and Correlation
      5. Resampling
      6. Rolling Windows
    9. Limitations with pandas
    10. Conclusion
  6. 5. Reading and Writing Excel Files
    1. Using pandas with Excel Files
      1. Case Study: Excel Reporting
      2. Reading Excel Files with pandas
      3. Writing Excel Files with pandas
      4. Limitations when using pandas with Excel Files
    2. Reader and Writer Packages
      1. When to Use which Package
      2. The excel.py Module
      3. OpenPyXL
      4. XlsxWriter
      5. pyxlsb
      6. xlrd, xlwt and xlutils
    3. Advanced Topics
      1. Working with Big Excel Files
      2. Formatting DataFrames in Excel
      3. Case Study (Revisited): Excel Reporting
    4. Conclusion
  7. 6. Excel Automation
    1. Getting Started with xlwings
      1. Using Excel as Data Viewer
      2. The Excel Object Model
      3. Running VBA Code
    2. Converters, Options and Collections
      1. Working with DataFrames
      2. Converters and Options
      3. Charts, Pictures and Defined Names
      4. Case Study (Re-Revisited): Excel Reporting
    3. Advanced xlwings Topics
      1. xlwings Foundations
      2. Performance
      3. How to Work Around Missing Functionality
    4. Conclusion
  8. 7. Python-Powered Excel Tools
    1. Using Excel as Frontend with xlwings
      1. Excel Add-in
      2. Quickstart Command
      3. Run Main
      4. RunPython Function
    2. Case Study: Python Package Tracker
      1. What We Will Build
      2. Application Structure
      3. Introduction to REST APIs
      4. Introduction to Databases
      5. Frontend
      6. Backend
      7. Debugging
    3. Deployment
      1. Python Dependency
      2. Standalone Workbooks
      3. Configuration Hierarchy
      4. Settings
    4. Conclusion
  9. 8. User-Defined Functions (UDFs)
    1. Getting Started with UDFs
      1. UDF Quickstart
    2. Case Study: Google Trends
      1. Introduction to Google Trends
      2. Working with DataFrames and Dynamic Arrays
      3. Fetching Data from Google Trends
      4. Plotting with UDFs
      5. Debugging UDFs
    3. Advanced UDF Topics
      1. Basic Performance Optimization
      2. Caching
      3. The Sub Decorator
    4. Conclusion
  10. A. Conda Environments
    1. Create a new Conda environment
    2. Disable Auto Activation
    3. Conclusion
  11. B. Mutable vs. Immutable Python Objects
    1. A look behind the scenes
    2. Shallow vs. Deep Copy
    3. Calling Functions with Mutable Objects as Arguments
    4. Functions with Mutable Objects as Default Arguments
    5. Conclusion
  12. C. Code for Reading and Writing Excel Files
    1. The excel.py Module
    2. Reading Excel Sheets in Parallel
      1. Reading Sheets in Parallel with pandas
      2. Reading Sheets in Parallel with OpenPyXL
      3. Reading Sheets in Parallel with xlrd
    3. Case Study (Revisited)
      1. Excel Reporting with pandas and OpenPyXL
      2. Excel Reporting with pandas and XlsxWriter

Product information

  • Title: Python for Excel
  • Author(s): Felix Zumstein
  • Release date: April 2021
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781492081005