Book description
How do you take your data analysis skills beyond Excel to the next level? By learning just enough Python to get stuff done. This hands-on guide shows non-programmers like you how to process information that’s initially too messy or difficult to access. You don't need to know a thing about the Python programming language to get started.
Through various step-by-step exercises, you’ll learn how to acquire, clean, analyze, and present data efficiently. You’ll also discover how to automate your data process, schedule file-editing and clean-up tasks, process larger datasets, and create compelling stories with data you obtain.
- Quickly learn basic Python syntax, data types, and language concepts
- Work with both machine-readable and human-consumable data
- Scrape websites and APIs to find a bounty of useful information
- Clean and format data to eliminate duplicates and errors in your datasets
- Learn when to standardize data and when to test and script data cleanup
- Explore and analyze your datasets with new Python libraries and techniques
- Use Python solutions to automate your entire data-wrangling process
Table of contents
- Preface
- 1. Introduction to Python
- 2. Python Basics
- 3. Data Meant to Be Read by Machines
- 4. Working with Excel Files
- 5. PDFs and Problem Solving in Python
- 6. Acquiring and Storing Data
- 7. Data Cleanup: Investigation, Matching, and Formatting
- 8. Data Cleanup: Standardizing and Scripting
- 9. Data Exploration and Analysis
- 10. Presenting Your Data
- 11. Web Scraping: Acquiring and Storing Data from the Web
- 12. Advanced Web Scraping: Screen Scrapers and Spiders
- 13. APIs
- 14. Automation and Scaling
- 15. Conclusion
- A. Comparison of Languages Mentioned
- B. Python Resources for Beginners
- C. Learning the Command Line
- D. Advanced Python Setup
  - Step 1: Install GCC
  - Step 2: (Mac Only) Install Homebrew
  - Step 3: (Mac Only) Tell Your System Where to Find Homebrew
  - Step 4: Install Python 2.7
  - Step 5: Install virtualenv (Windows, Mac, Linux)
  - Step 6: Set Up a New Directory
  - Step 7: Install virtualenvwrapper
  - Learning About Our New Environment (Windows, Mac, Linux)
  - Advanced Setup Review
- E. Python Gotchas
  - Hail the Whitespace
  - The Dreaded GIL
  - = Versus == Versus is, and When to Just Copy
  - Default Function Arguments
  - Python Scope and Built-Ins: The Importance of Variable Names
  - Defining Objects Versus Modifying Objects
  - Changing Immutable Objects
  - Type Checking
  - Catching Multiple Exceptions
  - The Power of Debugging
- F. IPython Hints
- G. Using Amazon Web Services
- Index
Product information
- Title: Data Wrangling with Python
- Author(s):
- Release date: February 2016
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491948811