Book description
Learn how to apply powerful data analysis techniques with popular open source Python modules
In Detail
Python is a multi-paradigm programming language well suited to both object-oriented application development and functional design patterns. It has become the language of choice for data scientists working on data analysis, visualization, and machine learning, and it lets you develop quickly and stay productive.
This book will teach novices about data analysis with Python in the broadest sense possible, covering everything from data retrieval, cleaning, manipulation, visualization, and storage to complex analysis and modeling. It focuses on open source Python modules such as NumPy, SciPy, matplotlib, pandas, IPython, Cython, scikit-learn, and NLTK. Later chapters cover data visualization, signal processing and time-series analysis, databases, predictive analytics, and machine learning. This book will turn you into an ace data analyst in no time.
What You Will Learn
- Install open source Python modules on various platforms
- Get to know the fundamentals of NumPy, including arrays
- Manipulate data with pandas
- Retrieve, process, store, and visualize data
- Understand signal processing and time-series data analysis
- Work with relational and NoSQL databases
- Discover more about data modeling and machine learning
- Get to grips with interoperability and cloud computing
Table of contents
- Python Data Analysis
- Table of Contents
- Python Data Analysis
- Credits
- About the Author
- About the Reviewers
- www.PacktPub.com
- Preface
- 1. Getting Started with Python Libraries
- 2. NumPy Arrays
  - The NumPy array object
  - Creating a multidimensional array
  - Selecting NumPy array elements
  - NumPy numerical types
  - One-dimensional slicing and indexing
  - Manipulating array shapes
  - Creating array views and copies
  - Fancy indexing
  - Indexing with a list of locations
  - Indexing NumPy arrays with Booleans
  - Broadcasting NumPy arrays
  - Summary
- 3. Statistics and Linear Algebra
- 4. pandas Primer
  - Installing and exploring pandas
  - pandas DataFrames
  - pandas Series
  - Querying data in pandas
  - Statistics with pandas DataFrames
  - Data aggregation with pandas DataFrames
  - Concatenating and appending DataFrames
  - Joining DataFrames
  - Handling missing values
  - Dealing with dates
  - Pivot tables
  - Remote data access
  - Summary
- 5. Retrieving, Processing, and Storing Data
  - Writing CSV files with NumPy and pandas
  - Comparing the NumPy .npy binary format and pickling pandas DataFrames
  - Storing data with PyTables
  - Reading and writing pandas DataFrames to HDF5 stores
  - Reading and writing to Excel with pandas
  - Using REST web services and JSON
  - Reading and writing JSON with pandas
  - Parsing RSS and Atom feeds
  - Parsing HTML with Beautiful Soup
  - Summary
- 6. Data Visualization
- 7. Signal Processing and Time Series
- 8. Working with Databases
- 9. Analyzing Textual Data and Social Media
- 10. Predictive Analytics and Machine Learning
- 11. Environments Outside the Python Ecosystem and Cloud Computing
- 12. Performance Tuning, Profiling, and Concurrency
- A. Key Concepts
- B. Useful Functions
- C. Online Resources
- Index
Product information
- Title: Python Data Analysis
- Author(s):
- Release date: October 2014
- Publisher(s): Packt Publishing
- ISBN: 9781783553358