Data Exploration in Python

Released November 2015

Publisher(s): O'Reilly Media, Inc.

ISBN: 9781491938324

Video description

If you're a fledgling data scientist with only cursory statistical training and little experience with real-world data sets, you may feel like you're stumbling around in the dark when you're asked to interpret and present data to decision makers. How do you validate the data? What analytic model should you use? How do you differentiate between correlation and causation? How do you ensure that your data is solid and your conclusions are on target?

Allen Downey, Professor of Computer Science at Olin College of Engineering, author of Think Stats, Think Python, and Think Complexity, provides safe passage around the common pitfalls of exploratory data analysis, so you can manage, analyze, and present data with confidence.

Learn the fundamental tools and methodologies used in data science
Discover best practices regarding the ETL (Extract, Transform, and Load) process and data validation
Use the open science framework: practice version control, replication, and data pipelining
Grasp the effectiveness of CDFs (cumulative distribution functions) in visualizing distributions
Choose the correct analytic model for your data
Comprehend statistical inference, effect size, confidence intervals, and hypothesis testing
Discern the relationship between variables: understand scatter plots and scatter plot alternatives
Understand correlation, linear least squares, linear regression, and logistic regression
Master the Zen of testing your data and your conclusions
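
As a rough illustration of the CDF topic listed above, the sketch below builds an empirical cumulative distribution function using only the Python standard library. It is not taken from the course; the helper name empirical_cdf and the sample values are hypothetical, chosen only to show the idea that a CDF maps each value to the fraction of observations at or below it.

```python
from bisect import bisect_right


def empirical_cdf(values):
    """Return a function cdf(x) giving the fraction of values <= x.

    Hypothetical helper for illustration; not part of the course material.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one value")

    def cdf(x):
        # bisect_right returns how many sorted values are <= x
        return bisect_right(ordered, x) / n

    return cdf


# Made-up sample, used only to exercise the function
sample = [2, 3, 3, 5, 8, 13]
cdf = empirical_cdf(sample)

for x in (1, 3, 8, 13):
    print(x, cdf(x))
```

Plotting cdf(x) against x for the sorted sample would give the step-shaped curve usually meant by a CDF plot; comparing two such curves on the same axes is one common way to compare distributions without choosing histogram bins.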
