Book description
Learn to effectively manage data and execute data science projects from start to finish using Python
Key Features
- Understand and utilize data science tools in Python, such as specialized machine learning algorithms and statistical modeling
- Build a strong data science foundation with the best data science tools available in Python
- Add value to yourself, your organization, and society by extracting actionable insights from raw data
Book Description
Practical Data Science with Python teaches core data science concepts through real-world and realistic examples. It strengthens your grasp of both basic and advanced principles of data preparation and storage, statistics, probability theory, machine learning, and Python programming, giving you a solid foundation for becoming proficient in data science.
The book starts with an overview of basic Python skills and then introduces foundational data science techniques, followed by a thorough explanation of the Python code needed to execute the techniques. You'll understand the code by working through the examples. The code has been broken down into small chunks (a few lines or a function at a time) to enable thorough discussion.
As you progress, you will learn how to perform data analysis while exploring the functionalities of key data science Python packages, including pandas, SciPy, and scikit-learn. Finally, the book covers ethics and privacy concerns in data science and suggests resources for improving data science skills, as well as ways to stay up to date on new data science developments.
By the end of the book, you should be able to comfortably use Python for basic data science projects and should have the skills to execute the data science process on any data source.
What you will learn
- Use Python data science packages effectively
- Clean and prepare data for data science work, including feature engineering and feature selection
- Model data using classic statistical methods (such as t-tests) and essential machine learning algorithms, such as random forests and boosted models
- Evaluate model performance
- Compare and understand different machine learning methods
- Interact with Excel spreadsheets through Python
- Create automated data science reports through Python
- Get to grips with text analytics techniques
Who this book is for
The book is intended for beginners, including students starting or about to start a data science, analytics, or related program (e.g. Bachelor's, Master's, bootcamp, online courses), recent college graduates who want to learn new skills to set them apart in the job market, professionals who want to learn hands-on data science techniques in Python, and those who want to shift their career to data science.
The book requires basic familiarity with Python. A "getting started with Python" section has been included to get complete novices up to speed.
Table of contents
- Preface
- Part I - An Introduction and the Basics
- Introduction to Data Science
- Getting Started with Python
- Part II - Dealing with Data
- SQL and Built-in File Handling Modules in Python
- Loading and Wrangling Data with Pandas and NumPy
  - Data wrangling and analyzing iTunes data
    - Loading and saving data with Pandas
    - Exploratory Data Analysis (EDA) and basic data cleaning with Pandas
    - Cleaning data
    - Filtering DataFrames
    - Data transformations
    - Using replace, map, and apply to clean and transform data
    - Using GroupBy
    - Writing DataFrames to disk
  - Wrangling and analyzing Bitcoin price data
  - Understanding NumPy basics
  - Using NumPy mathematical functions
  - Test your knowledge
  - Summary
- Exploratory Data Analysis and Visualization
- Data Wrangling Documents and Spreadsheets
- Web Scraping
- Part III - Statistics for Data Science
- Probability, Distributions, and Sampling
  - Probability basics
  - Distributions
    - The normal distribution and using scipy to generate distributions
    - Descriptive statistics of distributions
    - Fitting distributions to data to get parameters
    - The Student's t-distribution
    - The Bernoulli distribution
    - The binomial distribution
    - The uniform distribution
    - The exponential and Poisson distributions
    - The Weibull distribution
    - The Zipfian distribution
  - Sampling from data
  - Test your knowledge
  - Summary
- Statistical Testing for Data Science
- Part IV - Machine Learning
- Preparing Data for Machine Learning: Feature Selection, Feature Engineering, and Dimensionality Reduction
  - Types of machine learning
  - Feature selection
    - The curse of dimensionality
    - Overfitting and underfitting, and the bias-variance trade-off
    - Methods for feature selection
    - Variance thresholding – removing features with too much and too little variance
    - Univariate statistics feature selection
    - Mutual information score and chi-squared
    - Using the univariate statistics for feature selection
  - Feature engineering
  - Dimensionality reduction
  - Test your knowledge
  - Summary
- Machine Learning for Classification
- Evaluating Machine Learning Classification Models and Sampling for Classification
- Machine Learning with Regression
- Optimizing Models and Using AutoML
- Tree-Based Machine Learning Models
- Support Vector Machine (SVM) Machine Learning Models
- Part V - Text Analysis and Reporting
- Clustering with Machine Learning
- Working with Text
- Part VI - Wrapping Up
- Data Storytelling and Automated Reporting/Dashboarding
- Ethics and Privacy
- Staying Up to Date and the Future of Data Science
- Other Books You May Enjoy
- Index
Product information
- Title: Practical Data Science with Python
- Author(s):
- Release date: September 2021
- Publisher(s): Packt Publishing
- ISBN: 9781801071970