Skip to Content
Data Science Revealed: With Feature Engineering, Data Visualization, Pipeline Development, and Hyperparameter Tuning
book

Data Science Revealed: With Feature Engineering, Data Visualization, Pipeline Development, and Hyperparameter Tuning

by Tshepo Chris Nokeri
March 2021
Beginner to intermediate
261 pages
4h 7m
English
Apress

Overview

Get insight into data science techniques such as data engineering and visualization, statistical modeling, machine learning, and deep learning. This book teaches you how to select variables, optimize hyper parameters, develop pipelines, and train, test, and validate machine and deep learning models. Each chapter includes a set of examples allowing you to understand the concepts, assumptions, and procedures behind each model.

The book covers parametric methods or linear models that combat under- or over-fitting using techniques such as Lasso and Ridge. It includes complex regression analysis with time series smoothing, decomposition, and forecasting. It takes a fresh look at non-parametric models for binary classification (logistic regression analysis) and ensemble methods such as decision trees, support vector machines, and naive Bayes. It covers the most popular non-parametric method for time-event data (the Kaplan-Meier estimator). It also covers ways of solving classification problems using artificial neural networks such as restricted Boltzmann machines, multi-layer perceptrons, and deep belief networks. The book discusses unsupervised learning clustering techniques such as the K-means method, agglomerative and Dbscan approaches, and dimension reduction techniques such as Feature Importance, Principal Component Analysis, and Linear Discriminant Analysis. And it introduces driverless artificial intelligence using H2O.

After reading this book, you will be able to develop, test, validate, and optimize statistical machine learning and deep learning models, and engineer, visualize, and interpret sets of data.



What You Will Learn
  • Design, develop, train, and validate machine learning and deep learning models
  • Find optimal hyper parameters for superior model performance
  • Improve model performance using techniques such as dimension reduction and regularization
  • Extract meaningful insights for decision making using data visualization



Who This Book Is For

Beginning and intermediate level data scientists and machine learning engineers


Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.

Read now

Unlock full access

More than 5,000 organizations count on O’Reilly

AirBnbBlueOriginElectronic ArtsHomeDepotNasdaqRakutenTata Consultancy Services

QuotationMarkO’Reilly covers everything we've got, with content to help us build a world-class technology community, upgrade the capabilities and competencies of our teams, and improve overall team performance as well as their engagement.
Julian F.
Head of Cybersecurity
QuotationMarkI wanted to learn C and C++, but it didn't click for me until I picked up an O'Reilly book. When I went on the O’Reilly platform, I was astonished to find all the books there, plus live events and sandboxes so you could play around with the technology.
Addison B.
Field Engineer
QuotationMarkI’ve been on the O’Reilly platform for more than eight years. I use a couple of learning platforms, but I'm on O'Reilly more than anybody else. When you're there, you start learning. I'm never disappointed.
Amir M.
Data Platform Tech Lead
QuotationMarkI'm always learning. So when I got on to O'Reilly, I was like a kid in a candy store. There are playlists. There are answers. There's on-demand training. It's worth its weight in gold, in terms of what it allows me to do.
Mark W.
Embedded Software Engineer

You might also like

Reproducible Data Science with Pachyderm

Reproducible Data Science with Pachyderm

Svetlana Karslioglu
Python: End-to-end Data Analysis

Python: End-to-end Data Analysis

Phuong Vothihong, Martin Czygan, Ivan Idris, Magnus Vilhelm Persson, Luiz Felipe Martins

Publisher Resources

ISBN: 9781484268704Purchase LinkPublisher Website