Your one-stop guide to building an efficient data science pipeline using Jupyter
About This Book
- Get the most out of your Jupyter notebook to complete the trickiest of tasks in data science
- Learn all the tasks in the data science pipeline - from data acquisition to visualization - and implement them using Jupyter
- Get ahead of the curve by mastering all the applications of Jupyter for data science with this unique and intuitive guide
Who This Book Is For
This book targets students and professionals who wish to master the use of Jupyter to perform a variety of data science tasks. Some programming experience with R or Python and a basic understanding of Jupyter are all you need to get started with this book.
What You Will Learn
- Understand why Jupyter notebooks are a perfect fit for your data science tasks
- Perform scientific computing and data analysis tasks with Jupyter
- Interpret and explore different kinds of data visually with charts, histograms, and more
- Extend SQL's capabilities with Jupyter notebooks
- Combine the power of R and Python 3 with Jupyter to create dynamic notebooks
- Create interactive dashboards and dynamic presentations
- Master the best coding practices and deploy your Jupyter notebooks efficiently
Jupyter Notebook is a web-based environment that enables interactive computing in notebook documents. It allows you to create documents that contain live code, equations, and visualizations. This book is a comprehensive guide to getting started with data science using the popular Jupyter notebook.
If you are familiar with Jupyter Notebook and want to learn how to use its capabilities to perform various data science tasks, this is the book for you! From data exploration to visualization, this book will take you through every step of implementing an effective data science pipeline using Jupyter. You will also see how you can use Jupyter's features to share your documents and code with your colleagues. The book also explains how Python 3, R, and Julia can be integrated with Jupyter for various data science tasks.
By the end of this book, you will be comfortable using Jupyter to carry out a wide range of data science tasks.
Style and Approach
This book is a perfect blend of concepts and practical examples, written in a way that is very easy to understand and implement. It follows a logical flow where you will be able to build on your understanding of the different Jupyter features with every chapter.
Table of Contents
Jupyter and Data Science
- Jupyter concepts
- A first look at the Jupyter user interface
- Detailing the Jupyter tabs
- What actions can I perform with Jupyter?
- What objects can Jupyter manipulate?
- Viewing the Jupyter project display
- How does it look when we execute scripts?
- Industry data science usage
- Real life examples
- Using Docker with Jupyter
- How to share notebooks with others
- How can you secure a notebook?
Working with Analytical Data on Jupyter
- Data scraping with a Python notebook
- Using heavy-duty data processing functions in Jupyter
- Using SciPy in Jupyter
- Expanding on pandas data frames in Jupyter
- Data Visualization and Prediction
- Data Mining and SQL Queries
- R with Jupyter
- Reading a CSV file
- Reading another CSV file
- Manipulating data with dplyr
- Sampling a dataset
- Tidying up data with tidyr
- Jupyter Dashboards
- Converting JSON to CSV
- Evaluating Yelp reviews
- Using Python to compare ratings
- Visualizing average ratings by cuisine
- Arbitrary search of ratings
- Determining relationships between number of ratings and ratings
- Machine Learning Using Jupyter
Optimizing Jupyter Notebooks
- Deploying notebooks
- Optimizing your script
- Optimizing your Python scripts
- Optimizing your R scripts
- Monitoring Jupyter
- Caching your notebook
- Securing a notebook
- Scaling Jupyter Notebooks
- Sharing Jupyter Notebooks
- Converting a notebook
- Versioning a notebook
- Title: Jupyter for Data Science
- Release date: October 2017
- Publisher(s): Packt Publishing
- ISBN: 9781785880070