Data Science with Python and R (Anaconda Video Series)

Video Description

9+ Hours of Video Training

Data Science with Python and R LiveLessons is tailored to beginner data scientists seeking to use Python or R for data science. This course covers the fundamentals of data preparation, data analysis, data visualization, machine learning, and interactive data science applications. Students will learn how to build predictive models and how to create interactive visual applications for their line of business using the Anaconda platform. The course also introduces data scientists to using Python and R to build on an ecosystem of hundreds of high-performance open source tools.

The companion Jupyter notebooks for these LiveLessons are available at https://anaconda.org/datasciencepythonr.

About the Instructors

Ian Stokes-Rees, a computational scientist at Continuum Analytics, has been with the company since the beginning and was the team lead for the Collaboration component of Anaconda Enterprise. One of Ian’s key focus areas is working with Continuum clients to leverage Open Data Science to meet their analytics needs.

Prior to joining Continuum Analytics, Ian was a lecturer and researcher in computational science at the Harvard Medical School and School of Engineering. He has also taught a range of introductory and advanced scientific computing courses at Harvard Medical School, the University of Oxford, and the European Molecular Biology Laboratory. Ian received his Ph.D. from the University of Oxford and has run and delivered hundreds of hours of industry training courses in business data modeling and web services architectures to thousands of engineers and scientists.

About Anaconda Powered by Continuum Analytics

Anaconda is the leading Open Data Science platform powered by Python, the fastest growing data science language with more than 11 million downloads to date. Continuum Analytics is the creator and driving force behind Anaconda, empowering leading businesses across industries worldwide with tools to identify patterns in data, uncover key insights and transform basic data into a goldmine of intelligence to solve the world’s most challenging problems. Anaconda puts superpowers into the hands of people who are changing the world. Learn more at continuum.io.

Skill Level

  • Beginner level for data scientists

What You Will Learn

  • Use Anaconda and Jupyter notebooks
  • Understand Open Data Science concepts, roles, and workflows
  • Wrangle data with Pandas
  • Understand Anaconda Enterprise and collaboration workflows
  • Create interactive visualizations with Bokeh
  • Use Conda package management
  • Use R for data processing and visualization
  • Build statistical and predictive models
  • Use Excel and Python with Anaconda Fusion
  • Understand and use Mosaic for databases with distributed data
  • Understand distributed and parallel computing with Dask

Course Requirements

  • Basic experience in Python programming
  • Anaconda (which includes Python) installed. Anaconda downloads are available for Apple OS X, Microsoft Windows, and most Linux distributions. Optionally, Anaconda Trial, which includes features in the paid subscriptions, is available for download.

About LiveLessons Video Training

The LiveLessons Video Training series publishes hundreds of hands-on, expert-led video tutorials covering a wide selection of technology topics designed to teach you the skills you need to succeed. This professional and personal technology video series features world-leading author instructors published by your trusted technology brands: Addison-Wesley, Cisco Press, IBM Press, Pearson IT Certification, Prentice Hall, Sams, and Que. Topics include: IT Certification, Programming, Web Development, Mobile Development, Home and Office Technologies, Business and Management, and more. View all LiveLessons on InformIT at http://www.informit.com/livelessons.

Table of Contents

  1. Introduction
    1. Data Science with Python and R: Introduction 00:04:33
  2. Lesson 1: Open Data Science for Everyone
    1. Learning objectives 00:00:37
    2. 1.1 Use Anaconda Repository for data science artifacts 00:03:30
    3. 1.2 Use Anaconda Navigator to open and run Jupyter Notebooks 00:01:58
    4. 1.3 Perform fundamental Jupyter operations 00:02:51
    5. 1.4 Ingest, analyze and clean data with Pandas 00:08:59
    6. 1.5 Visualize data with Bokeh 00:04:34
    7. 1.6 Create machine learning and predictive modeling with Scikit-Learn 00:11:49
  3. Lesson 2: Background Concepts for Open Data Science
    1. Learning objectives 00:00:29
    2. 2.1 Understand the concept of Open Data Science 00:02:40
    3. 2.2 Identify the different personas on an Open Data Science team 00:06:47
    4. 2.3 Understand Open Data Science workflows 00:09:14
  4. Lesson 3: Data Wrangling with Pandas
    1. Learning objectives 00:00:56
    2. 3.1 Load, view and plot Pandas DataFrames 00:12:30
    3. 3.2 Modify content and create new columns 00:12:49
    4. 3.3 Use boolean masks for data selection 00:11:50
    5. 3.4 Read data from disk 00:14:45
    6. 3.5 Group data 00:12:59
    7. 3.6 Connect to a database 00:21:44
    8. 3.7 Work with time series data 00:09:29
    9. 3.8 Read and write Excel files 00:14:49
    10. 3.9 Publish notebooks to Anaconda Cloud 00:04:21
  5. Lesson 4: Anaconda Platform Overview
    1. Learning objectives 00:01:06
    2. 4.1 Describe the Anaconda Distribution 00:05:05
    3. 4.2 Identify what Conda is used for 00:03:49
    4. 4.3 Relate Anaconda Enterprise components 00:12:04
    5. 4.4 Identify core technology components 00:05:20
    6. 4.5 Describe typical data science workflows 00:02:59
    6. 4.6 Create projects in Anaconda Enterprise with a team 00:12:23
  6. Lesson 5: Creating Interactive Visualizations with Bokeh
    1. Learning objectives 00:00:49
    2. 5.1 Describe Bokeh 00:06:18
    3. 5.2 Plot Pandas DataFrames with bokeh.charts 00:09:04
    4. 5.3 Manage plot construction with bokeh.plotting 00:15:01
    5. 5.4 Use widgets and plot linking for interactivity 00:19:06
    6. 5.5 Create web plots 00:03:37
    7. 5.6 Create data apps using Bokeh Server 00:07:40
  7. Lesson 6: Conda Package Management
    1. Learning objectives 00:01:18
    2. 6.1 Install packages from Navigator 00:12:01
    3. 6.2 Add channels from Navigator 00:05:34
    4. 6.3 Upgrade, downgrade and remove packages from Navigator 00:04:41
    5. 6.4 Create a new environment from Navigator 00:07:10
    6. 6.5 Select Conda environments and Jupyter kernels 00:10:35
    7. 6.6 Use Conda from the command line 00:16:48
    8. 6.7 Understand the difference between pip and conda 00:17:13
    9. 6.8 Keep pip and conda up to date 00:02:18
    10. 6.9 Export, save, and share Conda environments 00:13:03
    11. 6.10 Find packages on Anaconda Cloud and from Conda-Forge 00:09:48
  8. Lesson 7: Data Processing and Visualization in R
    1. Learning objectives 00:00:48
    2. 7.1 Configure an R analytics environment 00:06:22
    3. 7.2 Access and process data with dplyr and tidyr 00:15:11
    4. 7.3 Create visualizations with ggplot 00:28:31
    5. 7.4 Use linear models for predictive analytics 00:17:21
    6. 7.5 Create interactive visualizations with rBokeh and Shiny 00:12:50
    7. 7.6 Bridge between R and Python with rpy2 00:16:07
  9. Lesson 8: Build Statistical and Predictive Models
    1. Learning objectives 00:00:34
    2. 8.1 Use Scikit-Learn to create a predictive model 00:08:37
    3. 8.2 Generate predictions with a model 00:05:35
    4. 8.3 Score a model 00:10:38
    5. 8.4 Visualize model performance 00:03:31
  10. Lesson 9: Excel and Python with Anaconda Fusion
    1. Learning objectives 00:00:36
    2. 9.1 Understand which problems Fusion solves 00:02:35
    3. 9.2 Install and start Fusion 00:05:32
    4. 9.3 Connect spreadsheets to codesheets 00:06:20
  11. Lesson 10: Databases and Distributed Data with Mosaic
    1. Learning objectives 00:00:33
    2. 10.1 Understand which problems Mosaic solves 00:01:47
    3. 10.2 Install and start Mosaic 00:01:06
    4. 10.3 Use Mosaic to register datasets and create data views 00:06:51
  12. Lesson 11: Distributed and Parallel Computing with Dask
    1. Learning objectives 00:00:36
    2. 11.1 Describe Dask in relation to Pandas 00:07:16
    3. 11.2 Profile the creation of Dask dataframes 00:13:17
    4. 11.3 Analyze and plot Dask data 00:06:33
  13. Summary
    1. Data Science with Python and R: Summary 00:02:41