Advanced R Programming
on-demand course

with Jared Lander
December 2015
Intermediate to advanced
3h 18m
English
Pearson

Overview

Alternative Backends for R LiveLessons teaches R programmers techniques for dealing with large data, both in memory and in databases.

 

Description

 

In this video training, Jared starts with some common data manipulation operations using various base R functions and packages like plyr, comparing the speed of in-memory calculations. He then demonstrates more advanced techniques for accomplishing the same tasks, such as data.table, dplyr, Rcpp, and parallel computation, for increased speed. Finally, for when data size is an even bigger factor than speed, he introduces external memory and database techniques using bigmemory, ff, SciDB, dplyr, and Hadoop.

 

About the Instructor

 

Jared P. Lander is the Founder and CEO of Lander Analytics, the Organizer of the New York Open Statistical Programming Meetup, and an Adjunct Professor of Statistics at Columbia University. With a master's from Columbia University in statistics and a bachelor's from Muhlenberg College in mathematics, he has experience in both academic research and industry. Jared oversees the long-term direction of the company and acts as Lead Data Scientist, researching the best strategy, models, and algorithms for modern data needs. This is in addition to his client-facing consulting and training. He specializes in data management, multilevel models, machine learning, generalized linear models, visualization, and statistical computing. He is the author of R for Everyone, a book about R programming geared toward data scientists and non-statisticians alike. The book is available from Amazon, Barnes & Noble, and InformIT. The material is drawn from the classes he teaches at Columbia and is incorporated into his corporate training. Very active in the data community, Jared is a frequent speaker at conferences, universities, and meetups around the world. He is a member of the 2014 Strata New York selection committee.

Skill Level

  • Intermediate
  • Advanced

 

What You Will Learn

  • Basic Aggregation
  • plyr
  • dplyr
  • data.table
  • Rcpp
  • Parallel Processing
  • Code Benchmarking

 

Who Should Take This Course

  • R programmers who already have an intermediate level of knowledge, such as that gained from reading R for Everyone.

 

Course Requirements

  • Basic Programming Skills
  • Proficiency in R, including working with packages

 

Table of Contents

 

Lesson 1: Reading XML Data

1.1.  Read HTML Table

1.2.  Use xpath for complex searches in HTML

1.3.  xmlToList for easier parsing

 

Lesson 2: Faster Group Operations

2.1.  Aggregate normally

2.2.  tapply

2.3.  ddply

2.4.  data.table

2.5.  dplyr

2.6.  ddply parallel

2.7.  foreach

2.8.  dplyr with a database

 

Lesson 3: Rcpp for Faster Code

3.1.  Basics of C++ with R

3.2.  Writing a C++ function for R

3.3.  Using C++ code in an R package

 

Lesson 4: Advanced Machine Learning

4.1.  Recommendation Engine with RecommenderLab

4.2.  Text Mining with RTextTools

 

Lesson 5: Network Analysis

5.1.  igraph

5.2.  Reading edgelists

5.3.  Base plots

5.4.  tkplots

5.5.  rglplots

5.6.  Network metrics like diameter, shortest path

5.7.  Node metrics like centrality and betweenness

 

Lesson 6: Advanced Graphics

6.1.  ggvis

6.2.  rCharts

 

About LiveLessons Video Training

 

The LiveLessons Video Training series publishes hundreds of hands-on, expert-led video tutorials covering a wide selection of technology topics designed to teach you the skills you need to succeed. This professional and personal technology video series features world-leading author instructors published by your trusted technology brands: Addison-Wesley, Cisco Press, IBM Press, Pearson IT Certification, Prentice Hall, Sams, and Que. Topics include IT Certification, Programming, Web Development, Mobile Development, Home and Office Technologies, Business and Management, and more. View all LiveLessons on InformIT at http://www.informit.com/livelessons.

Publisher Resources

ISBN: 9780134052700