Book description
Master the art of building analytical models using R
About This Book
 Load, wrangle, and analyze your data using the world's most powerful statistical programming language
 Build and customize publication-quality visualizations with powerful and stunning R graphs
 Develop key skills and techniques with R to create and customize data mining algorithms
 Use R to optimize your trading strategy and build up your own risk management system
 Discover how to build machine learning algorithms, prepare data, and dig deep into data prediction techniques with R
Who This Book Is For
This course is for data scientists and quantitative analysts who want to learn R and take advantage of its powerful analytical design framework. It's a seamless journey to becoming a full-stack R developer.
What You Will Learn
 Describe and visualize the behavior of data and relationships between data
 Gain a thorough understanding of statistical reasoning and sampling
 Handle missing data gracefully using multiple imputation
 Create diverse types of bar charts using the default R functions
 Familiarize yourself with algorithms written in R for spatial data mining, text mining, and so on
 Understand relationships between market factors and their impact on your portfolio
 Harness the power of R to build machine learning algorithms with real-world data science applications
 Learn specialized machine learning techniques for text mining, big data, and more
In Detail
The R learning path created for you has five connected modules, each of which is a mini-course in its own right. As you complete each one, you'll have gained key skills and be ready for the material in the next module.
This course begins by looking at the Data Analysis with R module. This will help you navigate the R environment. You'll gain a thorough understanding of statistical reasoning and sampling. Finally, you'll be able to put best practices into effect to make your job easier and facilitate reproducibility.
The second place to explore is R Graphs, which will help you leverage powerful default R graphics and utilize advanced graphics systems such as lattice and ggplot2, the grammar of graphics. You'll learn how to produce, customize, and publish advanced visualizations using this popular and powerful framework.
With the third module, Learning Data Mining with R, you will learn how to manipulate data with R using code snippets and be introduced to mining frequent patterns, associations, and correlations while working with R programs.
The Mastering R for Quantitative Finance module pragmatically introduces both the quantitative finance concepts and their modeling in R, enabling you to build a tailor-made trading system on your own. By the end of the module, you will be well versed in various financial techniques using R and will be able to place good bets while making financial decisions.
Finally, we'll look at the Machine Learning with R module. With this module, you'll discover all the analytical tools you need to gain insights from complex data and learn how to choose the correct algorithm for your specific needs. You'll also learn to apply machine learning methods to deal with common tasks, including classification, prediction, forecasting, and so on.
Style and approach
Learn data analysis, data visualization, data mining, and machine learning, all using R, and learn to build quantitative finance models in the same language.
Table of contents

R: Data Analysis and Visualization

I. Module 1: Data Analysis with R
 1. RefresheR
 2. The Shape of Data
 3. Describing Relationships
 4. Probability
 5. Using Data to Reason About the World
 6. Testing Hypotheses
 7. Bayesian Methods
 8. Predicting Continuous Variables
 9. Predicting Categorical Variables
 10. Sources of Data
 11. Dealing with Messy Data
 12. Dealing with Large Data
 13. Reproducibility and Best Practices

II. Module 2: R Graphs
 1. R Graphics

2. Basic Graph Functions
 Introduction
 Creating basic scatter plots
 Creating line graphs
 Creating bar charts
 Creating histograms and density plots
 Creating box plots
 Adjusting x and y axes' limits
 Creating heat maps
 Creating pairs plots
 Creating multiple plot matrix layouts
 Adding and formatting legends
 Creating graphs with maps
 Saving and exporting graphs

3. Beyond the Basics – Adjusting Key Parameters
 Introduction
 Setting colors of points, lines, and bars
 Setting plot background colors
 Setting colors for text elements – axis annotations, labels, plot titles, and legends
 Choosing color combinations and palettes
 Setting fonts for annotations and titles
 Choosing plotting point symbol styles and sizes
 Choosing line styles and width
 Choosing box styles
 Adjusting axis annotations and tick marks
 Formatting log axes
 Setting graph margins and dimensions

4. Creating Scatter Plots
 Introduction
 Grouping data points within a scatter plot
 Highlighting grouped data points by size and symbol type
 Labeling data points
 Correlation matrix using pairs plots
 Adding error bars
 Using jitter to distinguish closely packed data points
 Adding linear model lines
 Adding nonlinear model curves
 Adding nonparametric model curves with lowess
 Creating three-dimensional scatter plots
 Creating Quantile-Quantile plots
 Displaying the data density on axes
 Creating scatter plots with a smoothed density representation

5. Creating Line Graphs and Time Series Charts
 Introduction
 Adding customized legends for multiple-line graphs
 Using margin labels instead of legends for multiple-line graphs
 Adding horizontal and vertical grid lines
 Adding marker lines at specific x and y values using abline
 Creating sparklines
 Plotting functions of a variable in a dataset
 Formatting time series data for plotting
 Plotting the date or time variable on the x axis
 Annotating axis labels in different human-readable time formats
 Adding vertical markers to indicate specific time events
 Plotting data with varying time-averaging periods
 Creating stock charts

6. Creating Bar, Dot, and Pie Charts
 Introduction
 Creating bar charts with more than one factor variable
 Creating stacked bar charts
 Adjusting the orientation of bars – horizontal and vertical
 Adjusting bar widths, spacing, colors, and borders
 Displaying values on top of or next to the bars
 Placing labels inside bars
 Creating bar charts with vertical error bars
 Modifying dot charts by grouping variables
 Making better, readable pie charts with clockwise-ordered slices
 Labeling a pie chart with percentage values for each slice
 Adding a legend to a pie chart

7. Creating Histograms
 Introduction
 Visualizing distributions as count frequencies or probability densities
 Setting the bin size and the number of breaks
 Adjusting histogram styles – bar colors, borders, and axes
 Overlaying a density line over a histogram
 Multiple histograms along the diagonal of a pairs plot
 Histograms in the margins of line and scatter plots

8. Box and Whisker Plots
 Introduction
 Creating box plots with narrow boxes for a small number of variables
 Grouping over a variable
 Varying box widths by the number of observations
 Creating box plots with notches
 Including or excluding outliers
 Creating horizontal box plots
 Changing the box styling
 Adjusting the extent of plot whiskers outside the box
 Showing the number of observations
 Splitting a variable at arbitrary values into subsets
 9. Creating Heat Maps and Contour Plots
 10. Creating Maps

11. Data Visualization Using Lattice
 Introduction
 Creating bar charts
 Creating stacked bar charts
 Creating bar charts to visualize cross-tabulation
 Creating a conditional histogram
 Visualizing distributions through a kernel-density plot
 Creating a normal Q-Q plot
 Visualizing an empirical Cumulative Distribution Function
 Creating a boxplot
 Creating a conditional scatter plot
 12. Data Visualization Using ggplot2
 13. Inspecting Large Datasets
 14. Three-dimensional Visualizations

15. Finalizing Graphs for Publications and Presentations
 Introduction
 Exporting graphs in high-resolution image formats – PNG, JPEG, BMP, and TIFF
 Exporting graphs in vector formats – SVG, PDF, and PS
 Adding mathematical and scientific notations (typesetting)
 Adding text descriptions to graphs
 Using graph templates
 Choosing font families and styles under Windows, Mac OS X, and Linux
 Choosing fonts for PostScripts and PDFs

III. Module 3: Learning Data Mining with R
 1. Warming Up

2. Mining Frequent Patterns, Associations, and Correlations
 An overview of associations and patterns
 Market basket analysis
 Hybrid association rules mining
 Mining sequence dataset
 The R implementation
 High-performance algorithms

3. Classification
 Classification
 Generic decision tree induction
 High-value credit card customers classification using ID3
 Web spam detection using C4.5
 Web key resource page judgment using CART
 Trojan traffic identification method and Bayes classification
 Identify spam email and Naïve Bayes classification
 Rule-based classification of player types in computer games and rule-based classification
 4. Advanced Classification
 5. Cluster Analysis

6. Advanced Cluster Analysis
 Customer categorization analysis of e-commerce and DBSCAN
 Clustering web pages and OPTICS
 Visitor analysis in the browser cache and DENCLUE
 Recommendation system and STING
 Web sentiment analysis and CLIQUE
 Opinion mining and WAVE clustering
 User search intent and the EM algorithm
 Customer purchase data analysis and clustering high-dimensional data
 SNS and clustering graph and network data

7. Outlier Detection
 Credit card fraud detection and statistical methods
 Activity monitoring – the detection of fraud involving mobile phones and proximity-based methods
 Intrusion detection and density-based methods
 Intrusion detection and clustering-based methods
 Monitoring the performance of the web server and classification-based methods
 Detecting novelty in text, topic detection, and mining contextual outliers
 Collective outliers on spatial data
 Outlier detection in high-dimensional data
 8. Mining Stream, Time-series, and Sequence Data
 9. Graph Mining and Network Analysis
 10. Mining Text and Web Data

IV. Module 4: Mastering R for Quantitative Finance
 1. Time Series Analysis
 2. Factor Models
 3. Forecasting Volume
 4. Big Data – Advanced Analytics
 5. FX Derivatives
 6. Interest Rate Derivatives and Models

7. Exotic Options
 A general pricing approach
 The role of dynamic hedging
 How R can help a lot
 A glance beyond vanillas
 Greeks – the link back to the vanilla world
 Pricing the Double-no-touch option
 Another way to price the Double-no-touch option
 The life of a Double-no-touch option – a simulation
 Exotic options embedded in structured products
 References
 8. Optimal Hedging
 9. Fundamental Analysis
 10. Technical Analysis, Neural Networks, and Log-optimal Portfolios
 11. Asset and Liability Management
 12. Capital Adequacy
 13. Systemic Risks

V. Module 5: Machine Learning with R
 1. Introducing Machine Learning

2. Managing and Understanding Data
 R data structures
 Managing data with R

Exploring and understanding data
 Exploring the structure of data

Exploring numeric variables
 Measuring the central tendency – mean and median
 Measuring spread – quartiles and the five-number summary
 Visualizing numeric variables – boxplots
 Visualizing numeric variables – histograms
 Understanding numeric data – uniform and normal distributions
 Measuring spread – variance and standard deviation
 Exploring categorical variables
 Exploring relationships between variables
 3. Lazy Learning – Classification Using Nearest Neighbors
 4. Probabilistic Learning – Classification Using Naive Bayes

5. Divide and Conquer – Classification Using Decision Trees and Rules
 Understanding decision trees
 Example – identifying risky bank loans using C5.0 decision trees
 Understanding classification rules
 Example – identifying poisonous mushrooms with rule learners

6. Forecasting Numeric Data – Regression Methods
 Understanding regression
 Example – predicting medical expenses using linear regression
 Understanding regression trees and model trees
 Example – estimating the quality of wines with regression trees and model trees
 7. Black Box Methods – Neural Networks and Support Vector Machines
 8. Finding Patterns – Market Basket Analysis Using Association Rules
 9. Finding Groups of Data – Clustering with k-means
 10. Evaluating Model Performance
 11. Improving Model Performance
 12. Specialized Machine Learning Topics

A. Reflect and Test Yourself Answers

Module 1: Data Analysis with R
 Chapter 1: RefresheR
 Chapter 2: The Shape of Data
 Chapter 3: Describing Relationships
 Chapter 4: Probability
 Chapter 5: Using Data to Reason About the World
 Chapter 6: Testing Hypotheses
 Chapter 7: Bayesian Methods
 Chapter 8: Predicting Continuous Variables
 Chapter 9: Predicting Categorical Variables
 Chapter 10: Sources of Data
 Chapter 11: Dealing with Messy Data
 Chapter 12: Dealing with Large Data

Module 2: R Graphs
 Chapter 1: R Graphics
 Chapter 2: Basic Graph Functions
 Chapter 3: Beyond the Basics – Adjusting Key Parameters
 Chapter 4: Creating Scatter Plots
 Chapter 5: Creating Line Graphs and Time Series Charts
 Chapter 6: Creating Bar, Dot, and Pie Charts
 Chapter 7: Creating Histograms
 Chapter 8: Box and Whisker Plots
 Chapter 9: Creating Heat Maps and Contour Plots
 Module 4: Mastering R for Quantitative Finance

Module 5: Machine Learning with R
 Chapter 1: Introducing Machine Learning
 Chapter 2: Managing and Understanding Data
 Chapter 3: Lazy Learning – Classification Using Nearest Neighbors
 Chapter 4: Probabilistic Learning – Classification Using Naive Bayes
 Chapter 5: Divide and Conquer – Classification Using Decision Trees and Rules
 Chapter 6: Forecasting Numeric Data – Regression Methods
 Chapter 7: Black Box Methods – Neural Networks and Support Vector Machines
 Chapter 8: Finding Patterns – Market Basket Analysis Using Association Rules

 B. Bibliography
 Index
Product information
 Title: R: Data Analysis and Visualization
 Author(s):
 Release date: June 2016
 Publisher(s): Packt Publishing
 ISBN: 9781786463500