Book Description
Create data mining algorithms
About This Book
 Develop a strong strategy to solve predictive modeling problems using the most popular data mining algorithms
 Real-world case studies take you from novice to intermediate in applying data mining techniques
 Deploy cutting-edge sentiment analysis techniques on real-world social media data using R
This Learning Path is for R developers who want to build a career in data analysis or data mining. If you work on data mining problems of varying complexity in web, text, numerical, political, or social media domains, you will find the information you need in this single Learning Path.
What You Will Learn
 Discover how to manipulate data in R
 Get to know top classification algorithms written in R
 Explore solutions written in R based on R Hadoop projects
 Apply data management skills in handling large data sets
 Acquire knowledge about neural network concepts and their applications in data mining
 Create predictive models for classification, prediction, and recommendation
 Use various libraries on R CRAN for data mining
 Discover more about the potential of data, its pitfalls, and inferential gotchas
 Gain an insight into the concepts of supervised and unsupervised learning
 Delve into exploratory data analysis
 Understand the minute details of sentiment analysis
Data mining is the first step to understanding data and making sense of heaps of data. Properly mined data forms the basis of all data analysis and computing performed on it. This Learning Path takes you from the very basics of data mining to advanced data mining techniques, and ends with a specialized branch of data mining: social media mining.
You will learn how to manipulate data with R using code snippets and how to mine frequent patterns, associations, and correlations while working with R programs. You will discover how to write code for various prediction models, stream data, and time-series data. You will also be introduced to solutions written in R based on R Hadoop projects.
Once you are comfortable with data mining in R, you will move on to applying your knowledge through end-to-end data mining projects. You will learn how to apply different mining concepts to various statistical and data applications in a wide range of fields. At this stage, you will be able to complete complex data mining cases and handle the issues you encounter during projects.
After this, you will gain hands-on experience of generating insights from social media data. You will get detailed instructions on how to obtain, process, and analyze a variety of socially generated data, along with the theoretical background needed to interpret your findings accurately. The R code and sample data provided can serve as a springboard for your own analyses of business, social, or political data.
This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products:
 Learning Data Mining with R by Bater Makhabel
 R Data Mining Blueprints by Pradeepta Mishra
 Social Media Mining with R by Nathan Danneman and Richard Heimann
A complete package that takes you from the basics of data mining to advanced data mining techniques, and ends with a specialized branch of data mining: social media mining.
Table of Contents

R: Mining Spatial, Text, Web, and Social Media Data
 Credits
 Preface

1. Module 1
 1. Warming Up

2. Mining Frequent Patterns, Associations, and Correlations
 An overview of associations and patterns
 Market basket analysis
 Hybrid association rules mining
 Mining sequence dataset
 The R implementation
 High-performance algorithms
 Time for action
 Summary

3. Classification
 Classification
 Generic decision tree induction
 High-value credit card customers classification using ID3
 Web spam detection using C4.5
 Web key resource page judgment using CART
 Trojan traffic identification method and Bayes classification
 Identify spam email and Naïve Bayes classification
 Rule-based classification of player types in computer games and rule-based classification
 Time for action
 Summary
 4. Advanced Classification
 5. Cluster Analysis

6. Advanced Cluster Analysis
 Customer categorization analysis of e-commerce and DBSCAN
 Clustering web pages and OPTICS
 Visitor analysis in the browser cache and DENCLUE
 Recommendation system and STING
 Web sentiment analysis and CLIQUE
 Opinion mining and WAVE clustering
 User search intent and the EM algorithm
 Customer purchase data analysis and clustering high-dimensional data
 SNS and clustering graph and network data
 Time for action
 Summary

7. Outlier Detection
 Credit card fraud detection and statistical methods
 Activity monitoring – the detection of fraud involving mobile phones and proximity-based methods
 Intrusion detection and density-based methods
 Intrusion detection and clustering-based methods
 Monitoring the performance of the web server and classification-based methods
 Detecting novelty in text, topic detection, and mining contextual outliers
 Collective outliers on spatial data
 Outlier detection in high-dimensional data
 Time for action
 Summary
 8. Mining Stream, Time-series, and Sequence Data
 9. Graph Mining and Network Analysis
 10. Mining Text and Web Data
 A. Algorithms and Data Structures

2. Module 2

1. Data Manipulation Using Inbuilt R Data
 What is data mining?
 Introduction to the R programming language
 Data type conversion
 Sorting and merging dataframes
 Indexing or subsetting dataframes
 Date and time formatting
 Creating new functions
 Loop concepts – the for loop
 Loop concepts – the repeat loop
 Loop concepts – while conditions
 Apply concepts
 String manipulation
 NA and missing value management
 Missing value imputation techniques
 Summary

2. Exploratory Data Analysis with Automobile Data
 Univariate data analysis
 Bivariate analysis
 Multivariate analysis
 Understanding distributions and transformation
 Interpreting distributions
 Variable binning or discretizing continuous data
 Contingency tables, bivariate statistics, and checking for data normality
 Hypothesis testing
 Nonparametric methods
 Summary
 3. Visualize Diamond Dataset
 4. Regression with Automobile Data
 5. Market Basket Analysis with Groceries Data
 6. Clustering with E-commerce Data
 7. Building a Retail Recommendation Engine
 8. Dimensionality Reduction
 9. Applying Neural Network to Healthcare Data


3. Module 3
 1. Going Viral
 2. Getting Started with R
 3. Mining Twitter with R
 4. Potentials and Pitfalls of Social Media Data

5. Social Media Mining – Fundamentals
 Key concepts of social media mining
 Good data versus bad data
 Understanding sentiments
 Sentiment polarity – data and classification
 Supervised social media mining – lexicon-based sentiment
 Supervised social media mining – Naive Bayes classifiers
 Unsupervised social media mining – Item Response Theory for text scaling
 Summary
 6. Social Media Mining – Case Studies
 A. Conclusions and Next Steps
 Bibliography
 Index
Product Information
 Title: R: Mining Spatial, Text, Web, and Social Media Data
 Author(s): Bater Makhabel, Pradeepta Mishra, Nathan Danneman, Richard Heimann
 Release date: June 2017
 Publisher(s): Packt Publishing
 ISBN: 9781788293747