Mastering Social Media Mining with R

Book Description

Extract valuable data from your social media sites and make better business decisions using R

About This Book

  • Explore the social media APIs in R to capture data and tame it
  • Employ the machine learning capabilities of R to gain optimal business value
  • A hands-on guide with real-world examples to help you take advantage of the vast opportunities that come with social media data

Who This Book Is For

If you have basic knowledge of R in terms of its libraries and are aware of different machine learning techniques, this book is for you. Those with experience in data analysis who are interested in mining social media data will find this book useful.

What You Will Learn

  • Access APIs of popular social media sites and extract data
  • Perform sentiment analysis and identify trending topics
  • Measure CTR performance for social media campaigns
  • Implement exploratory data analysis and correlation analysis
  • Build a logistic regression model to detect spam messages
  • Construct clusters of pictures using the K-means algorithm and identify popular personalities and destinations
  • Develop recommendation systems using Collaborative Filtering and the Apriori algorithm

In Detail

As the number of web users has grown, the volume of content they generate has increased substantially, creating a need to gain insights from the untapped gold mine that is social media data. For computational statistics, R has an advantage over other languages in providing readily available data extraction and transformation packages, making it easier to carry out your ETL tasks. Its data visualization packages help users better understand the underlying data distributions, while its range of "standard" statistical packages simplifies analysis of the data.

This book will teach you how powerful business cases are solved by applying machine learning techniques to social media data. You will learn about important and recent developments in the field of social media, along with a few advanced topics such as Open Authorization (OAuth). Through practical examples, you will access data from R using the APIs of various social media sites such as Twitter, Facebook, Instagram, GitHub, Foursquare, LinkedIn, Blogger, and other networks. The book provides detailed explanations of how to implement various use cases in R.

With this handy guide, you will be ready to embark on your journey as an independent social media analyst.

Style and approach

This easy-to-follow guide is packed with hands-on, step-by-step examples that will enable you to convert your real-world social media data into useful, practical information.

Downloading the example code for this book: You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to receive the code files.

Table of Contents

  1. Mastering Social Media Mining with R
    1. Table of Contents
    2. Mastering Social Media Mining with R
    3. Credits
    4. About the Authors
    5. About the Reviewers
    6. www.PacktPub.com
      1. Support files, eBooks, discount offers, and more
        1. Why subscribe?
        2. Free access for Packt account holders
    7. Preface
      1. What this book covers
      2. What you need for this book
      3. Who this book is for
      4. Conventions
      5. Reader feedback
      6. Customer support
        1. Downloading the example code
        2. Errata
        3. Piracy
        4. Questions
    8. 1. Fundamentals of Mining
      1. Social media and its importance
      2. Various social media platforms
      3. Social media mining
      4. Challenges for social media mining
      5. Social media mining techniques
        1. Graph mining
        2. Text mining
      6. The generic process of social media mining
        1. Getting authentication from the social website – OAuth 2.0
          1. Differences between OAuth and OAuth 2.0
        2. Data visualization R packages
          1. The simple word cloud
          2. Sentiment analysis Wordcloud
      7. Preprocessing and cleaning in R
      8. Data modeling – the application of mining algorithms
        1. Opinion mining (sentiment analysis)
        2. Steps for sentiment analysis
          1. Community detection via clustering
      9. Result visualization
      10. An example of social media mining
      11. Summary
    9. 2. Mining Opinions, Exploring Trends, and More with Twitter
      1. Twitter and its importance
      2. Understanding Twitter's APIs
        1. Twitter vocabulary
      3. Creating a Twitter API connection
        1. Creating a new app
        2. Finding trending topics
        3. Searching tweets
      4. Twitter sentiment analysis
        1. Collecting tweets as a corpus
        2. Cleaning the corpus
        3. Estimating sentiment (A)
        4. Estimating sentiment (B)
      5. Summary
    10. 3. Find Friends on Facebook
      1. Creating an app on the Facebook platform
      2. Rfacebook package installation and authentication
        1. Installation
        2. A closer look at how the package works
      3. A basic analysis of your network
      4. Network analysis and visualization
        1. Social network analysis
        2. Degree
        3. Betweenness
        4. Closeness
        5. Cluster
        6. Communities
      5. Getting Facebook page data
      6. Trending topics
        1. Trend analysis
      7. Influencers
        1. Based on a single post
        2. Based on multiple posts
      8. Measuring CTR performance for a page
      9. Spam detection
        1. Implementing a spam detection algorithm
      10. The order of stories on a user's home page
      11. Recommendations to friends
        1. Reading the output
      12. Other business cases
      13. Summary
    11. 4. Finding Popular Photos on Instagram
      1. Creating an app on the Instagram platform
      2. Installation and authentication of the instaR package
      3. Accessing data from R
        1. Searching public media for a specific hashtag
        2. Searching public media from a specific location
        3. Extracting public media of a user
        4. Extracting user profile
        5. Getting followers
        6. Who does the user follow?
        7. Getting comments
        8. Number of times hashtag is used
      4. Building a dataset
        1. User profile
        2. User media
        3. Travel-related media
        4. Who do they follow?
      5. Popular personalities
        1. Who has the most followers?
        2. Who follows more people?
        3. Who shared most media?
        4. Overall top users
        5. Most viral media
      6. Finding the most popular destination
        1. Locations
        2. Locations with most likes
        3. Locations most talked about
        4. What are people saying about these locations?
        5. Most repeating locations
      7. Clustering the pictures
      8. Recommendations to the users
        1. How to do it
        2. Top three recommendations
        3. Improvements to the recommendation system
      9. Business case
      10. Reference
      11. Summary
    12. 5. Let's Build Software with GitHub
      1. Creating an app on GitHub
      2. GitHub package installation and authentication
      3. Accessing GitHub data from R
      4. Building a heterogeneous dataset using the most active users
        1. Data processing
      5. Building additional metrics
      6. Exploratory data analysis
      7. EDA – graphical analysis
        1. Which language is most popular among the active GitHub users?
        2. What is the distribution of watchers, forks, and issues in GitHub?
        3. How many repositories had issues?
        4. What is the trend on updating repositories?
        5. Compare users through heat map
      8. EDA – correlation analysis
        1. How Watchers is related to Forks
        2. Correlation with regression line
        3. Correlation with local regression curve
        4. Correlation on segmented data
        5. Correlation between the languages that users use to code
        6. How to get the trend of correlation?
        7. Reference
      9. Business cases
      10. Summary
    13. 6. More Social Media Websites
      1. Searching on social media
      2. Accessing product reviews from sites
      3. Retrieving data from Wikipedia
      4. Using the Tumblr API
      5. Accessing data from Quora
      6. Mapping solutions using Google Maps
      7. Professional network data from LinkedIn
      8. Getting Blogger data
      9. Retrieving venue data from Foursquare
        1. Use cases
      10. Yelp and other networks
        1. Limitations
      11. Summary
    14. Index