Demographics and data science

Social networks exist for and by their user base. StackExchange relies on a wide user base with a diverse set of skills. In this use case, let us try to understand the demographic dynamics of https://datascience.stackexchange.com/.

We begin by loading the user-related data from the dumps. As discussed earlier, this information is available in the Users.XML file. We use the same loadXMLToDataFrame utility function to get the required DataFrame, and then pull some quick details from it, such as the number of users, average age, average reputation, and so on. The following snippet gets us started:

# Total Users
> dim(UsersDF)
[1] 19237    14
# Average Reputation Score
> max(as.numeric(UsersDF[!is.na(UsersDF$Reputation),'Reputation']))
...
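The quick summaries described above can be sketched on a small stand-in data frame. This is a minimal illustration in base R, not the book's code: `users_toy` and its values are invented for the example, and it assumes that attributes read from the XML dump arrive as character strings, which is why they are converted with `as.numeric` before aggregating. Because the excerpt places a `max()` call under a comment about the average reputation, the sketch computes both the mean and the maximum.

```r
# Hypothetical stand-in for UsersDF; the values are made up for illustration.
# Assumption: XML attributes are loaded as character strings.
users_toy <- data.frame(
  Id         = c("1", "2", "3", "4"),
  Reputation = c("101", "1", "2500", NA),
  Age        = c("34", NA, "27", "41"),
  stringsAsFactors = FALSE
)

# Number of users is the number of rows
n_users <- nrow(users_toy)

# Convert to numeric before aggregating
reputation <- as.numeric(users_toy$Reputation)
age        <- as.numeric(users_toy$Age)

# na.rm = TRUE drops missing values instead of returning NA
avg_reputation <- mean(reputation, na.rm = TRUE)
max_reputation <- max(reputation, na.rm = TRUE)
avg_age        <- mean(age, na.rm = TRUE)

print(c(users = n_users,
        avg_reputation = avg_reputation,
        max_reputation = max_reputation,
        avg_age = avg_age))
```

Using `na.rm = TRUE` has the same effect as filtering with `!is.na(...)` before aggregating, as the excerpt does, and keeps each summary on one line.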

