R package maintainers
Another similarly straightforward data source might be the list of R package maintainers. We can download the names and e-mail addresses of the package maintainers from a public page of CRAN, where this data is stored in a nicely structured HTML table that is extremely easy to parse:
> library(XML)  # provides readHTMLTable
> packages <- readHTMLTable(paste0('http://cran.r-project.org',
+   '/web/checks/check_summary.html'), which = 2)
Extracting the names from the Maintainer column can be done via some quick data cleansing and transformations, mainly using regular expressions. Please note that the column name starts with a space, which is why we quoted the column name:
> maintainers <- sub('(.*) <(.*)>', '\\1', packages$' Maintainer')
> maintainers <- gsub(' ', ' ', ...
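To see what the regular expression does without downloading anything, here is a minimal sketch on a small, made-up data frame. The package names, maintainer strings, and the `emails` vector are assumptions for illustration only, and the final whitespace cleanup is one possible approach rather than necessarily the step used in the text:

```r
# Hypothetical stand-in for the scraped CRAN table (not real data).
packages <- data.frame(
  Package = c('pkgA', 'pkgB'),
  ' Maintainer' = c('Jane Doe <jane@example.org>',
                    'John Q.  Public <john@example.org>'),
  check.names = FALSE,        # keep the leading space in the column name
  stringsAsFactors = FALSE
)

# Group 1 captures the name, group 2 the e-mail address between < and >.
pattern <- '(.*) <(.*)>'
maintainers <- sub(pattern, '\\1', packages$' Maintainer')
emails      <- sub(pattern, '\\2', packages$' Maintainer')

# Assumed cleanup: collapse runs of whitespace and trim both ends.
maintainers <- trimws(gsub('[[:space:]]+', ' ', maintainers))

print(maintainers)
print(emails)
```

Setting `check.names = FALSE` is what lets the toy data frame keep a column name with a leading space, mirroring the quoted `packages$' Maintainer'` access above.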