Merging and Joining Data
Many of the examples we see, both in this book and in other places where we learn to program in R or other languages, use complete datasets. The datasets we're using for much of this book are built in and don't need to be merged with any other data. This is very rarely the case when you're actually doing data analysis. A crucial skill in data science is the ability to merge and join data from disparate sources by a common key. Base R merges datasets with the merge function. Its arguments let you specify the type of merge, which will be familiar if you've ever used SQL to join tables. The dplyr package also implements joins as a family of dedicated functions. Say we have two datasets, ...
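The excerpt's own two datasets are not shown here, so the following is only a minimal sketch of the idea using two made-up data frames. The names employees and salaries, their columns, and their values are assumptions for illustration and do not come from the book. It shows how the all, all.x, and all.y arguments of merge select the join type, along with what appear to be the closest dplyr equivalents, which run only if dplyr is installed.

```r
# Hypothetical example data (not from the original text).
employees <- data.frame(
  id = c(1, 2, 3),
  name = c("Ana", "Ben", "Cy"),
  stringsAsFactors = FALSE
)
salaries <- data.frame(
  id = c(2, 3, 4),
  salary = c(50000, 60000, 70000)
)

# Inner join (the default): keep only ids found in both data frames.
inner <- merge(employees, salaries, by = "id")

# Left join: keep every row of employees; unmatched salaries become NA.
left <- merge(employees, salaries, by = "id", all.x = TRUE)

# Full outer join: keep every row from both data frames.
full <- merge(employees, salaries, by = "id", all = TRUE)

# The same joins with dplyr, if the package is available.
if (requireNamespace("dplyr", quietly = TRUE)) {
  inner_d <- dplyr::inner_join(employees, salaries, by = "id")
  left_d  <- dplyr::left_join(employees, salaries, by = "id")
  full_d  <- dplyr::full_join(employees, salaries, by = "id")
}
```

With this sample data, only ids 2 and 3 appear in both data frames, so the inner join keeps two rows, the left join keeps all three employees, and the full join keeps all four ids. Setting all.y = TRUE instead of all.x = TRUE would give the corresponding right join.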