December 2018
Beginner to intermediate
682 pages
18h 1m
English
After importing the data and identifying the three entities, we must create a unique identifier for each observation so that we can link the movies, actors, and directors together once they have been separated into different tables. In step 2, we simply set the ID column as the row number, beginning from zero. In step 3, we use the wide_to_long function to melt the actor and director columns simultaneously. The function uses the integer suffix of the column names to align the data vertically and places this suffix in the index. The j parameter sets the name of that index level. The values in the columns that are not in the stubnames list are repeated so that they align with the melted columns.
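To make steps 2 and 3 concrete, the following is a minimal sketch on a small, made-up DataFrame. The column names, the toy values, and the names id and num are assumptions chosen for illustration; they are not the recipe's actual dataset or code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the movie dataset (an assumption)
movie = pd.DataFrame({
    "title": ["Movie A", "Movie B"],
    "rating": ["PG", "R"],
    "actor_1": ["Ann", "Cid"],
    "actor_2": ["Bob", "Dee"],
    "director_1": ["Eve", "Fay"],
})

# Step 2 analogue: a unique ID equal to the row number, starting from zero
movie.insert(0, "id", np.arange(len(movie)))

# Step 3 analogue: melt the actor and director columns at the same time.
# The integer suffix after the separator becomes the index level named by j.
movie_long = pd.wide_to_long(
    movie,
    stubnames=["actor", "director"],
    i="id",
    j="num",
    sep="_",
).reset_index()

print(movie_long)
```

In this sketch, title and rating are not in stubnames, so their values repeat for every num within the same id. Because there is no director_2 column, the director value is missing wherever num is 2.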
In step 4, we create our three new tables, keeping the ...