How it works...
After importing the data and identifying the three entities, we must create a unique identifier for each observation so that we can link the movies, actors, and directors together once they have been separated into different tables. In step 2, we simply set the ID column as the row number, beginning from zero. In step 3, we use the wide_to_long function to melt the actor and director columns simultaneously. It uses the integer suffix of each column name to align the data vertically and places that suffix in a new level of the index; the j parameter sets the name of this level. The values in the columns that are not in the stubnames list repeat so that they align with the melted columns.
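Steps 2 and 3 can be sketched on a small, made-up table. This is an illustration rather than the recipe's own code: the DataFrame contents, the column names (title, rating, director_1, actor_1, actor_2), and the level name num are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical toy data standing in for the movie dataset
movie = pd.DataFrame({
    "title": ["Movie A", "Movie B"],
    "rating": ["PG", "R"],
    "director_1": ["Dir X", "Dir Y"],
    "actor_1": ["Act 1", "Act 3"],
    "actor_2": ["Act 2", "Act 4"],
})

# Step 2: use the row number, starting from zero, as a unique identifier
movie.insert(0, "id", range(len(movie)))

# Step 3: melt the director and actor columns at the same time.
# The integer suffix after the separator becomes the index level named by j.
movie_long = pd.wide_to_long(
    movie,
    stubnames=["director", "actor"],
    i="id",
    j="num",   # assumed name for the suffix level
    sep="_",
)

print(movie_long.sort_index())
```

The result is indexed by id and num. Because there is no director_2 column in this toy data, the director value is missing wherever num is 2, while title and rating repeat on every row for the same id.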
In step 4, we create our three new tables, keeping the ...