For column renaming and selection, we define a dictionary column_mapping where
each entry defines a mapping from the current column name to a new name. Columns
mapped to None and unmentioned columns are dropped. A dictionary is perfect
documentation for such a transformation and easy to reuse. This dictionary is
then used to select and rename the columns that we want to keep.
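To make the select-and-rename step concrete, here is a minimal sketch of how such a mapping might be applied with pandas. The toy DataFrame, the shortened example_mapping, and the names renames and result are assumptions made for illustration. They are not the actual dataset or necessarily the exact code used for it.

```python
import pandas as pd

# Hypothetical toy data; the column names mirror those discussed in the text.
df = pd.DataFrame({
    'id': ['a1', 'b2'],
    'selftext': ['first post', 'second post'],
    'category_1': ['autos', 'autos'],
    'in_data': [True, True],
    'extra': [0, 1],  # not mentioned in the mapping, so it is dropped
})

# Shortened illustrative mapping (an assumption, not the full mapping).
example_mapping = {
    'id': 'id',
    'selftext': 'text',
    'category_1': 'category',
    'in_data': None,  # explicitly dropped
}

# Keep only the entries that map to a new name (i.e., not None).
renames = {old: new for old, new in example_mapping.items() if new is not None}

# Select the columns to keep, then rename them.
result = df[list(renames)].rename(columns=renames)

print(list(result.columns))  # ['id', 'text', 'category']
```

Filtering out the None entries first means the same dictionary drives both the selection and the renaming, so the mapping remains the single place where the transformation is documented. The mapping for the actual dataset is defined as follows.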
column_mapping = {
    'id': 'id',
    'subreddit': 'subreddit',
    'title': 'title',
    'selftext': 'text',
    'category_1': 'category',
    'category_2': 'subcategory',
    'category_3': None, # no data
    'in_data': None, # not needed
    'reason_for_exclusion': None # not needed ...