February 2019
Intermediate to advanced
672 pages
16h 50m
English
Joins are useful for combining data that is scattered across different tables. Let's say that we want our dataset to include the location of the hospital where each patient's measurements were taken. We can reference the hospital for each patient using the labels H1, H2, and H3, and store each hospital's name, address, and city in a hospitals table indexed by those labels:
# Hospital details, indexed by the hospital label
hospitals = pd.DataFrame(
    {"name": ["City 1", "City 2", "City 3"],
     "address": ["Address 1", "Address 2", "Address 3"],
     "city": ["City 1", "City 2", "City 3"]},
    index=["H1", "H2", "H3"])

# One hospital label per patient row in df
hospital_id = ["H1", "H2", "H2", "H3", "H3", "H3"]
df['hospital_id'] = hospital_id
Now, we want to find the city where each patient's measurement was taken. We need to map the keys from the ...
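As a minimal sketch of this kind of lookup, the snippet below maps each patient's hospital label to a city in two ways: with `Series.map`, and with `DataFrame.join` using the `on` argument. The patient table here is a hypothetical stand-in (the `measurement` column, its values, and the patient labels are assumptions made only so the example runs on its own); it is not the dataset used in the text.

```python
import pandas as pd

# Hypothetical stand-in for the patient table; the column name,
# values, and index labels are illustrative assumptions.
df = pd.DataFrame(
    {"measurement": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]},
    index=["a", "b", "c", "d", "e", "f"])
df["hospital_id"] = ["H1", "H2", "H2", "H3", "H3", "H3"]

# Hospital table from the text, indexed by hospital label
hospitals = pd.DataFrame(
    {"name": ["City 1", "City 2", "City 3"],
     "address": ["Address 1", "Address 2", "Address 3"],
     "city": ["City 1", "City 2", "City 3"]},
    index=["H1", "H2", "H3"])

# Option 1: look up each hospital_id in the index of hospitals["city"]
cities = df["hospital_id"].map(hospitals["city"])

# Option 2: join the hospital columns onto df, matching the
# hospital_id column of df against the index of hospitals
joined = df.join(hospitals, on="hospital_id")

print(cities)
print(joined[["hospital_id", "city"]])
```

The `map` version returns only the city column, while the `join` version carries every hospital column along, which may be preferable when more than one field is needed.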