Data Science Fundamentals with R, Python, and Open Data

by Marco Cremonini
April 2024
Content level: Beginner to intermediate
480 pages
12h 22m
English
Wiley

10 Join Data Frames

The join operation between data frames is among the most important operations on data, not only because it is technically powerful but because it is one of the pillars of the creativity and exploration intrinsic to data science. Treating data science as just a set of technicalities and logical or statistical skills would badly misrepresent the nature of the discipline, which is to discover knowledge buried deep in data. And the act of discovering knowledge is not just a mechanistic or stochastic process; it is a creative process that requires curiosity and imagination, the desire to know more and better about unfamiliar phenomena, and the ability to observe the nuances of reality, which can seldom be described with an easy categorization. The join operation is so fundamental because it lets us logically combine different data frames through shared characteristics: an observation in one data frame can be put together with an observation in another because both are parts of a more complete observation. It is like watching a scene from two different perspectives: the two views differ because they describe what happens from different angles, but they nevertheless describe the same scene, so they can be joined to form a more comprehensive description. This is the invaluable role of join operations.
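To make the idea of combining observations through a shared characteristic concrete, here is a minimal sketch in Python with pandas. It is not taken from the book: the two data frames, the column names (`station`, `temp_c`, `rain_mm`), and all values are invented for illustration, and an inner join is only one of several join types.

```python
import pandas as pd

# Hypothetical data: two "perspectives" on the same weather stations.
# All names and values are invented for this illustration.
temperatures = pd.DataFrame({
    "station": ["S1", "S2", "S3"],
    "temp_c": [15.0, 18.5, 21.0],
})
rainfall = pd.DataFrame({
    "station": ["S3", "S2", "S4"],  # different order, partly different stations
    "rain_mm": [0.0, 4.0, 12.5],
})

# Inner join on the shared column: rows are matched by the value of
# "station", not by their position, and only stations present in both
# data frames are kept.
joined = temperatures.merge(rainfall, on="station", how="inner")

print(joined)
```

In this sketch only stations S2 and S3 appear in the result, because they are the only ones described by both data frames; each resulting row carries the temperature from one frame and the rainfall from the other.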

Several other operations let you combine data frames. They are usually said to concatenate or bind data frames, either by columns ...


Publisher Resources

ISBN: 9781394213245