Data Science Fundamentals with R, Python, and Open Data

by Marco Cremonini
April 2024
Content level: Beginner to intermediate
480 pages
12h 22m
English
Wiley

4 Subsetting with Logical Conditions

In this chapter, we introduce logical conditions and the main logical operators. These are the key elements for selection operations based on logical criteria, rather than on a simple list of items or on selection helpers. Most of all, we turn our attention from the columns to the rows of a data frame. We have to assume that we may need to extract a subset of rows from thousands (still a small dataset) or from hundreds of thousands or millions of rows (already a large dataset). Therefore, as a general rule of thumb, no manual approach based on scrolling through data and listing rows is suitable. It is by defining logical conditions and combining them that we can express elaborate criteria to extract subsets of rows from real datasets.
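As a minimal sketch of the idea, the following Python snippet selects rows by combining logical conditions with `and`, `or`, and `not`. It is an illustration only, not the chapter's own code: the toy data, the column names (`station`, `year`, `value`), and the helper `subset` are all hypothetical, and plain lists of dictionaries stand in for a data frame so that the example needs no external library.

```python
# Hypothetical toy data: each dictionary stands for one row of a data frame.
# The values are made up purely for illustration.
rows = [
    {"station": "A", "year": 2019, "value": 12.5},
    {"station": "B", "year": 2020, "value": 48.0},
    {"station": "A", "year": 2021, "value": 51.3},
    {"station": "C", "year": 2021, "value": 7.9},
    {"station": "B", "year": 2022, "value": 30.1},
]


def subset(rows, condition):
    """Return the rows for which the logical condition evaluates to True.

    `condition` is any function that takes one row and returns a boolean.
    (Hypothetical helper, introduced only for this sketch.)
    """
    return [row for row in rows if condition(row)]


# AND: both conditions must hold for a row to be kept.
recent_high = subset(rows, lambda r: r["year"] >= 2021 and r["value"] > 30)

# OR: at least one of the conditions must hold.
a_or_low = subset(rows, lambda r: r["station"] == "A" or r["value"] < 10)

# NOT: negate a condition to keep every row that does not satisfy it.
not_b = subset(rows, lambda r: not r["station"] == "B")

for name, result in [("recent_high", recent_high),
                     ("a_or_low", a_or_low),
                     ("not_b", not_b)]:
    print(name, [(r["station"], r["year"]) for r in result])
```

The selection criteria are stated once as conditions and applied to every row, so the same code works unchanged whether the dataset has five rows or five million; only the data changes, not the selection logic.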

To be more specific, turning our attention from columns to rows does not mean that logical conditions only apply to row selection. Selection based on logical conditions applies equally to rows and columns. However, in open data and real data in general, the number of rows typically exceeds the number of columns by many orders of magnitude. The scale of a dataset is not just a technicality; it is a characteristic deeply ingrained in data science and a pillar of both the R and Python environments, which have been developed, and continue to be improved, to deal with the content and meaning of data as well as with their ever-increasing scale. Therefore, ...

ISBN: 9781394213245