Data Visualization in R and Python

by Marco Cremonini
December 2024
Intermediate to advanced
576 pages
12h 58m
English
Wiley

4 Histograms and Kernel Density Plots

A histogram is a traditional type of graphic based on a continuous variable. It divides the values of this variable into a number of ranges, called bins, and counts the observations that fall in each bin. Visually, it is schematic and usually simple, but it can reveal useful information about the data. For this reason, it is often used as an analysis tool, not just in presentations, to study general characteristics of the data, such as anomalous distributions. It is important to remember that histograms are most useful when several bin widths, or numbers of bins, are tested.
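To make the binning step concrete, here is a minimal Python sketch of what a histogram computes before anything is drawn. The sample data are synthetic and purely illustrative (they are not one of the chapter's datasets), and `bin_counts` is a hypothetical helper name; only `numpy.histogram` is a real library call.

```python
import numpy as np

# Synthetic, illustrative data: 1,000 draws from a right-skewed distribution.
rng = np.random.default_rng(seed=0)
values = rng.gamma(shape=2.0, scale=10.0, size=1000)


def bin_counts(data, n_bins):
    """Split the data range into n_bins equal-width bins and count observations."""
    counts, edges = np.histogram(data, bins=n_bins)
    return counts, edges


# Test several bin counts: the same data can look quite different.
for n_bins in (5, 20, 50):
    counts, edges = bin_counts(values, n_bins)
    width = edges[1] - edges[0]
    print(f"{n_bins:>2} bins, width {width:.2f}, largest bin holds {counts.max()} observations")
```

With few bins the shape is coarse and details are hidden; with many bins the counts become noisy. Comparing a handful of settings, as the loop does, is usually the quickest way to judge which view is informative.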

Dataset

In this section, we use two datasets already introduced: Compiled historical daily temperature and precipitation data for selected 210 U.S. cities (Yuchuan Lai and David Dzombak, Carnegie Mellon University) and Report qualità aria 2021 (transl. Air Quality Report year 2021; Open Data Municipality of Milan). The following dataset is new:

Bologna – B&B List, Open Data from Bologna Municipality, Italy (https://opendata.comune.bologna.it/explore/dataset/bologna-rilevazione-airbnb/information/?disjunctive.neighbourhood&disjunctive.room_type).

Copyright: Creative Commons CC BY-4.0.

4.1 R: ggplot

The main ggplot function for histograms is geom_histogram(), which takes two main arguments, to be used as alternatives:

  • binwidth defines the width of bins; in this case, the number of bins is derived from the whole range of values ...
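The relationship between a bin width and the resulting number of bins can be sketched with simple arithmetic. The Python snippet below is only an illustration of that idea, not ggplot's actual bin-placement algorithm (which may align and center bins differently); `bins_from_width` is a hypothetical helper, and the bins are assumed to start at the minimum value.

```python
import math

import numpy as np


def bins_from_width(data, binwidth):
    """Derive a bin count and bin edges from a fixed bin width.

    Assumption: bins start at the minimum of the data and extend until
    the maximum is covered. Real plotting libraries may place bins differently.
    """
    lo, hi = float(np.min(data)), float(np.max(data))
    n_bins = max(1, math.ceil((hi - lo) / binwidth))
    edges = lo + binwidth * np.arange(n_bins + 1)
    return n_bins, edges


data = [0, 1, 2.5, 7, 10]
n_bins, edges = bins_from_width(data, binwidth=2)
counts, _ = np.histogram(data, bins=edges)
print(n_bins, edges, counts)
```

Halving the width roughly doubles the number of bins over the same range, which is why the two settings are treated as alternatives and not combined.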
ISBN: 9781394289486