Appendix A. Datasets

All datasets are stored under src/main/resources/datasets. Java source files live under src/main/java, while user resources live under src/main/resources. In general, we use the JAR loader functionality to retrieve the contents of a file directly from the JAR rather than from the filesystem.
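One way to read a file from the JAR rather than from the filesystem is to ask the class loader for a resource stream. The following is a minimal sketch, not necessarily the loader used elsewhere in this book: the class name DatasetResource, the method readLines, and the file name in the usage note are hypothetical. It assumes the standard Maven layout, in which everything under src/main/resources is packaged at the root of the JAR.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not from the book): reads a classpath resource line by line. */
final class DatasetResource {

    /**
     * Reads all lines of a resource located on the classpath, for example
     * inside the application JAR. The path is relative to the classpath
     * root and has no leading slash.
     */
    static List<String> readLines(String resourcePath) throws IOException {
        InputStream in =
            DatasetResource.class.getClassLoader().getResourceAsStream(resourcePath);
        if (in == null) {
            // getResourceAsStream returns null when the resource is missing
            throw new IOException("Resource not found on classpath: " + resourcePath);
        }
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            List<String> lines = new ArrayList<>();
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
            return lines;
        }
    }
}
```

Under these assumptions, a file stored at src/main/resources/datasets/example.csv (a made-up name) would be read with DatasetResource.readLines("datasets/example.csv"). Because the lookup goes through the classpath, the same call works whether the application runs from a JAR or from compiled classes in a build directory.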

Anscombe’s Quartet

Anscombe’s quartet is a set of four x-y data series with a remarkable property: although the x-y plots of the four series look completely different, their statistical measures are almost identical. The values for each of the four x-y data series are in Table A-1.

Table A-1. Anscombe’s quartet data
x1      y1      x2      y2      x3      y3      x4      y4
10.0    8.04    10.0    9.14    10.0    7.46    8.0     6.58
8.0     6.95    8.0     8.14    8.0     6.77    8.0     5.76
13.0    7.58    13.0    8.74    13.0    12.74   8.0     7.71
9.0     8.81    9.0     8.77    9.0     7.11    8.0     8.84
11.0    8.33    11.0    9.26    11.0    7.81    8.0     8.47
14.0    9.96    14.0    8.10    14.0    8.84    8.0     7.04
6.0     7.24    6.0     6.13    6.0     6.08    8.0     5.25
4.0     4.26    4.0     3.10    4.0     5.39    19.0    12.50
12.0    10.84   12.0    9.13    12.0    8.15    8.0     5.56
7.0     4.82    7.0     7.26    7.0     6.42    8.0     7.91
5.0     5.68    5.0     4.74    5.0     5.73    8.0     6.89
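The claim that the four series share nearly identical statistical measures can be checked directly from the values in Table A-1. The following is a minimal sketch: the class AnscombeStats and its methods are hypothetical helpers written for this illustration, and the choice of mean, sample variance, and Pearson correlation as the measures to compare is an assumption.

```java
/** Hypothetical helper (not from the book): summary statistics for Anscombe's quartet. */
final class AnscombeStats {
    // Values copied from Table A-1. The x1, x2 and x3 columns are identical.
    static final double[] X123 = {10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0};
    static final double[] X4   = {8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 19.0, 8.0, 8.0, 8.0};
    static final double[] Y1 = {8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68};
    static final double[] Y2 = {9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74};
    static final double[] Y3 = {7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73};
    static final double[] Y4 = {6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89};

    /** Arithmetic mean. */
    static double mean(double[] v) {
        double sum = 0.0;
        for (double d : v) {
            sum += d;
        }
        return sum / v.length;
    }

    /** Unbiased sample variance (divides by n - 1). */
    static double variance(double[] v) {
        double m = mean(v);
        double ss = 0.0;
        for (double d : v) {
            ss += (d - m) * (d - m);
        }
        return ss / (v.length - 1);
    }

    /** Pearson correlation coefficient of two equal-length series. */
    static double correlation(double[] x, double[] y) {
        double mx = mean(x);
        double my = mean(y);
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - mx) * (y[i] - my);
            sxx += (x[i] - mx) * (x[i] - mx);
            syy += (y[i] - my) * (y[i] - my);
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        double[][] xs = {X123, X123, X123, X4};
        double[][] ys = {Y1, Y2, Y3, Y4};
        for (int i = 0; i < xs.length; i++) {
            System.out.printf("series %d: mean(x)=%.2f var(x)=%.2f mean(y)=%.2f var(y)=%.2f r=%.3f%n",
                i + 1, mean(xs[i]), variance(xs[i]),
                mean(ys[i]), variance(ys[i]), correlation(xs[i], ys[i]));
        }
    }
}
```

Running main should report, for every series, a mean of x of 9.00, a sample variance of x of 11.00, a mean of y of about 7.50, and a correlation of about 0.816, even though the fourth series has ten of its eleven points at x = 8.0.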

We can easily hardcode the data as static members of the class:

public class Anscombe {
    public static final double[] x1 = {10.0, 8.0, 13.0, 9.0, 11.0,
                                       14.0, 6.0, 4.0, 12.0, 7.0, 5.0};
    public static final double[] y1 = {8.04, 6.95, 7.58, 8.81, 8.33,
                                       9.96, 7.24, 4.26, 10.84, 4.82, 5.68};
    public static final double[] x2 = {10.0, 8.0, 13.0, 9.0, 11.0,
                                       14.0, 6.0, 4.0, 12.0, 7.0, 5.0};
    public static final double[] y2 = {9.14, 8.14, 8.74, 8.77, 9.26,
                                       8.10, 6.13, 3.10, 9.13, 7.26, 4.74};
    public static final double ...
