May 2018
Intermediate to advanced
576 pages
30h 25m
English
In this section, we will take a random sample of a given collection of data (for example, a given table). First, you should realize that there isn't a simple tool to slice off a sample of your whole database. It would be neat if there were, but there isn't, and you'll need to read all of this to understand why. For a single table, the TABLESAMPLE clause gives a quick sample, as this session shows:
postgres=# SELECT count(*) FROM mybigtable;
 count
-------
 10000
(1 row)

postgres=# SELECT count(*) FROM mybigtable TABLESAMPLE BERNOULLI(1);
 count
-------
   106
(1 row)

postgres=# SELECT count(*) FROM mybigtable TABLESAMPLE BERNOULLI(1);
 count
-------
    99
(1 row)
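The two sampled counts above (106 and 99) differ because BERNOULLI sampling keeps each row independently with the requested probability, so the sample size is itself random and only averages 1% of the table. The following is a minimal Python sketch of that idea, not PostgreSQL's implementation; the function name `bernoulli_sample` and the `seed` parameter are hypothetical and used only for illustration.

```python
import random


def bernoulli_sample(rows, percent, seed=None):
    """Keep each row independently with probability percent / 100.

    Illustrative stand-in for the idea behind TABLESAMPLE BERNOULLI;
    this is an assumption-level sketch, not the server's code.
    """
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    p = percent / 100.0
    return [row for row in rows if rng.random() < p]


# Stand-in for the 10,000 rows of mybigtable in the session above.
rows = range(10000)

# Unseeded runs give sizes that cluster around 100 but usually differ.
sizes = [len(bernoulli_sample(rows, 1)) for _ in range(5)]
print(sizes)

# With the same seed, the same rows come back every time.
print(bernoulli_sample(rows, 1, seed=7) == bernoulli_sample(rows, 1, seed=7))
```

In PostgreSQL itself, the TABLESAMPLE clause accepts an optional REPEATABLE (seed) argument that plays a similar role to the seed here, returning the same sample for the same seed as long as the table has not changed.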