How to do it...

In this section, we will take a random sample of a given collection of data (for example, a given table). First, you should realize that there isn't a simple tool to slice off a sample of your database. It would be neat if there were, but there isn't. You'll need to read all of this to understand why:

  1. We first consider using SQL to derive a sample. Random sampling of a single table is straightforward: on PostgreSQL 9.5 and later we can use the TABLESAMPLE clause, and on older releases we can use the random() SQL function within the WHERE clause. Consider the following example:
        postgres=# SELECT count(*) FROM mybigtable;
         count
        -------
         10000
        (1 row)

        postgres=# SELECT count(*) FROM mybigtable
                                 TABLESAMPLE BERNOULLI(1);
         count
        -------
        ...
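
The two approaches named in step 1 can be sketched side by side as follows, assuming the same mybigtable table used above. The REPEATABLE clause and its seed value of 42 are additions for illustration and do not come from the recipe. No output is shown, because each row is included independently with the requested probability, so the count will hover around 1 percent of the table and differ from run to run.

```sql
-- PostgreSQL 9.5 and later: ask for roughly 1% of the rows.
-- BERNOULLI scans the whole table and keeps each row with probability 0.01.
SELECT count(*) FROM mybigtable TABLESAMPLE BERNOULLI (1);

-- Illustrative addition: a fixed seed (42 is an arbitrary choice) makes the
-- sample repeatable as long as the table has not changed in the meantime.
SELECT count(*) FROM mybigtable TABLESAMPLE BERNOULLI (1) REPEATABLE (42);

-- Older releases: random() returns a value in the range [0, 1), so this
-- predicate also keeps each row with probability 0.01.
SELECT count(*) FROM mybigtable WHERE random() < 0.01;
```

Both forms read every row of the table, so they give a statistically even sample rather than a fast one.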
