Hack #40. Return a Sample of Records

Use the Top predicate to return a portion of your records without bias.

Most often, you use a Select query to return all the records that match certain criteria. Usually, this query returns a data set that is smaller than the table or tables upon which the query is built. That is, not all records match the criteria, and the number of records that do match is smaller than the underlying set of table data.

Sometimes, you might need only a sample of records that aren't based on the criteria or in which the criteria are irrelevant. This isn't the same as fine-tuning the criteria to limit the number of returned records. For example, statistical work might require a sample from which to infer facts about the whole data set. Regardless of whether the population data is a table or a filtered data set already based on some criteria, the point is that the next step of getting a sample is completed without any preconceived notion. This is where the SQL Top predicate comes in handy.

The Top predicate works in two ways:

  • Returns a number of records

  • Returns a percentage of records

Using the Top Predicate

The Top predicate allows you to isolate records from the top of a data set. If you want to get records from the bottom, first apply a reverse sort (i.e., descending instead of ascending). Either way, you will continuously get the same set of records each time you run the query. Later in this hack, we'll discuss a method for getting a true random sample.

Figure ...

Get Access Hacks now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.