Which Options Should You Choose?
With all of these partitioning and streaming choices, which one makes sense for your particular situation?
- PARTITION BY ANY (random)
Use this option if you just want extra throughput and the ordering of records is not important at all. For example, if you need to perform calculations on every record from the REF cursor and each calculation is independent, then this is the way to go because Oracle will distribute the records without worrying about their order.
I have yet to come across a situation requiring random partitioning in conjunction with the ORDER or CLUSTER options, but these combinations are syntactically allowed.
- PARTITION BY RANGE
Specify this option if your function relies on certain records being processed together and the records are spread evenly among the values. This will ensure that each parallel function instance gets a similar amount of work to do.
Combining this partitioning option with either ORDER BY or CLUSTER BY is valid. Note that the clustering approach will be slightly faster.
- PARTITION BY HASH
Specify this option if your function relies on certain records being processed together and the spread is somewhat skewed. The hash algorithm provides a better chance of each parallel function instance’s getting a similar amount of work to do.
Combining this partitioning option with either ORDER BY or CLUSTER BY is valid. Note that the clustering approach will be slightly faster.