May 2018
Intermediate to advanced
576 pages
30h 25m
English
If you have multi-column indexes (or joins), the optimizer will assume that the column values are independent of each other, which can lead to mis-estimation in cases where there is a correlation between the values.
If you have two dependent columns, such as state and area_code, then you can define additional statistics that will be collected when you next ANALYZE the table:
CREATE STATISTICS cust_s1 (ndistinct, dependencies) ON state, area_code FROM cust;
Execution time of ANALYZE will increase to collect the additional stats information, plus there is a small increase in query planning time so use this sparingly when you can confirm this will make a difference. You don’t need to have both ndistinct and dependencies in all ...