Here is a commented example of a WHERE clause that enables a SAS SPD Server star
join optimization:
where
/* dimension1 equi-joined on the fact */
hh_&statesimple.geosur = hh_dim_geo_&statesimple.geosur
/* dimension2 equi-joined on the fact */
and hh_&statesimple.utilsur = hh_dim_utility_&statesimple.utilsur
/* dimension3 equi-joined on the fact */
and hh_dim_family_&statesimple.famsur =
hh_dim_family_&statesimple.famsur
/* subsetting condition on the fact */
and hh_dim_family_&statesimple.PERSONS = 1
;
Note: The SAS SPD Server requires all subsetting to be implemented on the Where tab
in the SQL Join transformation. For more information about SAS SPD Server
support for star joins, see the SAS Scalable Performance Data Server: User's Guide.
When the code is properly configured, the following output is generated in the log:
SPDS_NOTE: STARJOIN optimization used in SQL execution.
Optimizing SQL Processing Performance
Problem
Joins are a common and resource-intensive part of SAS Data Integration Studio. SAS
SQL implements several well-known join algorithms: sort-merge, index, and hash. You
can use common techniques to aid join performance, irrespective of the algorithm that
you choose. Conditions often cause the SAS SQL optimizer to choose the sort-merge
algorithm; techniques that improve sort performance also improve sort-merge join
performance. However, understanding and leveraging index and hash joins enhance
performance.
You might often perform lookups between tables in SAS Data Integration Studio. Based
on key values in one table, you look up matching keys in a second table and retrieve
associated data in the second table. SQL joins can perform lookups. However, SAS and
SAS Data Integration Studio provide special lookup mechanisms that typically
outperform a join. The problems associated with joins are similar to the problems with
sorting:
Join performance seems slow.
You have trouble influencing the join algorithm that SAS SQL chooses.
You experience higher than expected disk space consumption.
You have trouble operating SAS SQL joins with RDBMS data.
Solution
Review the techniques explained in the following topics:
“Debugging an SQL Query” on page 453
“Enabling Explicit Pass-Through Processing for SQL Join Transformations” on page
484
Optimizing SQL Processing Performance 479

Get SAS Data Integration Studio 4.9 now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.