November 2017
Beginner to intermediate
290 pages
7h 34m
English
Beyond partitioning, the application writer controls other aspects of how an Apex application is deployed on the cluster, and these can be used to improve performance further.
For example, consider two consecutive operators, opA and opB, in a DAG. If opA emits a large volume of data on its output port and opB filters or aggregates that data so that far less leaves its own output port, it may make sense to co-locate the two operators on the same node to conserve network bandwidth; this is called NODE_LOCAL locality.
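To make the bandwidth argument concrete, here is a minimal back-of-the-envelope sketch in plain Java. It does not use the Apex API. The class name `LocalityBandwidth`, the method `networkBytes`, and the tuple counts and sizes are all hypothetical, chosen only for illustration. It also assumes that whatever consumes opB's output runs on a different node, so that traffic always crosses the network.

```java
public class LocalityBandwidth {

    /**
     * Estimates the bytes that cross the network for the chain
     * opA -> opB -> downstream (hypothetical model, not an Apex API).
     *
     * @param tuplesOutOfA  tuples emitted by opA
     * @param tuplesOutOfB  tuples emitted by opB after filtering/aggregation
     * @param bytesPerTuple assumed serialized size of one tuple
     * @param nodeLocal     true if opA and opB are placed on the same node
     */
    public static long networkBytes(long tuplesOutOfA, long tuplesOutOfB,
                                    int bytesPerTuple, boolean nodeLocal) {
        // Traffic on the opA -> opB stream stays on the node when co-located.
        long aToB = nodeLocal ? 0L : tuplesOutOfA * bytesPerTuple;
        // Assumption: the consumer of opB's output is on another node.
        long bToDownstream = tuplesOutOfB * bytesPerTuple;
        return aToB + bToDownstream;
    }

    public static void main(String[] args) {
        // Illustrative numbers: opB keeps 1 tuple in 100.
        long separateNodes = networkBytes(1_000_000L, 10_000L, 200, false);
        long sameNode = networkBytes(1_000_000L, 10_000L, 200, true);
        System.out.println("Separate nodes: " + separateNodes + " bytes");
        System.out.println("NODE_LOCAL:     " + sameNode + " bytes");
    }
}
```

With these made-up figures, almost all of the network traffic comes from the opA-to-opB stream, which is the stream that co-location keeps on the node. The more aggressively opB reduces the data, the larger the relative saving.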
Additionally, tuple serialization and deserialization overhead (which can be considerable in some cases) can ...