July 2017
Intermediate to advanced
796 pages
18h 55m
English
Broadcast variables occupy memory on every executor, and depending on the size of the data they contain, this can eventually cause resource issues. Broadcast variables can, however, be removed from the memory of all executors.
Calling unpersist() on a broadcast variable removes its data from the memory cache of all executors, freeing up resources. If the variable is used again, the data is retransmitted to the executors. The Driver, however, holds on to its copy: if the Driver did not have the data, the broadcast variable would no longer be valid.
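As a minimal sketch of this behavior, the following program broadcasts a small lookup map, uses it in a job, calls unpersist(), and then uses it again. The object name, the `lookupTwice` helper, the sample data, and the `local[2]` master are all assumptions made for illustration; only `SparkContext.broadcast`, `Broadcast.value`, and `Broadcast.unpersist` come from Spark itself.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; the names and data are illustrative only.
object BroadcastUnpersistSketch {

  // Runs the same lookup before and after unpersist() and returns both results.
  def lookupTwice(): (Seq[String], Seq[String]) = {
    val conf = new SparkConf()
      .setAppName("broadcast-unpersist-sketch")
      .setMaster("local[2]") // assumption: a local master, for demonstration
    val sc = new SparkContext(conf)
    try {
      // Ship a small read-only map to the executors.
      val lookup = sc.broadcast(Map(1 -> "one", 2 -> "two", 3 -> "three"))
      val keys = sc.parallelize(Seq(1, 2, 3))

      // First use: executors fetch and cache the broadcast data.
      val before = keys.map(k => lookup.value(k)).collect().toSeq

      // Drop the cached copies on the executors; the Driver keeps its copy.
      lookup.unpersist(blocking = true)

      // Second use: the data is sent to the executors again on demand.
      val after = keys.map(k => lookup.value(k)).collect().toSeq

      (before, after)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (before, after) = lookupTwice()
    println(s"before unpersist: $before")
    println(s"after unpersist:  $after")
  }
}
```

Both jobs should return the same values, because the broadcast variable remains valid after unpersist() and only the executor-side cache is cleared.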
The following is an example ...