April 2018
Beginner
238 pages
7h 13m
English
We can use a script like this:
import pyspark

if 'sc' not in globals():
    sc = pyspark.SparkContext()

# check if a number is prime
def isprime(n):
    # must be positive
    n = abs(int(n))
    # 2 or more
    if n < 2:
        return False
    # 2 is the only even prime number
    if n == 2:
        return True
    if not n & 1:
        return False
    # range starts with 3 and only needs to go up to the square root of n
    # for all odd numbers
    for x in range(3, int(n**0.5)+1, 2):
        if n % x == 0:
            return False
    return True

nums = sc.parallelize(range(1000000))

# Compute the number of primes in the RDD
print(nums.filter(isprime).count())
Producing the output:

78498
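The filter-and-count step can also be checked locally without a Spark context. The following is a minimal sketch, not part of the original script: it repeats the same trial-division test in plain Python and compares the count against an independent Sieve of Eratosthenes. The helper names count_primes_local and count_primes_sieve are hypothetical, introduced only for illustration, and the smaller limit is an assumption chosen to keep the run quick.

```python
def isprime(n):
    """Trial division, mirroring the isprime function in the script above."""
    n = abs(int(n))
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    # Only odd divisors up to the square root of n need to be tried.
    for x in range(3, int(n ** 0.5) + 1, 2):
        if n % x == 0:
            return False
    return True


def count_primes_local(limit):
    """Hypothetical plain-Python stand-in for nums.filter(isprime).count()."""
    return sum(1 for n in range(limit) if isprime(n))


def count_primes_sieve(limit):
    """Hypothetical cross-check: Sieve of Eratosthenes over range(limit)."""
    if limit < 3:
        return 0
    flags = bytearray([1]) * limit
    flags[0] = flags[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if flags[p]:
            # Clear every multiple of p starting at p * p.
            flags[p * p::p] = bytearray(len(range(p * p, limit, p)))
    return sum(flags)


limit = 10_000  # assumed smaller range so the check finishes quickly
print(count_primes_local(limit), count_primes_sieve(limit))
```

If both numbers agree, the predicate passed to filter is behaving as intended, and any difference seen on the cluster would point to the Spark setup rather than to isprime.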
As you can see in the isprime