November 2018
Intermediate to advanced
556 pages
14h 42m
English
We know nothing about NASA's data or about how NASA operates; we only have a large amount of data. For this reason, we choose a data-driven approach. In this simplified exercise, we will use a statistical method applied on top of a moving average algorithm. The algorithm evaluates the standard deviation (std) in a window of 50 points and flags the data points that exceed 2*std:
import numpy as np

def moving_average(data, window_size):
    # Flat kernel whose weights sum to 1, so the convolution is a mean
    window = np.ones(int(window_size))/float(window_size)
    return np.convolve(data, window, 'same')

def search_anomalies(y, window_size, sigma=1.0):
    avg = moving_average(y, window_size).tolist()
    residual = y - avg
    # Calculate the variation in the distribution of the residual
    std = np.std(residual)
    anomalies=[]
    ...
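The listing above is truncated, so the following is only a minimal, self-contained sketch of the same idea and not the book's own continuation. It assumes a single standard deviation computed over the whole residual (as the visible part of the listing suggests) and an absolute-value comparison against sigma*std. The names rolling_mean and flag_outliers and the synthetic signal are hypothetical, introduced purely for illustration.

```python
import numpy as np

def rolling_mean(values, window_size):
    """Centered moving average via convolution with a flat kernel."""
    kernel = np.ones(int(window_size)) / float(window_size)
    return np.convolve(values, kernel, mode="same")

def flag_outliers(values, window_size=50, sigma=2.0):
    """Return (index, value) pairs whose residual exceeds sigma * std.

    Assumption: one global std over the residual, not a rolling std.
    """
    values = np.asarray(values, dtype=float)
    residual = values - rolling_mean(values, window_size)
    std = np.std(residual)
    mask = np.abs(residual) > sigma * std
    return [(int(i), float(values[i])) for i in np.flatnonzero(mask)]

# Hypothetical test signal: a noisy sine wave with one injected spike.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 20, 1000)) + rng.normal(0, 0.05, 1000)
signal[500] += 3.0  # artificial anomaly

outliers = flag_outliers(signal, window_size=50, sigma=2.0)
flagged_indices = [i for i, _ in outliers]
print(500 in flagged_indices)
```

Note that mode="same" implicitly pads the series with zeros, so the moving average is biased toward zero near both ends and points close to the boundaries may be flagged as well. Trimming half a window from each end before thresholding is one possible way to avoid that.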