Ryan Hafen, Tara Gibson, Kerstin Kleese van Dam and Terence Critchlow, Pacific Northwest National Laboratory, Richland, Washington, USA
In this chapter, we use the R and Hadoop Integrated Programming Environment (RHIPE) as a flexible, scalable environment for analyzing multiterabyte data sets produced by a phasor measurement unit (PMU) sensor network on the electrical power grid. RHIPE enables exploratory data analysis on the entire data set, allowing us to develop data cleaning and event classification methods that reflect event characteristics as they appear in the actual data rather than relying on theoretical models. We describe several of the data cleaning filters that we ...