In this example, we use the Airfoil Self-Noise dataset, freely provided by Brooks, Pope, Marcolini, and Lopez through the UCI website (for download information, see the infobox at the end of the section). The dataset contains 1,503 samples (xi), each with five parameters describing the wind tunnel configuration, and a dependent variable (yi) that represents the scaled sound pressure (in dB). We want to train a regressor on 1,203 samples and test it on the remaining 300. The dataset is stored in a single TSV file, so we can easily load and inspect it using pandas:
import pandas as pd

# Raw string, so that the backslash in '\a' is not interpreted as an escape sequence
file_path = r'<DATA_PATH>\airfoil_self_noise.dat'
df = pd.read_csv(file_path, ...
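As a rough illustration of the loading and splitting steps described above, the following sketch reads a headerless TSV source and separates it into training and test portions. The column names, the helper functions `load_airfoil` and `split_train_test`, and the random seed are assumptions introduced here for illustration; they are not taken from the original listing, whose `read_csv` arguments are elided.

```python
import pandas as pd

# Hypothetical column names (assumption): the raw file has no header row,
# with five input parameters followed by the target.
COLUMNS = [
    "frequency",
    "angle_of_attack",
    "chord_length",
    "free_stream_velocity",
    "displacement_thickness",
    "sound_pressure",
]


def load_airfoil(source):
    """Read the tab-separated data from a path or file-like object."""
    return pd.read_csv(source, sep="\t", header=None, names=COLUMNS)


def split_train_test(df, n_train=1203, seed=1000):
    """Shuffle the rows and return (train, test) DataFrames.

    The first n_train shuffled rows form the training set;
    the remaining rows form the test set.
    """
    shuffled = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    return shuffled.iloc[:n_train], shuffled.iloc[n_train:]
```

With the full dataset, `train, test = split_train_test(load_airfoil(file_path))` would be expected to yield 1,203 training rows and 300 test rows; `train.describe()` is one convenient way to inspect the value ranges before fitting a regressor.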