It is crucial to feed the right data into the neural network for training and validation. The data must be on a useful scale and in a suitable format, and it must include meaningful features. This leads to better and more consistent results.
Perform the following steps for data preprocessing:
- Load the dataset using pandas
- Split the dataset into the input and output variables for machine learning
- Apply a preprocessing transform to the input variables
- Summarize the data to show the change
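The four steps above can be sketched roughly as follows. This is a minimal illustration only. The small DataFrame, its column names, and its values are made up for the example. Standardization (zero mean, unit variance) is assumed as the preprocessing transform, since the text does not say which transform is used.

```python
import numpy as np
import pandas as pd

# Step 1: load the data. A tiny made-up DataFrame stands in for the CSV file here;
# the column names and values are hypothetical, chosen only for illustration.
df = pd.DataFrame({
    "radius_mean": [17.99, 20.57, 11.42, 13.54],
    "texture_mean": [10.38, 17.77, 20.38, 14.36],
    "diagnosis": ["M", "M", "B", "B"],
})

# Step 2: split into input variables (X) and the output variable (y).
X = df.drop(columns=["diagnosis"])
y = df["diagnosis"].map({"M": 1, "B": 0})

# Step 3: apply a preprocessing transform to the inputs.
# Assumption: standardization, so each column gets zero mean and unit variance.
X_scaled = (X - X.mean()) / X.std(ddof=0)

# Step 4: summarize the data before and after to show the change.
print(X.describe())
print(X_scaled.describe())
```

Other transforms, such as rescaling to the range 0 to 1, follow the same pattern: only step 3 changes.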
We use the pandas library to load the data and review the shape of our dataset:
dataset = pd.read_csv('/deeplearning/google/kaggle/breast-cancer/data.csv')
# get dataset details
print(dataset.head(5))
print(dataset.columns.values)
...