April 2018
Beginner to intermediate
282 pages
6h 52m
English
What about visually inspecting datasets that have more than three dimensions? A plot can show at most three dimensions directly, so a dataset with more features must first be reduced to two or three. This is usually achieved with a dimensionality reduction technique such as Principal Component Analysis (PCA) or t-SNE (t-distributed Stochastic Neighbor Embedding).
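As a rough illustration of the idea, the sketch below reduces a small synthetic dataset to two dimensions with both PCA and t-SNE from scikit-learn. The random data, the variable names, and the choice to standardize the features first are assumptions made for this example and do not come from the text.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Hypothetical data for illustration only: 200 samples with 10 features
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 10))

# Standardize the features so that no single one dominates the projection
X_scaled = StandardScaler().fit_transform(X_demo)

# PCA: linear projection onto the two directions of largest variance
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)
print(X_pca.shape)                    # (200, 2)
print(pca.explained_variance_ratio_)  # share of variance kept by each component

# t-SNE: nonlinear embedding that tries to preserve local neighborhoods
tsne = TSNE(n_components=2, random_state=0)
X_tsne = tsne.fit_transform(X_scaled)
print(X_tsne.shape)                   # (200, 2)
```

Either two-column result can then be passed to an ordinary scatter plot. PCA is fast and deterministic, while t-SNE is slower and its output depends on the random seed and the perplexity setting, so it is usually worth trying PCA first.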
The following code will load the Breast Cancer Wisconsin Diagnostic dataset, which is commonly used in ML tutorials:
# Wisconsin Breast Cancer Diagnostic Dataset
from sklearn.datasets import load_breast_cancer
import pandas as pd

data = load_breast_cancer()
X = data.data
df = pd.DataFrame(data.data, columns=data.feature_names)
df.head()
Output in the console is as follows:
mean radius ...