We will be using the Breast Cancer dataset. Each record in the dataset contains the following attributes:
- ID number
- Diagnosis (M = malignant, and B = benign)
- 10 real-valued features are computed for each cell nucleus:
  - Radius (mean of the distances from the center to points on the perimeter)
  - Texture (standard deviation of gray scale values)
  - Perimeter
  - Area
  - Smoothness (local variation in radius lengths)
  - Compactness (perimeter^2 / area - 1.0)
  - Concavity (severity of concave portions of the contour)
  - Concave points (number of concave portions of the contour)
  - Symmetry
  - Fractal dimension ("coastline approximation" - 1)
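To make a few of these definitions concrete, the following minimal sketch computes radius, perimeter, area, and compactness for a contour given as a list of (x, y) points. It is only an illustration: the function name contour_features, the use of the vertex centroid as the "center", and the approximation of the nucleus boundary by a polygon are assumptions made here, not part of the dataset's documentation.

```python
import math


def contour_features(points):
    """Compute a few of the listed features for a closed polygonal contour.

    `points` is a list of (x, y) vertices in order around the contour.
    Hypothetical helper for illustration only.
    """
    n = len(points)

    # Assumption: the "center" is the average of the contour vertices.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n

    # Radius: mean distance from the center to the points on the perimeter.
    radius = sum(math.hypot(x - cx, y - cy) for x, y in points) / n

    perimeter = 0.0
    twice_signed_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the contour
        perimeter += math.hypot(x2 - x1, y2 - y1)
        twice_signed_area += x1 * y2 - x2 * y1  # shoelace formula term

    area = abs(twice_signed_area) / 2.0

    # Compactness as defined in the list: perimeter^2 / area - 1.0
    compactness = perimeter ** 2 / area - 1.0

    return {
        "radius": radius,
        "perimeter": perimeter,
        "area": area,
        "compactness": compactness,
    }


# Example: a 2 x 2 square used as a stand-in contour.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
features = contour_features(square)
```

For the square above, the perimeter is 8 and the area is 4, so the compactness works out to 8^2 / 4 - 1.0 = 15.0. In the dataset itself these values are already computed, so a calculation like this is only needed to understand what each column measures.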
We will apply a random forest to the Breast Cancer dataset in Excel to understand ...