Here is a list of questions for your reference:
- What do you understand by EDA? Why is it important?
- Why do we create training and test data?
- Why did we index the data that we pulled from the UCI Machine Learning Repository?
- Why is the Iris dataset so famous?
- Name one powerful feature of the random forest classifier.
- What is supervised learning, as opposed to unsupervised learning?
- Explain briefly the process of creating our model with training data.
- What are feature variables in relation to the Iris dataset?
- What is the entry point to programming with Spark?
Task: The Iris dataset problem was a statistical classification problem. Create a confusion or error matrix with the rows being predicted setosa, predicted versicolor, and predicted ...