Dealing with reducible error components
High bias:
- Add more features
- Apply a more complex model
- Use fewer instances to train, since extra data does little for a model that underfits
- Reduce regularization
High variance:
- Conduct feature selection and use fewer features
- Get more training data
- Use regularization to counter the overfitting that complex models are prone to
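Regularization appears in both lists because it trades one error component for the other: weakening it lets the model fit the data more closely (less bias, more variance), and strengthening it does the reverse. As a minimal sketch of that effect, assume a one-feature ridge regression with no intercept, whose slope has the closed form sum(x*y) / (sum(x*x) + lam). The function name `ridge_slope`, the penalty name `lam`, and the toy data are illustrative assumptions, not taken from the book.

```python
def ridge_slope(xs, ys, lam):
    """Slope of a one-feature, no-intercept ridge fit.

    lam is the regularization strength (assumed name); lam = 0 gives
    the ordinary least-squares slope.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Toy data, made up for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# A larger penalty shrinks the slope towards zero: a simpler, more biased fit.
for lam in (0.0, 1.0, 10.0, 100.0):
    print(lam, round(ridge_slope(xs, ys, lam), 3))
```

The slope shrinks steadily as `lam` grows, so lowering the penalty is a remedy for high bias and raising it is a remedy for high variance.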
Cross-validation
Cross-validation is an important step in model validation and evaluation. It is a technique for estimating how a model will perform before we apply it to unobserved data. Training on the full dataset is not advised, because nothing would be left to test on and we would have no idea how the model is going to perform in practice. As we learnt in the previous section, a good learner should be able to generalize well on an unseen ...
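One common form of this idea is k-fold cross-validation: split the data into k parts, hold each part out in turn, train on the rest, and average the held-out errors. The sketch below assumes that scheme and uses a deliberately trivial model that predicts the mean of the training targets. The helper names (`k_fold_indices`, `fit_mean`, `mse`, `cross_validate`) and the data are hypothetical and chosen only for illustration.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_mean(train_ys):
    """Placeholder model: always predict the mean of the training targets."""
    return sum(train_ys) / len(train_ys)


def mse(prediction, ys):
    """Mean squared error of a constant prediction against the targets ys."""
    return sum((y - prediction) ** 2 for y in ys) / len(ys)


def cross_validate(ys, k=5, seed=0):
    """Return one held-out error per fold."""
    scores = []
    for held_out in k_fold_indices(len(ys), k, seed):
        held = set(held_out)
        train = [y for i, y in enumerate(ys) if i not in held]
        test = [ys[i] for i in held_out]
        model = fit_mean(train)          # fit on the remaining k-1 folds only
        scores.append(mse(model, test))  # score on data the model never saw
    return scores


# Made-up targets, for illustration only.
ys = [3.0, 5.0, 4.0, 6.0, 5.5, 4.5, 7.0, 3.5, 5.0, 6.5]
scores = cross_validate(ys, k=5)
print("mean held-out MSE:", sum(scores) / len(scores))
```

Every instance is used for testing exactly once and never for both fitting and scoring in the same round, which is what makes the averaged error a fairer estimate of performance on unseen data.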