© Ramcharan Kakarla, Sundar Krishnan and Sridhar Alla 2021
R. Kakarla et al., Applied Data Science Using PySpark, https://doi.org/10.1007/978-1-4842-6500-0_6

6. Model Evaluation

Ramcharan Kakarla (1), Sundar Krishnan (1) and Sridhar Alla (2)
(1) Philadelphia, PA, USA
(2) New Jersey, NJ, USA

“All models are wrong, but some are useful”

— George E.P. Box.

Many people develop models to perform a specific task (for example, predicting house prices). Often, these models cannot capture 100 percent of reality. In our example, we cannot predict a house price exactly every time. However, that does not mean the model is garbage. In general, all statistical and machine learning models face this problem. Then why build one in the first place? Even though ...
