When a machine learning system is deployed, new data should go through the same preprocessing steps (scaling, feature engineering, feature selection, dimensionality reduction, and so on) as in the earlier stages. The preprocessed data is then fed into the trained model. We cannot rerun the entire pipeline and retrain the model every time new data comes in. Instead, we should save the fitted preprocessing models and the trained prediction models once the corresponding stages are complete. In deployment mode, these models are loaded in advance and used to produce predictions on the new data.
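As a minimal sketch of this save-and-load workflow (assuming scikit-learn with `pickle` for persistence; the file names, the `LinearRegression` estimator, and the use of the first three rows as stand-in "new data" are illustrative choices, not prescribed by the text):

```python
import pickle

from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Training stage: fit the preprocessing model (scaler) and the
# prediction model, then persist both to disk.
X, y = load_diabetes(return_X_y=True)
scaler = StandardScaler().fit(X)
model = LinearRegression().fit(scaler.transform(X), y)

with open("scaler.pkl", "wb") as f:
    pickle.dump(scaler, f)
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# Deployment stage: load the saved objects once, in advance, and
# apply them to incoming data without any refitting.
with open("scaler.pkl", "rb") as f:
    scaler_loaded = pickle.load(f)
with open("model.pkl", "rb") as f:
    model_loaded = pickle.load(f)

X_new = X[:3]  # stand-in for newly arriving data
predictions = model_loaded.predict(scaler_loaded.transform(X_new))
```

Note that the new data is only ever `transform`ed with the already-fitted scaler, never re-`fit`: refitting on the new samples would apply a different scaling than the one the model was trained under.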
We illustrate this with the diabetes example, where we standardize the data and ...