Python Utility Functions
This appendix lists several Python utility functions that are used throughout the book. They were designed to simplify the use, and improve the display, of several frequently used data mining methods. The functions are collected in the dmba package, which is available from the Python Package Index at https://pypi.org/project/dmba. Install the package using:
pip install dmba
The source code is maintained at https://github.com/gedeck/dmba.
regressionSummary
import math

import numpy as np
# mean_squared_error is imported from sklearn.metrics directly; the
# sklearn.metrics.regression submodule is no longer importable in
# current scikit-learn releases.
from sklearn.metrics import mean_squared_error


def regressionSummary(y_true, y_pred):
    """ print regression performance metrics

    Input:
        y_true: actual values
        y_pred: predicted values
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    y_res = y_true - y_pred  # residuals: actual minus predicted
    metrics = [
        ('Mean Error (ME)', sum(y_res) / len(y_res)),
        ('Root Mean Squared Error (RMSE)', math.sqrt(mean_squared_error(y_true, y_pred))),
        ('Mean Absolute Error (MAE)', sum(abs(y_res)) / len(y_res)),
        ('Mean Percentage Error (MPE)', 100 * sum(y_res / y_true) / len(y_res)),
        ('Mean Absolute Percentage Error (MAPE)', 100 * sum(abs(y_res / y_true) / len(y_res))),
    ]
    # right-align the metric names to the width of the longest one
    fmt1 = '{{:>{}}} : {{:.4f}}'.format(max(len(m[0]) for m in metrics))
    print('\nRegression statistics\n')
    for metric, value in metrics:
        print(fmt1.format(metric, value))
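To make the five metrics concrete, the following minimal sketch recomputes them with NumPy alone on a small made-up data set. The helper name regression_metrics and the sample values are assumptions introduced here for illustration; they are not part of the dmba package. Note that the two percentage metrics divide by the actual values, so they are undefined whenever y_true contains a zero.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return the five regression metrics as a dict.

    Hypothetical helper for illustration only; not part of dmba.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    y_res = y_true - y_pred  # residuals: actual minus predicted
    return {
        'ME': np.mean(y_res),
        'RMSE': np.sqrt(np.mean(y_res ** 2)),
        'MAE': np.mean(np.abs(y_res)),
        # the percentage metrics assume y_true contains no zeros
        'MPE': 100 * np.mean(y_res / y_true),
        'MAPE': 100 * np.mean(np.abs(y_res / y_true)),
    }


# made-up example: residuals are -10, 10, and 0
metrics = regression_metrics([100, 200, 400], [110, 190, 400])
for name, value in metrics.items():
    print(f'{name:>4} : {value:.4f}')
```

In this example the positive and negative residuals cancel, so the mean error is zero even though the predictions are not perfect; RMSE and MAE, which ignore the sign of the residuals, still report the typical size of the error.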
classificationSummary
from sklearn.metrics import classification

def classificationSummary(y_true, ...