Feature engineering with Spark

Machine learning on big data is a deep and broad area. It calls for a new recipe whose main ingredients are feature engineering and stable optimization of the model learned from the data. The optimized model can be called a Big Model (see S. Martinez, A. Chen, G. I. Webb, and N. A. Zaidi, Scalable learning of Bayesian network classifiers, accepted for publication in the Journal of Machine Learning Research): a model that can learn from big data and that, rather than the big data itself, holds the key to a breakthrough.

A Big Model also signifies that the results you obtain from diverse and complex big data will have low bias (see D. Brain and G. I. Webb, The need for low bias algorithms in classification learning from small data sets, in PKDD ...
