Machine Learning with Spark - Second Edition by Nick Pentreath, Manpreet Singh Ghotra, Rajdeep Dua


L1 regularization

We can apply the same approach for differing levels of L1 regularization, as follows:

params = [0.0, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
metrics = [evaluate(train_data, test_data, 10, 0.1, param, 'l1', False)
           for param in params]
print params
print metrics
plot(params, metrics)
fig = matplotlib.pyplot.gcf()
pyplot.xscale('log')

Again, the results are seen more clearly when plotted in the following graph. The decline in RMSLE is much more gradual, and only a very high regularization value causes it to jump back up. Here, the level of L1 regularization required is much higher than that for the L2 form; however, the overall performance is poorer:
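RMSLE, the metric being compared across regularization values, can also be computed directly. The chapter evaluates it over Spark RDDs inside the evaluate helper; the following is a standalone pure-Python sketch of the same formula, assuming plain lists of predictions and targets:

```python
import math

def rmsle(predictions, actuals):
    # Root Mean Squared Log Error: sqrt of the mean squared
    # difference between log(pred + 1) and log(actual + 1).
    squared_log_errors = [
        (math.log(p + 1.0) - math.log(a + 1.0)) ** 2
        for p, a in zip(predictions, actuals)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))
```

Because the error is taken on the log scale, RMSLE penalizes relative differences rather than absolute ones, which suits targets (such as counts) that span several orders of magnitude.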

[0.0, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
[1.5384660954019971, 1.5384518080419873, ...
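The RMSLE numbers alone don't show L1 regularization's characteristic effect: unlike L2, it drives small weights exactly to zero, producing sparse models. A minimal standalone sketch of the soft-thresholding (proximal) operator that underlies L1 updates — not code from the chapter, just an illustration of the mechanism:

```python
import math

def soft_threshold(weights, lam):
    # Shrink each weight toward zero by lam; weights whose
    # magnitude is below lam become exactly zero (sparsity).
    return [
        math.copysign(max(abs(w) - lam, 0.0), w)
        for w in weights
    ]
```

Raising the regularization parameter zeroes out more weights, which is why a sufficiently large L1 penalty eventually degrades RMSLE sharply: too many useful features are eliminated from the model.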
