Introducing non-linearity

So now, we know how data enters a perceptron unit, and how each input feature is paired with an associated weight. We also know how to represent our input features, and their respective weights, as n x 1 matrices, where n is the number of input features. Lastly, we saw how transposing our feature matrix lets us compute its dot product with the matrix containing the weights. This operation left us with a single scalar value. So, what's next? This is a good time to take a step back and consider what we are trying to achieve, as this will help us understand why we want to employ something like an activation function.
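Before moving on, the computation described above can be sketched with NumPy. The feature values and weights below are made-up numbers for illustration; the point is the shapes involved and the single scalar that comes out:

```python
import numpy as np

# Hypothetical example: n = 3 input features and their weights,
# each stored as an n x 1 column matrix.
x = np.array([[0.5], [1.2], [-0.7]])   # feature matrix, shape (3, 1)
w = np.array([[0.1], [0.4], [0.3]])    # weight matrix, shape (3, 1)

# Transpose the feature matrix so the shapes line up:
# (1, 3) dot (3, 1) -> a (1, 1) result holding one scalar
z = x.T.dot(w)

print(z.shape)   # (1, 1)
print(z.item())  # the single scalar value the perceptron produces
```

The scalar `z` is exactly the weighted sum of the inputs, and it is this value that an activation function will later transform.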

Well, you see, real-world data is often non-linear. What we mean ...
