## With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

No credit card required

# Chapter 1: Introduction to Artificial Neural Networks

1. Here is a neural network based on the original artificial neurons that computes AB (where ⊕ represents the exclusive OR), using the fact that AB = (A ∧ ¬ B) ∨ (¬ AB). There are other solutions—for example, using the fact that AB = (AB) ∧ ¬(AB), or the fact that AB = (AB) ∧ (¬ A ∨ ∧ B), and so on.

2. A classical Perceptron will converge only if the dataset is linearly separable, and it won’t be able to estimate class probabilities. In contrast, a Logistic Regression classifier will converge to a good solution even if the dataset is not linearly separable, and it will output class probabilities. If you change the Perceptron’s activation function to the logistic activation function (or the softmax activation function if there are multiple neurons), and if you train it using Gradient Descent (or some other optimization algorithm minimizing the cost function, typically cross entropy), then it becomes equivalent to a Logistic Regression classifier.

3. The logistic activation function was a key ingredient in training the first MLPs because its derivative is always nonzero, so Gradient Descent can always roll down the slope. When the activation function is a step function, Gradient Descent cannot move, as there is no slope at all.

4. The step function, the logistic function, ...

## With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

No credit card required