Deep learning employs a stack of multiple hidden layers of non-linear processing units. The input of a hidden layer is the output of its previous layer. This can be easily observed from the examples of a shallow neural network and a deep neural network shown previously.
Features are extracted from each hidden layer. Features from different layers represent abstracts or patterns of different levels. Hence, higher-level features are derived from lower-level features, which are extracted from previous layers. All these together form a hierarchical representation learned from the data.
Take the cats and dogs image classification as an example, in traditional machine learning solutions, the classification step ...