To solve these problems, the authors of the paper propose a new type of network building block, called a capsule, instead of the neuron. The output of a neuron is a scalar value. In contrast, the output of a capsule is a vector (a list of values), which consists of the following:

  • The elements of the vector represent the pose and other properties of the object.
  • The length of the vector is in the (0, 1) range and represents the probability of detecting the feature at that location. As a reminder, the length of a vector is , where vi are the vector elements.

Let's consider a capsule, which detects faces. If we start moving a face across ...

Get Python Deep Learning - Second Edition now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.