To solve these problems, the authors of the paper propose a new type of network building block, called a capsule, instead of the neuron. The output of a neuron is a scalar value. In contrast, the output of a capsule is a vector (a list of values), which consists of the following:

  • The elements of the vector represent the pose and other properties of the object.
  • The length of the vector is in the (0, 1) range and represents the probability of detecting the feature at that location. As a reminder, the length of a vector is , where vi are the vector elements.

Let's consider a capsule, which detects faces. If we start moving a face across ...

Get Python Deep Learning - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.