Imagine the following conversation:
Person A: I can't find my print of The Starry Night. Do you know where it is?
Person B: What does it look like?
For a computer, or for someone who is naive about Western art, Person B's question is quite reasonable. Before we can use our sense of sight (or any other sense) to track something, we need to have sensed that thing before. Failing that, we at least need a good description of what we will sense. In computer vision, this means providing a reference image that will be compared with the live camera image or scene. If the target has complex geometry or moving parts, we might need to provide many reference images to account for different perspectives and poses. However, for our examples ...
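To make the idea of comparing a reference image with a scene concrete, here is a minimal sketch in plain Python. It slides a small reference image over a larger scene and picks the position with the smallest sum of squared differences. This brute-force template comparison is only an illustration and not necessarily the technique used in the examples; the function names (`ssd`, `find_reference`) and the tiny grayscale arrays are hypothetical.

```python
def ssd(reference, scene, top, left):
    """Sum of squared differences between the reference image and the
    scene patch whose top-left corner is at (top, left)."""
    total = 0
    for r, ref_row in enumerate(reference):
        for c, ref_val in enumerate(ref_row):
            diff = scene[top + r][left + c] - ref_val
            total += diff * diff
    return total


def find_reference(reference, scene):
    """Try every position where the reference fits inside the scene and
    return ((top, left), score) for the lowest-scoring (best) match."""
    ref_h, ref_w = len(reference), len(reference[0])
    scene_h, scene_w = len(scene), len(scene[0])
    best_pos, best_score = None, None
    for top in range(scene_h - ref_h + 1):
        for left in range(scene_w - ref_w + 1):
            score = ssd(reference, scene, top, left)
            if best_score is None or score < best_score:
                best_pos, best_score = (top, left), score
    return best_pos, best_score


# Hypothetical 2x2 grayscale reference image (what we are looking for).
reference = [
    [200, 50],
    [50, 200],
]

# Hypothetical 4x5 grayscale scene containing a slightly noisy copy of
# the reference, with its top-left corner at row 1, column 2.
scene = [
    [10, 10, 10, 10, 10],
    [10, 10, 198, 52, 10],
    [10, 10, 49, 201, 10],
    [10, 10, 10, 10, 10],
]

position, score = find_reference(reference, scene)
print(position, score)  # prints: (1, 2) 10
```

A comparison this literal only works when the target appears at the same scale and orientation as the reference, which is one reason why a target seen from varying perspectives or in varying poses may call for many reference images.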