Video question answering system

The following example focuses on building a video question answering model, which we will define using Keras.

To solve this problem, we will train the model using TensorFlow's high-level training facilities in a distributed setting.

Figure 6: Video Question Answering

As we can see, the videos are sampled at 4 frames per second, and each video is roughly 10 seconds long, so we have about 40 frames in total per video. We ask questions about the video contents, like the ones shown in figure 6.
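The frame arithmetic above can be checked with a minimal sketch. The function names and the native rate of 24 frames per second are assumptions made for illustration only; the text does not state the original frame rate of the videos.

```python
def sampled_frame_count(duration_s: float, sample_fps: float) -> int:
    """Number of frames kept when sampling at sample_fps for duration_s seconds."""
    return int(round(duration_s * sample_fps))


def sample_frame_indices(duration_s: float, native_fps: float, sample_fps: float) -> list:
    """Indices of the native frames to keep, spaced evenly in time.

    native_fps is a hypothetical value; the source does not specify it.
    """
    count = sampled_frame_count(duration_s, sample_fps)
    step = native_fps / sample_fps  # native frames between two kept frames
    return [int(i * step) for i in range(count)]


# 10 seconds sampled at 4 frames per second gives about 40 frames per video.
frames_per_video = sampled_frame_count(duration_s=10, sample_fps=4)

# Assuming a 24 fps source, every sixth frame is kept.
indices = sample_frame_indices(duration_s=10, native_fps=24, sample_fps=4)
```

Under these assumptions, each video becomes a fixed-length sequence of 40 frames, which is convenient when the frames are later fed to a model as one sequence.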

So we are going to build a deep learning model that takes as input:
