So far, we have seen our deep learning models running on the desktop, the cloud, and the browser. Although there are definite upsides to such a setup, it might not be ideal for all scenarios. In this chapter, we explore making predictions using deep learning models on mobile devices.
Bringing the computation closer to the user’s device, rather than a distant remote server, can be advantageous for many reasons:
Sending an image to the cloud, processing it there, and returning the result can take several seconds, depending on the network quality and the amount of data being transferred. This can make for a poor UX. Decades of UX research, including Jakob Nielsen's findings published in his 1993 book Usability Engineering (Elsevier), established the following:
0.1 second is about the limit for having the user feel that the system is reacting instantaneously.
1 second is about the limit for the user’s flow of thought to stay uninterrupted.
10 seconds is about the limit for keeping the user’s attention focused.
About two decades later, Google published findings that half of all mobile browser users abandon a web page if it takes longer than three seconds to load. Forget three seconds: Amazon found that even a 100 ms increase in latency resulted in a 1% decrease in sales. That is a lot of lost revenue. Processing on the device instead, with near-instantaneous responses, can make for rich and interactive UXs. ...