Hands-On Natural Language Processing with Python by Rajalingappaa Shanmugamani, Rajesh Arumugam

Increasing performance

Inference time depends on the number of floating-point operations (FLOPs) a model requires and on the hardware's throughput, measured in Floating-Point Operations Per Second (FLOPS). The operation count is influenced by the number of model parameters and the floating-point operations involved, which are mostly matrix operations such as addition, multiplication, and division. For example, a convolution layer has only a few parameters representing the kernel, but takes longer to compute because the kernel has to be applied at every position across the input matrix. A fully connected layer, by contrast, has a huge number of parameters but runs quickly.
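To make the contrast concrete, the following sketch counts parameters and multiply-accumulate operations (MACs) for a hypothetical 3×3 convolution layer and a hypothetical 4096-unit fully connected layer. The layer dimensions are illustrative assumptions, not taken from the text:

```python
# Rough parameter and operation counts for two common layer types.
# All layer dimensions below are illustrative assumptions.

def conv2d_cost(k, c_in, c_out, h_out, w_out):
    """Parameters and multiply-accumulates for a k x k convolution."""
    params = k * k * c_in * c_out + c_out           # weights + biases
    macs = k * k * c_in * c_out * h_out * w_out     # kernel applied at every output position
    return params, macs

def dense_cost(n_in, n_out):
    """Parameters and multiply-accumulates for a fully connected layer."""
    params = n_in * n_out + n_out                   # weights + biases
    macs = n_in * n_out                             # one MAC per weight
    return params, macs

conv_params, conv_macs = conv2d_cost(k=3, c_in=64, c_out=64, h_out=224, w_out=224)
fc_params, fc_macs = dense_cost(n_in=4096, n_out=4096)

print(f"conv: {conv_params:,} params, {conv_macs:,} MACs")  # few params, many ops
print(f"fc:   {fc_params:,} params, {fc_macs:,} MACs")      # many params, few ops
```

With these example sizes, the convolution has roughly 37 thousand parameters but about 1.8 billion MACs, while the fully connected layer has about 16.8 million parameters but only 16.8 million MACs, which is exactly the asymmetry described above.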

The weights of a model are usually stored as double- or high-precision floating-point values, and an arithmetic operation on such numbers is more expensive than performing ...
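As a rough illustration of one part of this cost, storing the same weights at lower precision shrinks their memory footprint (and with it, memory traffic). The NumPy snippet below shows only the storage side; the array size is an arbitrary assumption:

```python
import numpy as np

# Memory footprint of one million weights at different precisions
# (the array size is an arbitrary choice for illustration).
n = 1_000_000
for dtype in (np.float64, np.float32, np.float16):
    weights = np.zeros(n, dtype=dtype)
    print(f"{np.dtype(dtype).name}: {weights.nbytes / 1e6:.1f} MB")
```

Halving the precision halves the bytes moved per weight, which is one reason reduced-precision inference can run faster on hardware that supports it.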
