Efficiency with TorchScript
We have set up a simple Flask application server to serve our model, and we have served the same model with the MXNet model server. If we need to leave the Python world and build a highly efficient server in C++, Go, or another efficient language, PyTorch offers TorchScript, which can generate an efficient form of your model that is readable in C++.
Now the question is: isn't this what we did with ONNX, that is, creating another IR from the PyTorch model? Yes, the processes are similar, but the difference is that ONNX creates the optimized IR using tracing: it passes a dummy input through the model and, while the model is being executed, it records the PyTorch operation ...
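The tracing idea described above can be illustrated with a toy sketch in plain Python. This is not how PyTorch or ONNX implement tracing. The names `Traced`, `trace`, `replay`, and `model` are hypothetical and exist only for this illustration. The sketch passes a dummy input through a function, records each arithmetic operation as it executes, and then replays the recording on new inputs.

```python
class Traced:
    """Wraps a number and logs every arithmetic operation applied to it (hypothetical helper)."""

    def __init__(self, value, tape):
        self.value = value
        self.tape = tape

    def __add__(self, other):
        self.tape.append(("add", other))
        return Traced(self.value + other, self.tape)

    def __mul__(self, other):
        self.tape.append(("mul", other))
        return Traced(self.value * other, self.tape)

    def __gt__(self, other):
        # Comparisons return a plain bool, so the branch taken is not recorded.
        return self.value > other


def model(x):
    """A stand-in 'model' with data-dependent control flow."""
    if x > 0:
        return x * 2 + 1
    return x * -1


def trace(fn, dummy_input):
    """Run fn once on a dummy input and return the list of recorded operations."""
    tape = []
    fn(Traced(dummy_input, tape))
    return tape


def replay(tape, x):
    """Re-apply the recorded operations to a new input."""
    for op, operand in tape:
        if op == "add":
            x = x + operand
        elif op == "mul":
            x = x * operand
    return x


# Trace with a positive dummy input: only the 'x > 0' branch is recorded.
tape = trace(model, 3.0)
```

Replaying the recording on the dummy input reproduces the original function, but replaying it on a negative input does not, because the recording holds only the operations executed for the dummy input. In this toy setting, that is the general caveat of trace-based capture: the `if` itself never makes it into the recording.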