Transformer neural networks are a relatively young but revolutionary architecture in the fields of artificial intelligence and machine learning. First presented in the paper “Attention Is All You Need” by Ashish Vaswani et al. in 2017, transformers have quickly gained popularity and are now a central component of many advanced systems, including large language models (LLMs), especially those for natural language processing (NLP).
In contrast to traditional recurrent neural networks (RNNs), which process sequential data step by step and usually only from left to right, transformers rely on a mechanism called self-attention. This mechanism enables the model to consider and weigh all parts of an input sequence (i.e., a sentence) in relation to one another simultaneously.
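To make the idea concrete, here is a minimal sketch of scaled dot-product self-attention in plain NumPy. The function name self_attention and the toy dimensions (4 tokens, model width 8) are illustrative assumptions, not code from the book; the computation itself, softmax(Q K^T / sqrt(d_k)) V, follows “Attention Is All You Need”.

import numpy as np

def self_attention(X, W_q, W_k, W_v):
    # Project every token vector into query, key, and value spaces.
    Q = X @ W_q
    K = X @ W_k
    V = X @ W_v
    d_k = K.shape[-1]
    # Pairwise relevance score of every token to every other token.
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output vector is a weighted mix of all value vectors.
    return weights @ V

# Toy usage (hypothetical sizes): 4 tokens, model width 8.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)
print(out.shape)  # (4, 8): one contextualized vector per token

Because the score matrix relates every token to every other token at once, the whole sequence is processed in parallel rather than step by step; the division by sqrt(d_k) keeps the dot products from growing with the key dimension, which would otherwise push the softmax toward saturated, hard-to-train regions.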