A neural network architecture built from stacked blocks, each combining a (self-)attention layer with a position-wise feed-forward layer, typically wrapped in residual connections and layer normalization.

It serves as the backbone of most modern LLMs.
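A minimal sketch of one such block, as a rough illustration of the structure described above. This is a single-head, NumPy-only toy (all weight names and dimensions are illustrative, and layer normalization is omitted for brevity), not a production implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def transformer_block(x, Wq, Wk, Wv, Wo, W1, b1, W2, b2):
    """One simplified Transformer block: self-attention + position-wise FFN.

    x: (seq_len, d_model). Residual connections included;
    layer norm omitted to keep the sketch short.
    """
    # Self-attention: each position attends to every position.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])     # (seq_len, seq_len)
    attn = softmax(scores) @ v @ Wo
    x = x + attn                                # residual connection
    # Position-wise FFN: the same MLP applied independently at each position.
    ffn = np.maximum(0.0, x @ W1 + b1) @ W2 + b2  # ReLU MLP
    return x + ffn                              # residual connection

# Illustrative usage with random weights.
rng = np.random.default_rng(0)
d_model, d_ff, seq_len = 8, 32, 5
x = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv, Wo = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))
W1 = rng.normal(size=(d_model, d_ff)) * 0.1
b1 = np.zeros(d_ff)
W2 = rng.normal(size=(d_ff, d_model)) * 0.1
b2 = np.zeros(d_model)
out = transformer_block(x, Wq, Wk, Wv, Wo, W1, b1, W2, b2)
```

Real implementations use multiple attention heads and stack many such blocks, but the attention-then-FFN pattern is the same.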