An atomic unit of input data to LLMs. In today’s models, tokens are typically subwords (e.g. a short word, or a chunk of a longer word), but they can also be finer-grained, such as individual bytes or characters.

Tokens are produced as a sequence by passing input text through a tokenizer, which maps the text to token IDs from a fixed vocabulary.
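As a minimal sketch, here is the simplest possible byte-level tokenizer: each UTF-8 byte of the input becomes one token ID (the function names are illustrative, and real subword tokenizers additionally merge frequent byte sequences into larger units):

```python
def byte_tokenize(text: str) -> list[int]:
    """Map text to a sequence of byte-level token IDs (0-255)."""
    return list(text.encode("utf-8"))

def byte_detokenize(token_ids: list[int]) -> str:
    """Invert byte_tokenize: reassemble bytes and decode back to text."""
    return bytes(token_ids).decode("utf-8")

ids = byte_tokenize("Hi!")
print(ids)                    # [72, 105, 33]
print(byte_detokenize(ids))   # Hi!
```

A byte-level vocabulary is tiny (256 IDs) but produces long sequences; subword tokenizers trade a larger vocabulary for shorter sequences.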