Sampling Temperature
Definition
A parameter that controls the randomness of a language model's output by scaling the probability distribution over tokens.
Temperature is a value (typically between 0 and 2) that adjusts how the model selects the next token. Mechanically, the model's logits are divided by the temperature before the softmax is applied. A temperature of 0 (in practice, greedy decoding) always picks the highest-probability token, yielding deterministic output. Higher temperatures flatten the probability distribution, making less likely tokens more probable and producing more varied, creative output.
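The scaling described above can be sketched in a few lines. This is an illustrative snippet, not any particular library's API; the function and variable names are assumptions for the example.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply softmax.

    Lower temperature sharpens the distribution toward the top token;
    higher temperature flattens it toward uniform.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive; use argmax for 0")
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, 0.5)   # sharper: top token dominates
high = softmax_with_temperature(logits, 1.5)  # flatter: more spread out
print(low, high)
```

With these example logits, the top token's probability is noticeably higher at temperature 0.5 than at 1.5, which is why low temperatures produce more repeatable output.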
For text refinement, lower temperatures (0.1-0.3) are generally preferred because they produce more predictable, consistent output that faithfully preserves the original meaning. Higher temperatures might introduce unwanted variations or hallucinations. The optimal temperature depends on the task — creative writing benefits from higher values, while grammar correction benefits from lower values.