Greedy Decoding

Definition

A decoding strategy that selects the single most probable token at each time step.

Greedy decoding is the simplest decoding strategy for sequence generation. At each step, the model produces a probability distribution over the next token, conditioned on the tokens generated so far, and the decoder selects the token with the highest probability before moving on. This makes it very fast: there is no need to maintain multiple hypotheses or perform backtracking.
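
The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function toy_next_token_probs, its lookup table, and the "<eos>" end-of-sequence marker are hypothetical stand-ins for a real model's next-token distribution.

```python
from typing import Callable, Dict, List


def toy_next_token_probs(prefix: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a model: maps a prefix to next-token probabilities."""
    table = {
        (): {"the": 0.6, "a": 0.4},
        ("the",): {"cat": 0.5, "dog": 0.3, "<eos>": 0.2},
        ("the", "cat"): {"<eos>": 0.9, "sat": 0.1},
    }
    # Any prefix not in the table simply ends the sequence.
    return table.get(tuple(prefix), {"<eos>": 1.0})


def greedy_decode(
    next_token_probs: Callable[[List[str]], Dict[str, float]],
    max_len: int = 10,
    eos: str = "<eos>",
) -> List[str]:
    """Pick the single most probable token at each step until eos or max_len."""
    tokens: List[str] = []
    for _ in range(max_len):
        probs = next_token_probs(tokens)
        best = max(probs, key=probs.get)  # argmax over the vocabulary
        if best == eos:
            break
        tokens.append(best)
    return tokens


print(greedy_decode(toy_next_token_probs))
```

Only one hypothesis (the tokens list) is kept at any time, which is where the speed and low memory cost come from.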

The downside is that greedy decoding can miss the globally most probable sequence. A locally optimal choice at one step may commit the decoder to a prefix whose best continuation has low probability, leading to a poor overall result. For this reason, beam search is often preferred when output quality matters more than speed, though greedy decoding remains popular for real-time and on-device applications where latency is critical.
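
A small worked example may make the failure mode concrete. The two-step distribution below is invented purely for illustration (the token names and probabilities are assumptions, not taken from any real model): the most probable first token leads to a flat second-step distribution, while the less probable first token has one very likely continuation.

```python
# Hypothetical two-step distribution, chosen to make greedy decoding fail.
FIRST = {"A": 0.6, "B": 0.4}
SECOND = {
    "A": {"x": 0.4, "y": 0.3, "z": 0.3},  # flat: best continuation is only 0.4
    "B": {"x": 0.9, "y": 0.1},            # peaked: best continuation is 0.9
}


def sequence_prob(first: str, second: str) -> float:
    """Joint probability of a two-token sequence under the toy distribution."""
    return FIRST[first] * SECOND[first].get(second, 0.0)


def greedy_sequence() -> tuple:
    """Take the most probable token at each step, never looking ahead."""
    first = max(FIRST, key=FIRST.get)
    second = max(SECOND[first], key=SECOND[first].get)
    return (first, second)


def best_sequence() -> tuple:
    """Exhaustively score every two-token sequence and return the most probable."""
    candidates = [(f, s) for f in FIRST for s in SECOND[f]]
    return max(candidates, key=lambda fs: sequence_prob(*fs))


greedy = greedy_sequence()
best = best_sequence()
print("greedy:", greedy, round(sequence_prob(*greedy), 2))
print("best:  ", best, round(sequence_prob(*best), 2))
```

Here greedy decoding commits to "A" and ends with a joint probability of about 0.24, while the sequence starting with "B" reaches about 0.36. In this toy case a beam search of width 2 would keep both first tokens alive and recover the better sequence, at the cost of scoring more candidates per step.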
