Softmax Temperature Explorer — interactive explainer

P(i) = exp(s_i / T) / ∑_j exp(s_j / T)

Boltzmann–Gibbs form · logit s_i = −E_i · T→0: argmax s_i · T→∞: uniform
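The formula and its two limits can be checked numerically. The following is a minimal sketch, not the explainer's own implementation: the function name `softmax_t` and the example scores are assumptions chosen for illustration. It subtracts the largest scaled score before exponentiating, which leaves P(i) unchanged but avoids overflow at small T.

```python
import math

def softmax_t(scores, temperature=1.0):
    """Temperature-scaled softmax: P(i) = exp(s_i / T) / sum_j exp(s_j / T)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    shift = max(scaled)  # subtracting the max cancels in the ratio
    weights = [math.exp(x - shift) for x in scaled]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example scores, not taken from the explainer.
scores = [2.0, 1.0, 0.0]
for t in (0.01, 1.0, 100.0):
    print(t, softmax_t(scores, t))
```

At T = 0.01 nearly all the mass lands on the largest score, and at T = 100 the three probabilities are close to 1/3 each.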

Logit view · Energy view

Scores / logits s_i

Probability distribution P(i)

Temperature T — log scale

T = 1.00 · standard softmax

T = 0.01 (peaked) · 0.1 · T = 1 (standard) · 10 · T = 100 (uniform)
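A log-scale control places 0.01, 0.1, 1, 10, and 100 at evenly spaced positions. As a rough sketch of one way to do this (the function name, the [0, 1] position convention, and the default range are assumptions, not the explainer's actual code), the position can be interpolated linearly in log10(T):

```python
import math

def slider_to_temperature(position, t_min=0.01, t_max=100.0):
    """Map a slider position in [0, 1] to a temperature on a log scale."""
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie in [0, 1]")
    log_min = math.log10(t_min)
    log_max = math.log10(t_max)
    # Linear interpolation in log10(T), so each decade gets equal width.
    return 10.0 ** (log_min + position * (log_max - log_min))

for pos in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(pos, slider_to_temperature(pos))
```

With these defaults the midpoint of the slider corresponds to T = 1, the standard softmax.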

Continuous Boltzmann landscape · exp(−E / T) = exp(s / T)

The curve shows the Boltzmann weight exp(−E / T) = exp(s / T), normalized so the lowest-energy state shown sits at 1. Dots mark each discrete state at E_i = −s_i. Actual P(i) requires dividing by the partition function Z = ∑_j exp(−E_j / T).
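The relationship between the plotted weights and the actual probabilities can be sketched as follows. This is an illustrative example under assumed names (`boltzmann_weights`, the sample scores), not the page's own code. Rescaling so the lowest-energy state sits at 1 multiplies every weight, and therefore Z, by the same constant, so the probabilities are unaffected.

```python
import math

def boltzmann_weights(energies, temperature):
    """Relative weights exp(-(E_i - E_min) / T); the lowest-energy state gets 1."""
    e_min = min(energies)
    return [math.exp(-(e - e_min) / temperature) for e in energies]

# Hypothetical scores; energies follow the convention E_i = -s_i.
scores = [2.0, 1.0, 0.0]
energies = [-s for s in scores]

weights = boltzmann_weights(energies, temperature=1.0)
z_rescaled = sum(weights)  # partition function in the same rescaled units
probs = [w / z_rescaled for w in weights]

print("weights:", weights)
print("P(i):   ", probs)
```

The weights are what the curve displays; only after dividing by their sum do they add up to 1.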

Normalized entropy

H(p) / log n · 0 peaked, 1 uniform

Max probability

P(argmax s_i)

Perplexity

exp(H) · effective number of states
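The three readouts (normalized entropy, max probability, perplexity) can all be computed from a probability vector. The sketch below assumes natural logarithms and uses a hypothetical helper name, `distribution_stats`; terms with p = 0 are skipped, following the usual convention that 0 · log 0 = 0.

```python
import math

def distribution_stats(probs):
    """Return normalized entropy, max probability, and perplexity of probs."""
    n = len(probs)
    # Shannon entropy in nats; zero-probability states contribute nothing.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    normalized = entropy / math.log(n) if n > 1 else 0.0
    return {
        "normalized_entropy": normalized,   # H(p) / log n
        "max_probability": max(probs),      # P at the largest score
        "perplexity": math.exp(entropy),    # exp(H)
    }

uniform = distribution_stats([0.25, 0.25, 0.25, 0.25])
peaked = distribution_stats([1.0, 0.0, 0.0, 0.0])
print(uniform)
print(peaked)
```

For the uniform case the normalized entropy is 1 and the perplexity equals the number of states; for the fully peaked case the normalized entropy is 0 and the perplexity is 1.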