P(i) = exp(s_i / T) / Σ_j exp(s_j / T)
Boltzmann–Gibbs form with score s_i ≡ −ε_i / k_B. As T → 0⁺, mass concentrates on argmax_i s_i. As T → ∞, P → uniform, with P(i) = 1/n.
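A minimal Python sketch of the formula above, assuming the scores arrive as a plain list of floats. The function name `softmax_t` is hypothetical. Subtracting the maximum score before exponentiating is a common numerical-stability step that is not part of the stated formula; it leaves P unchanged because the shift cancels between numerator and denominator.

```python
import math

def softmax_t(scores, T=1.0):
    """Return P(i) = exp(s_i / T) / sum_j exp(s_j / T) for T > 0."""
    if T <= 0:
        raise ValueError("temperature T must be positive")
    m = max(scores)
    # Shifting by the max leaves P unchanged but avoids overflow at small T.
    weights = [math.exp((s - m) / T) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]
```

Calling `softmax_t([1.0, 2.0, 3.0], T=0.01)` puts nearly all mass on the last entry, while a very large `T` gives values close to 1/3 each, matching the two limits stated above.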
Scores / logits si
Probability distribution P(i)
Temperature T (log scale): 0.01 (peaked), 0.1, 1 (standard softmax), 10, 100 (uniform)
Normalized entropy
H(P) / log n, where H(P) = −Σ_i P(i) log P(i); 0 = peaked, 1 = uniform
Max probability
max_i P(i) = P(argmax_i s_i)
Perplexity
exp(H(P)), the effective number of states
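The three readouts can be computed from any probability vector. The sketch below is one way to do it, assuming natural logarithms (entropy in nats) and the convention 0 log 0 = 0; the names `entropy` and `summarize` are hypothetical, and returning 0 for the normalized entropy when n = 1 is a choice made here to avoid dividing by log 1 = 0.

```python
import math

def entropy(p):
    """Shannon entropy H(P) = -sum_i P(i) log P(i) in nats, with 0 log 0 = 0."""
    return sum(-x * math.log(x) for x in p if x > 0)

def summarize(p):
    """Return (normalized entropy, max probability, perplexity) for P."""
    n = len(p)
    h = entropy(p)
    # log n is the entropy of the uniform distribution over n states.
    h_norm = h / math.log(n) if n > 1 else 0.0
    return h_norm, max(p), math.exp(h)
```

For a uniform distribution over four states, `summarize([0.25] * 4)` gives a normalized entropy of about 1, a max probability of 0.25, and a perplexity of about 4. For a one-hot distribution the normalized entropy is 0 and the perplexity is 1.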

© 2026 Theodore P. Pavlic — MIT License