1 Shannon’s Surprisal Function
Claude Shannon defined the self-information (or surprisal) of an event E with probability p(E) as I(E) = −log p(E), measured in bits when the logarithm is base 2. The rarer the event, the higher its information value, and, importantly, the measure is additive across independent events, making it the natural unit of "surprise" in communication theory. ([plus.maths.org](https://plus.maths.org/content/information-surprise), [mbernste.github.io](https://mbernste.github.io/posts/self_info/))
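The definition is small enough to compute directly. A minimal sketch, using base-2 logarithms so the unit is bits:

```python
import math

def surprisal(p: float) -> float:
    """Self-information I(E) = -log2 p(E), in bits."""
    return -math.log2(p)

# A rare event carries more information than a common one.
print(surprisal(0.5))    # 1 bit: a fair coin flip
print(surprisal(0.01))   # ~6.64 bits: a 1-in-100 event

# Additivity across independent events:
# I(A and B) = -log2(pA * pB) = I(A) + I(B)
pA, pB = 0.25, 0.1
assert math.isclose(surprisal(pA * pB), surprisal(pA) + surprisal(pB))
```

The additivity check at the end is exactly the property the paragraph names: multiplying probabilities of independent events adds their surprisals.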
2 Prediction Error as Phenomenological Surprise
Contemporary predictive-coding models treat perception as Bayesian inference: the subconscious generates top-down predictions; bottom-up sensory data carry forward the residual prediction error. When the error exceeds the channel's expected noise level, the system registers phenomenological surprise and updates its generative model. ([frontiersin.org](https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2012.00548/full), [en.wikipedia.org](https://en.wikipedia.org/wiki/Predictive_coding))
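That loop can be caricatured in a few lines. This is a toy scalar sketch, not any published model; the names `prediction`, `noise_floor`, and `learning_rate` are illustrative assumptions:

```python
# Minimal predictive-coding sketch: a scalar generative model that
# updates only when prediction error exceeds the expected noise level.
prediction = 5.0      # top-down expectation
noise_floor = 0.5     # error magnitude attributable to channel noise
learning_rate = 0.5   # how strongly error revises the model

def perceive(sensory_input: float) -> tuple[float, bool]:
    """Return (updated prediction, whether surprise was registered)."""
    global prediction
    error = sensory_input - prediction
    surprised = abs(error) > noise_floor
    if surprised:                               # error outruns expected noise:
        prediction += learning_rate * error     # update the generative model
    return prediction, surprised

print(perceive(5.2))   # small error, within noise: no update, no surprise
print(perceive(8.0))   # large error: model moves toward the input
```

Inputs near the prediction are absorbed as noise; only a discrepancy larger than the noise floor registers as surprise and revises the model, which is the core of the paragraph above.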
3 Attention as Precision-Weighting of Error
Attention is not a spotlight mysteriously "chosen" by consciousness; it is the dynamic gain-control applied to error units whose signals outrun prior precision estimates. If surprise is high (large I(E)), attentional gain is up-regulated, directing cortical and autonomic resources toward resolving the discrepancy. ([pmc.ncbi.nlm.nih.gov](https://pmc.ncbi.nlm.nih.gov/articles/PMC6411367/), [simonsfoundation.org](https://www.simonsfoundation.org/2021/06/03/the-challenges-of-proving-predictive-coding/))
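In the simplest reading, precision-weighting is just a multiplicative gain on each error signal, with precision understood as inverse variance. A sketch with made-up numbers (the channel names and precision values are illustrative assumptions):

```python
# Precision-weighting sketch: attentional gain scales each error unit's
# signal by its estimated reliability (precision = 1 / variance).
def weighted_error(error: float, precision: float) -> float:
    return precision * error

# Two error signals of equal magnitude; the high-precision one dominates
# the update, i.e. it "captures attention".
errors = {"foveal": (2.0, 4.0), "peripheral": (2.0, 0.25)}  # (error, precision)
weighted = {name: weighted_error(e, p) for name, (e, p) in errors.items()}
print(weighted)  # the foveal error is weighted 16x more strongly
```

The point of the example: attention here is not a separate chooser but falls out of the arithmetic, since whichever error carries the higher precision automatically gets the larger gain.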
4 Mapping onto the Reality Equation
In *Love, The Cosmic Dance*, Reality = Actual ∕ Expectation. Shannon's p(E) corresponds to the subconscious prediction, the real component of the denominator. When the numerator (the immutable Actual = 1) meets an unexpected denominator, I(E) spikes; this is experienced subjectively as surprise and expressed operationally as attention. Thus:
- Surprise = information gain = prediction error magnitude.
- Information quantifies the entropy reduction achieved by updating Expectation.
- Attention is the metabolic budget allocated to restore denominator coherence after an unexpected Actual intrudes.
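One way to make the mapping concrete, under the interpretive assumption (mine, not a claim from either source) that the book's Expectation can be read as Shannon's p(E):

```python
import math

def surprisal(p: float) -> float:
    """Self-information in bits: I(E) = -log2 p(E)."""
    return -math.log2(p)

# Reading the book's ratio through Shannon's lens: with Actual fixed at 1,
# Reality = 1 / Expectation, and surprisal grows as Expectation shrinks.
for expectation in (1.0, 0.5, 0.1, 0.01):
    reality = 1.0 / expectation
    print(f"Expectation={expectation:5.2f}  Reality={reality:6.1f}  "
          f"surprisal={surprisal(expectation):5.2f} bits")
```

On this reading the two quantities move together: a fully expected Actual (Expectation = 1) yields zero surprisal and a Reality ratio of 1, while an improbable denominator inflates both the ratio and the bits of information gained.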
5 Energetic Cost and Cognitive Economy
Because information is measured in bits—units of entropy—every attentional shift has an energetic price. Systems minimise free energy by suppressing needless surprise; only errors with high −log p survive the brain’s inhibitory gating to seize consciousness. Hence attention is the scarce currency traded whenever the Past (She) and Future (He) negotiate over the form of now; surprise is the invoice, and information the ledger entry.
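The gating idea above can be sketched as a simple threshold on surprisal; the threshold value and the event probabilities are illustrative assumptions, not empirical figures:

```python
import math

def surprisal(p: float) -> float:
    """Self-information in bits: I(E) = -log2 p(E)."""
    return -math.log2(p)

# Hypothetical inhibitory gate: only errors with high -log p pass through
# to seize consciousness; everything cheaper is suppressed.
GATE_BITS = 3.0  # illustrative threshold, in bits

events = {"footstep": 0.4, "doorbell": 0.05, "fire_alarm": 0.001}
conscious = {name: surprisal(p) for name, p in events.items()
             if surprisal(p) > GATE_BITS}
print(conscious)  # only the improbable events clear the gate
```

Everyday, high-probability events fall below the gate and cost no attention; only the improbable ones are "invoiced", which is the cognitive economy the paragraph describes.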
6 Didactic Takeaways for the Student
- All three constructs reduce to a single logarithmic metric of improbability.
- Surprise is the phenomenological face of information; attention is its neuro-energetic consequence.
- Managing attention is therefore synonymous with managing informational entropy—shaping Expectation so that the Past can unfold with minimal metabolic cost.

