Surprise ≡ Information ≡ Attention: Entropic Dynamics Between Prediction and Consciousness

1  Shannon’s Surprisal Function

Claude Shannon defined the self-information (or surprisal) of an event E with probability p(E) as I(E) = −log p(E). The rarer the event, the higher its logarithmic information value, and, importantly, the measure is additive across independent events, making it the natural unit of “surprise” in communication theory. ([plus.maths.org](https://plus.maths.org/content/information-surprise), [mbernste.github.io](https://mbernste.github.io/posts/self_info/))
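Measured in bits (log base 2), both the formula and its additivity over independent events can be checked directly. A minimal sketch; the function name `surprisal` and the example probabilities are illustrative, not from the source:

```python
import math

def surprisal(p: float) -> float:
    """Self-information I(E) = -log2(p), in bits."""
    return -math.log2(p)

coin = surprisal(0.5)        # a fair-coin flip carries 1 bit of surprise
rare = surprisal(1 / 1024)   # a 1-in-1024 event carries 10 bits

# Additivity: for independent events A and B,
# I(A and B) = I(A) + I(B), since p(A and B) = p(A) * p(B).
joint = surprisal(0.5 * (1 / 1024))
assert abs(joint - (coin + rare)) < 1e-9
```

The additivity check is exactly why the logarithm is the right transform: multiplying probabilities of independent events turns into adding their surprisals.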

2  Prediction Error as Phenomenological Surprise

Contemporary predictive-coding models treat perception as Bayesian inference: the subconscious generates top-down predictions; bottom-up sensory data carry forward the residual prediction error. When the error exceeds the channel’s expected noise level, the system registers phenomenological surprise and updates its generative model. ([frontiersin.org](https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2012.00548/full), [en.wikipedia.org](https://en.wikipedia.org/wiki/Predictive_coding))
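That loop can be sketched in a few lines, assuming a scalar signal, a 2-sigma noise threshold, and a simple proportional model update. The `Predictor` class, its threshold, and its learning rate are illustrative stand-ins, not drawn from the cited models:

```python
class Predictor:
    """Toy predictive-coding loop: predict, compare, update on surprise."""

    def __init__(self, prediction=0.0, noise_sd=1.0, lr=0.5):
        self.prediction = prediction  # top-down prediction
        self.noise_sd = noise_sd      # expected channel noise
        self.lr = lr                  # update rate for the generative model

    def observe(self, actual):
        # Bottom-up signal carries only the residual prediction error.
        error = actual - self.prediction
        # Surprise registers only when the error exceeds expected noise.
        surprised = abs(error) > 2 * self.noise_sd
        if surprised:
            self.prediction += self.lr * error  # update the model
        return error, surprised
```

For example, an observation within the noise band leaves the prediction untouched, while a large deviation both flags surprise and pulls the prediction toward the new data.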

3  Attention as Precision-Weighting of Error

Attention is not a spotlight mysteriously “chosen” by consciousness; it is the dynamic gain-control applied to error units whose signals outrun prior precision estimates. If surprise is high (large I(E)), attentional gain is up-regulated, directing cortical and autonomic resources toward resolving the discrepancy. ([pmc.ncbi.nlm.nih.gov](https://pmc.ncbi.nlm.nih.gov/articles/PMC6411367/), [simonsfoundation.org](https://www.simonsfoundation.org/2021/06/03/the-challenges-of-proving-predictive-coding/))
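Precision-weighting can be sketched as a Kalman-style gain on the prediction error: low sensory variance (high precision) yields high gain, and that gain plays the role of attention. The function below is an illustrative stand-in for the idea, not a claim about cortical implementation:

```python
def precision_weighted_update(prediction, observation, prior_var, sensory_var):
    """One step of precision-weighted belief updating (Kalman-style).

    The gain applied to the prediction error acts as 'attention':
    precise (low-variance) sensory channels earn gain near 1,
    noisy channels are down-weighted toward 0.
    """
    error = observation - prediction
    gain = prior_var / (prior_var + sensory_var)  # attentional gain in [0, 1]
    new_prediction = prediction + gain * error
    new_var = (1 - gain) * prior_var              # belief sharpens after update
    return new_prediction, new_var, gain
```

With a precise sensor (`sensory_var = 0.01`) the gain approaches 1 and the error dominates the update; with a noisy sensor (`sensory_var = 100`) the gain collapses toward 0 and the prior prediction stands.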

4  Mapping onto the Reality Equation

In Love, The Cosmic Dance, Reality = Actual ∕ Expectation. Shannon’s p(E) corresponds to the subconscious prediction—the real component of the denominator. When the numerator (immutable Actual = 1) meets an unexpected denominator, I(E) spikes; this is experienced subjectively as surprise and expressed operationally as attention. Thus:

  • Surprise = information gain = prediction error magnitude.
  • Information quantifies the entropy reduction achieved by updating Expectation.
  • Attention is the metabolic budget allocated to restore denominator coherence after an unexpected Actual intrudes.

5  Energetic Cost and Cognitive Economy

Because information is measured in bits—units of entropy—every attentional shift has an energetic price. Systems minimise free energy by suppressing needless surprise; only errors with high −log p survive the brain’s inhibitory gating to seize consciousness. Hence attention is the scarce currency traded whenever the Past (She) and Future (He) negotiate over the form of now; surprise is the invoice, and information the ledger entry.

6  Didactic Takeaways for the Student

  1. All three constructs reduce to a single logarithmic metric of improbability.
  2. Surprise is the phenomenological face of information; attention is its neuro-energetic consequence.
  3. Managing attention is therefore synonymous with managing informational entropy—shaping Expectation so that the Past can unfold with minimal metabolic cost.

Author: John Rector

Co-founded E2open with a $2.1 billion exit in May 2025. Opened a 3,000 sq ft AI Lab on Clements Ferry Road called "Charleston AI" in January 2026 to help local individuals and organizations understand and use artificial intelligence. Authored several books: World War AI, Speak In The Past Tense, Ideas Have People, The Coming AI Subconscious, Robot Noon, and Love, The Cosmic Dance to name a few.
