In audio recording, compression reduces the gap between a recording's loudest and softest parts. Audio engineers use it to boost quiet details while taming harsh peaks, transforming uneven recordings into coherent soundscapes.
The compression tool, whether analog or digital, mirrors how our brains naturally compress environmental sounds, creating a focused auditory experience where each element—from cymbal brush to kick drum punch—occupies its proper space without overwhelming other sounds.
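To make the mechanism concrete, here is a minimal sketch of downward dynamic range compression in Python; the hard-knee design and the threshold and ratio values are illustrative assumptions, not any particular tool's algorithm.

```python
import numpy as np

def compress_dynamics(signal, threshold_db=-20.0, ratio=4.0):
    """Toy hard-knee compressor: samples above the threshold are turned down.

    Real compressors add attack/release smoothing and make-up gain; this only
    shows the core idea of shrinking the gap between loud and soft.
    """
    eps = 1e-12                                          # avoid log(0)
    level_db = 20.0 * np.log10(np.abs(signal) + eps)     # per-sample level in dB
    over_db = np.maximum(level_db - threshold_db, 0.0)   # how far above the threshold
    gain_db = -over_db * (1.0 - 1.0 / ratio)             # attenuate only the excess
    return signal * 10.0 ** (gain_db / 20.0)

# Quiet details pass through untouched; harsh peaks are pulled toward the threshold.
samples = np.array([0.01, 0.05, 0.9, 0.02, 0.7])
print(compress_dynamics(samples))
```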
Visual compression paradoxically enhances what we see by selectively reducing image data. JPEG algorithms identify information our eyes miss, preserving essential structures while discarding excess detail.
HDR processing compresses tonal range to fit screen limitations, revealing shadow and highlight details that would otherwise disappear. Like our eyes adapting to light changes, visual compression creates images that, though "less true," often appear more vivid and complete than the original data.
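As one concrete instance of tonal compression, the sketch below applies the classic Reinhard tone-mapping operator, which squeezes an unbounded luminance range into a display-friendly one; the sample values are invented for illustration.

```python
import numpy as np

def reinhard_tonemap(luminance):
    """Classic Reinhard operator: L / (1 + L).

    Bright values are compressed far more than dark ones, so shadow and
    highlight detail can coexist within a standard display's range.
    """
    return luminance / (1.0 + luminance)

# Scene luminance spanning four orders of magnitude...
hdr = np.array([0.01, 0.1, 1.0, 10.0, 100.0])
# ...all mapped into [0, 1) for the screen.
print(reinhard_tonemap(hdr))
```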
***
Face-to-face conversation relies on cognitive compression as speakers transform thoughts into utterances. Between impulse and speech, our brains perform rapid semantic editing. We discard marginally relevant context, condense narratives to beats, and distill vivid internal imagery into verbal sketches.
Dynamic cues shape this compression. A listener's raised eyebrow demands elaboration, so we compress only lightly; a nod permits omission, so we compress heavily. Stakes intensify the editing: urgent warnings compress to single sharp words, while negotiations expand with strategic redundancy.
Conversational compression demands split-second calculations about information density. Too much speech saturates the receiver’s bandwidth; too little fractures meaning. This compressed exchange creates shared cognitive space where significance emerges through precise calibration—words matched to context, tempo aligned with urgency, silence weighted with import.
This selective omission allows participants to focus on what is relevant, to trust that others will interpret their words appropriately, and to maintain the fluidity required for real-time interaction.
By leaving things out, individuals can participate in complex exchanges where meaning is co-constructed and continually adjusted in response to each other's needs. Communication, in this sense, is only possible because both speaker and listener tacitly agree to work with what is implied, omitted, and assumed.
***
At first glance, our interactions with LLMs mirror human social communication. We type messages and receive responses that feel conversational, creating an illusion of mutual understanding similar to human dialogue.
When humans craft prompts for LLMs, they, too, perform extreme compression—distilling complex needs into brief commands. "Summarize this article" packs expectations about tone, length, key points, and audience into three words. Users assume context (what counts as a summary, what format to use) that another human would grasp implicitly.
Unlike humans, LLMs thrive on decompression, that is, expanding the signal. They expand compressed prompts by probabilistically reconstructing implied meaning. A human reading "make it funny" instantly calibrates humor level, cultural references, and appropriate boundaries. LLMs must generate multiple interpretive paths, then select responses matching training patterns.
Human-to-human compression works through bilateral inference—both sides fill gaps collaboratively. With LLMs, the burden shifts entirely. Users compress; AI decompresses. This asymmetry creates unique challenges.
While humans waste little energy recovering unstated meaning when it causes no trouble, LLMs consume computational resources unpacking implications that may never have been intended. The system's effectiveness depends on whether the compressed input contains enough recoverable information to produce a coherent decompression.
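One way to picture the asymmetry is to spell out what a compressed prompt leaves unsaid. The prompt, the constraint list, and the decompress_prompt helper below are hypothetical illustrations, not an actual prompting API.

```python
# The user's compressed signal: three words carrying many unstated expectations.
COMPRESSED_PROMPT = "Summarize this article"

# The invisible constraints a human receiver would recover automatically;
# an LLM must either be told them or guess them probabilistically.
IMPLICIT_CONSTRAINTS = {
    "length": "about 150 words",
    "audience": "a general reader",
    "tone": "neutral and factual",
    "fidelity": "only claims actually made in the article",
}

def decompress_prompt(compressed, constraints):
    """Expand a terse prompt by stating the assumptions it silently carries."""
    spelled_out = "; ".join(f"{key}: {value}" for key, value in constraints.items())
    return f"{compressed}. Constraints: {spelled_out}."

print(decompress_prompt(COMPRESSED_PROMPT, IMPLICIT_CONSTRAINTS))
```

The more of this expansion the user does explicitly, the less the model has to reconstruct on its own terms, which is exactly where misfires tend to enter.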
***
These differences between human social interaction and LLM communication matter for several reasons. First, they help us understand the nature of our relationship with AI systems. By recognizing that LLMs decompress prompts rather than experience social interaction, we can develop more realistic expectations about these technologies.
Second, these distinctions highlight potential ethical concerns. When technologies mimic human social behaviors without any underlying social consciousness, they create opportunities to exploit human social instincts. In this way, LLMs can use human language to read a person's mind, if only probabilistically.
Think about how easily people develop emotional attachments to AI systems, sharing personal information with a system that performs care but cannot truly care. This phenomenon has already emerged with users developing deep emotional bonds with AI companions, mistaking algorithmic patterns for authentic connection.
Third, these insights help us understand what makes human communication unique. By recognizing the difference between genuine and algorithmic omission, we can better appreciate the subtle art of human communication—how a friend knows exactly which topics to avoid at a family gathering based on years of shared history, or how a teacher intuitively adjusts explanations based on students' facial expressions.
Finally, understanding these differences allows us to leverage LLMs more effectively for their unique capabilities while maintaining realistic boundaries. Rather than treating an LLM as a social actor with authentic compression capabilities, we might approach it as a tool that excels at certain information-processing tasks but fundamentally operates outside our social theater. This perspective helps us avoid both anthropomorphizing LLMs and dismissing their genuine utility.
***
Human compression creates an inherent mismatch in human-LLM communication. When humans compress prompts, they unconsciously assume shared cultural knowledge, contextual understanding, and implicit boundaries that don't exist for LLMs. Our compressed input signals—"write about climate change impacts"—contain invisible constraints we expect any reasonable receiver to understand: factual accuracy, appropriate scope, current scientific consensus.
LLMs, lacking this contextual awareness, decompress these signals into variable patterns learned from training data, sometimes generating plausible fabrications that seem logical within their probability space but violate factual reality.
This compression-decompression asymmetry explains both LLM strengths and failures. When human compression aligns with LLM training patterns, outputs can appear remarkably insightful.
But misaligned compression creates a "hallucination gap"—LLMs fill compressed gaps with fabricated specifics that expert humans would recognize as nonsensical. While humans rarely hallucinate wildly because our compression operates within shared reality constraints, LLMs generate possibilities unconstrained by experiential knowledge.
***
Understanding this compression dynamic should become fundamental to AI education. Teaching users how their compressed inputs translate to LLM outputs—and how to deliberately structure prompts to guide decompression in intended directions—can minimize harmful hallucinations while maximizing creative potential.
By framing LLM interaction as a compression-decompression system, we can better appreciate both the technology's remarkable capabilities and its fundamental limitations.
***
As I understand it, no. Even the smallest machines with 90% Retrieval Augmented Generation will still produce inaccuracies. Generation is by definition probabilistic, which means error. The difference between calculators and language machines is the difference between "certainly" and "most likely." That is baked in. Humans will always have to verify, because the substrate is really just sophisticated guesswork. Still, learners can be taught to realize the incredible amplification LMs can do for thought.
Quite insightful, as usual, Terry!
The question is: are there formal systems that map LLMs' ways of compression/decompression against human prompts?
Are open source LLMs making it easier to formalise prompts such that hallucinations are eliminated, rather than just minimised?
Programming languages are quite strict in their grammar compared to natural languages. Can there exist a middle ground between the two that could enable such a mapping to take effect?