We have always credited evolution to random genetic mutations and natural selection. But what if the brain — the very organ shaped by evolution — has been quietly steering it all along? The more we study neuroplasticity, activity-dependent development, and epigenetics, the harder it becomes to see the brain as just a passive product of evolution. It may be one of its most active drivers. And to truly appreciate how extraordinary this is, it helps to compare it against the most sophisticated learning systems humanity has ever built — Large Language Models.
The Brain That Rewires Itself
Unlike an LLM whose weights freeze after training, the human brain rewires itself continuously throughout life. Every new skill, memory, or habit physically reshapes synaptic connections. This property — neuroplasticity — means the brain is never truly finished. It is a living architecture under constant self-modification, guided by experience rather than a fixed genetic blueprint.
Where LLMs Fundamentally Differ
A Large Language Model trains on vast amounts of data, adjusting billions of numerical weights until patterns stabilise. Then it stops. The weights freeze. Every conversation you have with an LLM after that point runs on the same static snapshot of knowledge it had on the day training ended. It cannot permanently learn from you: anything it appears to pick up mid-conversation lives only in the context window and vanishes when the session ends. In contrast, your brain is processing, revising, and restructuring itself right now, as you read this sentence. The gap between these two systems is not just technical — it is architectural at the deepest level.
The brain is not a record player replaying fixed tracks. It is a composer, rewriting the score every single day.
Superposition, Interference, and Why Bigger Models Aren't the Answer
When LLMs store knowledge, they compress information into high-dimensional vector spaces. Researchers at Anthropic have formally described a phenomenon called the Superposition Hypothesis — the model tries to store more features than it has dimensions, so features get packed into overlapping directions in that space. These overlapping directions interfere with each other, reducing accuracy and causing the model to confuse related concepts. The industry's primary solution has been to scale up — more parameters means more orthogonal dimensions, which means less interference. It works, but at an enormous cost in processing power and energy consumption.
The human brain solved this problem through a completely different architectural decision — sparse coding. At any given moment, only one to five percent of your neurons are actively firing. The rest are silent. This natural sparsity means your brain's active representations almost never overlap, eliminating the interference problem without requiring you to physically scale up your skull. It is a solution that took hundreds of millions of years of evolution to arrive at, and it is fundamentally more elegant than adding more matrix dimensions.
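A toy sketch makes the geometry concrete. This is not Anthropic's actual experiment, just a minimal NumPy illustration of the idea: eight hypothetical "features" are assigned random directions in a three-dimensional space, and reading them back is nearly clean when activity is sparse but full of cross-talk when many features are active at once.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features, n_dims = 8, 3            # more features than dimensions available
directions = rng.normal(size=(n_features, n_dims))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

def readout(active):
    """Superpose the active features, then read each one back by dot product."""
    x = directions[list(active)].sum(axis=0)
    return directions @ x            # recovered strength per feature

dense = readout(range(6))            # many features active at once
sparse = readout([0])                # a single active feature (sparse coding)

print("dense readout :", np.round(dense, 2))   # cross-talk corrupts every entry
print("sparse readout:", np.round(sparse, 2))  # feature 0 reads back cleanly
```

With sparse activity the readout of the active feature is exactly its own norm, and interference from the silent features stays bounded; with dense activity every feature's readout is polluted by the others, which is the superposition problem in miniature.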
The Energy Problem No One Talks About
By some estimates, a modern large language model serving heavy inference traffic can consume as much energy in a day as a small household, and training a frontier model has been compared to the energy of hundreds of transatlantic flights. Your brain — handling vision, language, memory, emotion, movement, and unconscious regulation simultaneously — runs on approximately 20 watts. That is less power than the bulb in a refrigerator. The reason is sparse activation. The brain never fires everything at once. A dense transformer, by contrast, activates essentially all of its parameters for every single token it processes; mixture-of-experts designs reduce this, but only partially. This is not a minor inefficiency — it is a fundamental architectural mismatch that scaling alone will never fix.
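The arithmetic behind that 20-watt claim is worth spelling out. The accelerator wattage below is an assumed round figure for a single modern GPU, not a measurement of any particular system:

```python
# Back-of-envelope comparison; all figures illustrative, not measured
brain_watts = 20
hours = 24
brain_kwh_per_day = brain_watts * hours / 1000   # 0.48 kWh per day

gpu_watts = 700                                  # assumed draw of one accelerator
gpu_kwh_per_day = gpu_watts * hours / 1000

print(f"brain  : {brain_kwh_per_day:.2f} kWh/day")            # prints 0.48
print(f"one GPU: {gpu_kwh_per_day:.2f} kWh/day "              # prints 16.80
      f"(~{gpu_kwh_per_day / brain_kwh_per_day:.0f}x the brain)")  # ~35x
```

A single chip out-consumes the brain roughly 35-fold, and a serving cluster holds thousands of them, which is what makes the per-watt comparison so lopsided.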
The Newborn That Teaches Us Everything
Consider a newborn. Their legs exist at birth, but they are structurally unready — bones soft, muscles undeveloped, hip sockets shallow. As the child begins to crawl and eventually walk, something remarkable happens in parallel. The brain sends movement signals. The bones respond by increasing density and remodelling their internal structure through a principle called Wolff's Law. The hip socket deepens as the femoral head repeatedly pushes against it. The spine shifts from a C-curve to an S-curve specifically because of upright load-bearing. None of this is pre-scheduled by genetics alone. It is triggered by use — by the brain deciding to move.
What Happens When the Brain Never Sends the Signal?
Documented cases of children raised in severe neglect or without normal movement opportunities reveal something sobering. Their underdevelopment is not limited to motor skills. Bone geometry, muscle fibre composition, and spinal structure remain in juvenile states. The body waited for instructions from the brain that never came. This suggests that physical maturation is not purely a genetic clock — it is activity-gated, dependent on neural signals to proceed.
An LLM Has No Body, No Feedback, No Stakes
This is where the comparison becomes philosophically uncomfortable. An LLM has never felt the resistance of the ground. It has never experienced the consequence of falling. It has no proprioception, no pain, no hunger that makes certain information feel urgent and worth remembering. The brain's learning is grounded in embodied consequence — information that matters to your survival gets encoded deeply, redundantly, emotionally. Information that doesn't matter fades. An LLM treats a recipe for pasta and the physics of a nuclear reaction with the same flat statistical weight, because nothing has ever been at stake for it. This is not a hardware limitation. It is a missing dimension of existence.
Forgetting as a Feature, Not a Bug
Most people think of forgetting as a failure of memory. Neuroscience suggests the opposite. The brain actively and deliberately forgets. Synaptic pruning deletes weak, unused connections — heavily during adolescence, lightly throughout adulthood. During sleep, a process described by the Synaptic Homeostasis Hypothesis globally weakens synapses, resetting the signal-to-noise ratio so only the strongest memories survive into the next day. Forgetting is compression. It is the brain curating its own model to keep it sharp, relevant, and interference-free.
LLMs have no equivalent of this process. Once a weight is set during training, it does not decay from disuse. The model cannot identify which parts of its knowledge are stale, rarely needed, or actively causing interference with more important knowledge. Techniques like dropout and weight decay during training approximate this idea, but they are applied blindly rather than selectively. The brain prunes with surgical precision based on what actually gets used. LLMs are pruned by an engineer guessing at regularisation hyperparameters.
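A schematic sketch of use-dependent pruning, with loudly invented numbers (the strengthening increment, the downscaling factor, and the pruning threshold are all arbitrary): frequently used connections survive the sleep-time downscaling, while rarely used ones fall below threshold and are deleted.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 12
weights = np.full(n, 0.5)                  # uniform starting synaptic strengths
use_prob = np.linspace(0.05, 0.95, n)      # some synapses fire far more often

# Waking experience: synapses that fire get strengthened (Hebbian-style)
for _ in range(200):
    fired = rng.random(n) < use_prob
    weights[fired] += 0.01

# Sleep: global downscaling, in the spirit of the Synaptic Homeostasis Hypothesis
weights *= 0.5

# Pruning: connections that fell below threshold are deleted outright
weights[weights < 0.4] = 0.0

print(f"{np.count_nonzero(weights)} of {n} synapses survive; "
      "the rest were pruned for disuse")
```

The selection criterion here is actual usage, which is exactly what a frozen weight matrix cannot supply: dropout and weight decay act on all parameters indiscriminately, with no record of which ones earn their keep.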
The brain does not forget because it is weak. It forgets because it is wise enough to know what no longer deserves to be remembered.
Memory Corruption, the Mandela Effect, and Shared Architecture
The brain's active reconstruction of memory has a well-documented side effect — false memories. Every time you recall a memory, you re-save it with slight modifications coloured by your current emotional state. Over years, a memory can drift significantly from its original form. This process is called memory reconsolidation, and it is a leading candidate explanation for the Mandela Effect — the widespread phenomenon where large groups of people share the same specific false memory.
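Reconsolidation drift can be caricatured as a biased random walk. The bias and noise magnitudes below are invented for illustration; the point is only that a small systematic push at each re-save compounds into a large departure from the original encoding.

```python
import numpy as np

rng = np.random.default_rng(2)

memory = 1.0            # the event as originally encoded (arbitrary scalar)
mood_bias = 0.02        # tiny systematic push at each recall (assumed value)

for _ in range(50):     # fifty recalls spread over the years
    # Re-save: previous trace + current-state bias + a little random noise
    memory = memory + mood_bias + rng.normal(0.0, 0.01)

print(f"encoded as 1.00, now remembered as {memory:.2f}")
```

Each individual re-save is imperceptible, yet after fifty recalls the stored value has roughly doubled, which is why a decades-old memory can feel vivid and still be wrong.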
What makes this fascinating is that the errors are not random. Thousands of people converge on the same wrong answer because they share the same semantic architecture. The word cartoon primes the wrong spelling of Looney Tunes. The word breeze pulls Febreze toward a second E. These are predictable corruption patterns emerging from a shared cognitive operating system — millions of brains running the same inference on the same cultural input and producing the same systematic error. It is less like individual memory failure and more like a distributed bug in a shared model version.
And Some People Are Simply More Resistant
Not everyone experiences the same Mandela Effects. People with stronger synaptic encoding — those who processed the original information with deeper attention, emotional anchoring, or multi-modal context — appear more resistant to reconsolidation drift. Their memories carry higher signal strength, so the noise from social reinforcement and semantic bias does not reach the threshold needed to overwrite them. This is less a personality trait than a direct consequence of how richly a memory was encoded at the moment of formation — which brings us back to embodiment, stakes, and the advantage the brain holds over any statistical model.
The Baldwin Effect — When Learning Steers Evolution
Traditional evolutionary theory treats the brain as a downstream product — shaped by genes, selected by environment. The Baldwin Effect proposes a more interesting relationship. When a species learns a new behaviour, that behaviour reshapes the survival landscape. Over generations, genetic mutations that make that behaviour easier get selected for. What was once laboriously learned becomes gradually hardwired. The brain does not rewrite DNA directly — but by changing what behaviours are possible, it changes what traits evolution rewards.
- A species learns a new survival behaviour through brain-driven adaptation
- That behaviour creates new environmental pressures and advantages
- Genetic mutations that support the behaviour more efficiently are selected
- Over generations, the learned becomes the instinctive
- The brain has effectively guided its own evolutionary redesign
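The genetic-assimilation step of the list above fits in a deterministic toy model. Everything here is schematic: one allele is assumed to hardwire the behaviour, while carriers of the other allele can still learn it but pay a small fitness cost each generation.

```python
# Toy model of genetic assimilation, one ingredient of the Baldwin Effect.
# Learners still perform the behaviour, but pay a cost c (time, errors, energy).
c = 0.10
w_hardwired, w_learner = 1.0, 1.0 - c

p = 0.01                        # initial frequency of the hardwiring allele
for generation in range(200):
    mean_w = p * w_hardwired + (1 - p) * w_learner
    p = p * w_hardwired / mean_w        # standard replicator update

print(f"hardwiring allele frequency after 200 generations: {p:.3f}")  # 1.000
```

Even a modest learning cost drives the hardwiring allele from one percent to fixation in a few hundred generations: the learned becomes the instinctive, with no change to the behaviour itself.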
LLMs Cannot Do This — And It May Be the Ceiling of Scaling
No LLM influences the conditions of its own next training run. It cannot notice that a certain type of reasoning causes it to fail, flag that pattern, and ensure the next version is trained to handle it differently. Humans do this instinctively — we identify our own cognitive weaknesses, develop workarounds, teach those workarounds to our children, and over generations those workarounds become cognitive infrastructure. This is part of why researchers such as Yann LeCun and Gary Marcus argue that scaling transformers alone is architecturally insufficient for AGI. The system has no mechanism for steering its own evolution. It simply waits to be retrained by engineers.
Epigenetics — The Brain's Legacy Beyond One Lifetime
Perhaps the most striking evidence comes from epigenetics. Experience — stress, physical activity, trauma, nutrition — can alter which genes get switched on or off without changing the DNA sequence itself. Some of these changes are heritable. Studies have shown that the effects of a grandmother's famine can measurably influence a grandchild's metabolism. The brain's lived experience leaves a biological legacy that can echo across multiple generations, quietly nudging the evolutionary trajectory of a lineage.
Your DNA is the hardware. But your brain's experiences may be quietly rewriting which parts of that hardware future generations are born running.
What Would It Take for AI to Catch Up?
Researchers working on neuromorphic computing — hardware that mimics sparse, event-driven neural firing rather than brute-force matrix multiplication — are building chips like Intel's Loihi and IBM's NorthPole that begin to close the energy and architecture gap. Work on organoid intelligence, where lab-grown brain cells are used as literal computing substrates, pushes the boundary even further. These are not science fiction projects — they are peer-reviewed research programmes. But they remain far behind today's transformer-based LLMs in practical usability.
The harder problem is continual learning — the ability to acquire new knowledge without overwriting old knowledge. Your brain does this effortlessly. You learned your friend's new phone number this morning without forgetting your own name. If you naively fine-tune an LLM on new data, it overwrites portions of what it previously knew. This failure mode is called catastrophic forgetting, and it remains one of the most significant unsolved problems in machine learning. The brain solved it through architectural decisions — separate memory systems, sparse activation, sleep-based consolidation — that the transformer architecture simply does not have.
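Catastrophic forgetting is easy to reproduce in miniature. The two "tasks" below are invented single-example regression problems sharing one linear model; training sequentially on the second visibly destroys performance on the first.

```python
import numpy as np

# A single linear "model" trained with plain SGD, first on task A, then task B
task_a = (np.array([1.0, 0.0]), 1.0)    # (input, target)
task_b = (np.array([1.0, 1.0]), 0.0)

def train(w, x, y, steps=200, lr=0.1):
    for _ in range(steps):
        w = w - lr * (w @ x - y) * x    # gradient step on squared error
    return w

def error(w, x, y):
    return abs(w @ x - y)

w = train(np.zeros(2), *task_a)
before = error(w, *task_a)              # essentially zero: task A is mastered
w = train(w, *task_b)                   # sequential training on the new task...
after = error(w, *task_a)               # ...has overwritten the old solution

# prints: task A error: 0.000 before, 0.500 after learning task B
print(f"task A error: {before:.3f} before, {after:.3f} after learning task B")
```

Nothing told the optimiser that the weights it moved were load-bearing for task A; without separate memory systems or sparse, non-overlapping representations, every update is free to trample prior knowledge.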
A New Way to See the Relationship
Piece it all together and a different picture of both evolution and intelligence emerges. Brain activity shapes body structure within a single lifetime. Repeated behaviour across a population shapes what traits get selected across generations. Epigenetic markers carry the memory of experience forward in time. The brain is not simply what evolution produced — it is an active participant in directing where evolution goes next. And the more honestly we compare it to our best artificial systems, the clearer it becomes that what we have built so far, however impressive, is not on the same architectural page.
The Uncomfortable Conclusion
Scaling language models will continue to produce useful, impressive, commercially valuable systems. But if the brain is genuinely steering evolution — through activity-dependent development, behavioural selection pressure, epigenetic inheritance, and embodied feedback loops — then the path to something that truly rivals it does not run through adding more parameters. It runs through rethinking the architecture entirely. The brain has been placing that order for hundreds of millions of years. We are still reading the menu.
We have always credited evolution to random genetic mutations and natural selection. But what if the brain — the very organ shaped by evolution — has been quietly steering it all along? The more we study neuroplasticity, activity-dependent development, and epigenetics, the harder it becomes to see the brain as just a passive product of evolution. It may be one of its most active drivers. And to truly appreciate how extraordinary this is, it helps to compare it against the most sophisticated learning systems humanity has ever built — Large Language Models.
The Brain That Rewires Itself
Unlike an LLM whose weights freeze after training, the human brain rewires itself continuously throughout life. Every new skill, memory, or habit physically reshapes synaptic connections. This property — neuroplasticity — means the brain is never truly finished. It is a living architecture under constant self-modification, guided by experience rather than a fixed genetic blueprint.
Where LLMs Fundamentally Differ
A Large Language Model trains on vast amounts of data, adjusting billions of numerical weights until patterns stabilise. Then it stops. The weights freeze. Every conversation you have with an LLM after that point runs on the same static snapshot of knowledge it had on the day training ended. It cannot learn from you. It cannot update itself based on new experience. In contrast, your brain is processing, revising, and restructuring itself right now, as you read this sentence. The gap between these two systems is not just technical — it is architectural at the deepest level.
The brain is not a record player replaying fixed tracks. It is a composer, rewriting the score every single day.
Superposition, Interference, and Why Bigger Models Aren't the Answer
When LLMs store knowledge, they compress information into high-dimensional vector spaces. Researchers at Anthropic have formally described a phenomenon called the Superposition Hypothesis — the model tries to store more features than it has dimensions, so features get packed into overlapping directions in that space. These overlapping directions interfere with each other, reducing accuracy and causing the model to confuse related concepts. The industry's primary solution has been to scale up — more parameters means more orthogonal dimensions, which means less interference. It works, but at an enormous cost in processing power and energy consumption.
The human brain solved this problem through a completely different architectural decision — sparse coding. At any given moment, only one to five percent of your neurons are actively firing. The rest are silent. This natural sparsity means your brain's active representations almost never overlap, eliminating the interference problem without requiring you to physically scale up your skull. It is a solution that took hundreds of millions of years of evolution to arrive at, and it is fundamentally more elegant than adding more matrix dimensions.
The Energy Problem No One Talks About
A modern large language model running inference can consume as much energy as a small household over the course of a day. Training a frontier model consumes energy comparable to hundreds of transatlantic flights. Your brain — handling vision, language, memory, emotion, movement, and unconscious regulation simultaneously — runs on approximately 20 watts. That is less power than the bulb in a refrigerator. The reason is sparse activation. The brain never fires everything at once. LLMs, by contrast, activate the majority of their parameters for every single token they process. This is not a minor inefficiency — it is a fundamental architectural mismatch that scaling alone will never fix.
The Newborn That Teaches Us Everything
Consider a newborn. Their legs exist at birth, but they are structurally unready — bones soft, muscles undeveloped, hip sockets shallow. As the child begins to crawl and eventually walk, something remarkable happens in parallel. The brain sends movement signals. The bones respond by increasing density and remodelling their internal structure through a principle called Wolff's Law. The hip socket deepens as the femoral head repeatedly pushes against it. The spine shifts from a C-curve to an S-curve specifically because of upright load-bearing. None of this is pre-scheduled by genetics alone. It is triggered by use — by the brain deciding to move.
An LLM Has No Body, No Feedback, No Stakes
This is where the comparison becomes philosophically uncomfortable. An LLM has never felt the resistance of the ground. It has never experienced the consequence of falling. It has no proprioception, no pain, no hunger that makes certain information feel urgent and worth remembering. The brain's learning is grounded in embodied consequence — information that matters to your survival gets encoded deeply, redundantly, emotionally. Information that doesn't matter fades. An LLM treats a recipe for pasta and the physics of a nuclear reaction with the same flat statistical weight, because nothing has ever been at stake for it. This is not a hardware limitation. It is a missing dimension of existence.
Forgetting as a Feature, Not a Bug
Most people think of forgetting as a failure of memory. Neuroscience suggests the opposite. The brain actively and deliberately forgets. Synaptic pruning deletes weak, unused connections — heavily during adolescence, lightly throughout adulthood. During sleep, a process described by the Synaptic Homeostasis Hypothesis globally weakens synapses, resetting the signal-to-noise ratio so only the strongest memories survive into the next day. Forgetting is compression. It is the brain curating its own model to keep it sharp, relevant, and interference-free.
LLMs have no equivalent of this process. Once a weight is set during training, it does not decay from disuse. The model cannot identify which parts of its knowledge are stale, rarely needed, or actively causing interference with more important knowledge. Techniques like dropout and weight decay during training approximate this idea, but they are applied blindly rather than selectively. The brain prunes with surgical precision based on what actually gets used. LLMs are pruned by an engineer guessing at regularisation hyperparameters.
The brain does not forget because it is weak. It forgets because it is wise enough to know what no longer deserves to be remembered.
Memory Corruption, the Mandela Effect, and Shared Architecture
The brain's active reconstruction of memory has a well-documented side effect — false memories. Every time you recall a memory, you re-save it with slight emotional modifications based on your current state. Over years, a memory can drift significantly from its original form. This is called memory reconsolidation, and it is the neurological engine behind the Mandela Effect — the widespread phenomenon where large groups of people share the same specific false memory.
What makes this fascinating is that the errors are not random. Thousands of people converge on the same wrong answer because they share the same semantic architecture. The word cartoon primes the wrong spelling of Looney Tunes. The word breeze pulls Febreze toward a second E. These are predictable corruption patterns emerging from a shared cognitive operating system — millions of brains running the same inference on the same cultural input and producing the same systematic error. It is less like individual memory failure and more like a distributed bug in a shared model version.
And Some People Are Simply More Resistant
Not everyone experiences the same Mandela Effects. People with stronger synaptic encoding — those who processed the original information with deeper attention, emotional anchoring, or multi-modal context — are more resistant to reconsolidation drift. Their memories have higher signal strength, so the noise from social reinforcement and semantic bias does not reach the threshold needed to overwrite them. This is not a personality trait. It is a direct consequence of how richly a memory was encoded at the moment of formation — which brings us back to embodiment, stakes, and the advantage the brain holds over any statistical model.
The Baldwin Effect — When Learning Steers Evolution
Traditional evolutionary theory treats the brain as a downstream product — shaped by genes, selected by environment. The Baldwin Effect proposes a more interesting relationship. When a species learns a new behaviour, that behaviour reshapes the survival landscape. Over generations, genetic mutations that make that behaviour easier get selected for. What was once laboriously learned becomes gradually hardwired. The brain does not rewrite DNA directly — but by changing what behaviours are possible, it changes what traits evolution rewards.
- A species learns a new survival behaviour through brain-driven adaptation
- That behaviour creates new environmental pressures and advantages
- Genetic mutations that support the behaviour more efficiently are selected
- Over generations, the learned becomes the instinctive
- The brain has effectively guided its own evolutionary redesign
LLMs Cannot Do This — And It May Be the Ceiling of Scaling
No LLM influences the conditions of its own next training run. It cannot notice that a certain type of reasoning causes it to fail, flag that pattern, and ensure the next version is trained to handle it differently. Humans do this instinctively — we identify our own cognitive weaknesses, develop workarounds, teach those workarounds to our children, and over generations those workarounds become cognitive infrastructure. This is exactly why researchers like Yann LeCun and Gary Marcus argue that scaling transformers is architecturally insufficient for AGI. The system has no mechanism for steering its own evolution. It simply waits to be retrained by engineers.
Epigenetics — The Brain's Legacy Beyond One Lifetime
Perhaps the most striking evidence comes from epigenetics. Experience — stress, physical activity, trauma, nutrition — can alter which genes get switched on or off without changing the DNA sequence itself. Some of these changes are heritable. Studies have shown that the effects of a grandmother's famine can measurably influence a grandchild's metabolism. The brain's lived experience leaves a biological legacy that can echo across multiple generations, quietly nudging the evolutionary trajectory of a lineage.
Your DNA is the hardware. But your brain's experiences may be quietly rewriting which parts of that hardware future generations are born running.
What Would It Take for AI to Catch Up?
Researchers working on neuromorphic computing — hardware that mimics sparse, event-driven neural firing rather than brute-force matrix multiplication — are building chips like Intel's Loihi and IBM's NorthPole that begin to close the energy and architecture gap. Work on organoid intelligence, where lab-grown brain cells are used as literal computing substrates, pushes the boundary even further. These are not science fiction projects — they are peer-reviewed research programmes. But they are decades behind where transformer-based LLMs are today in practical usability.
The harder problem is continual learning — the ability to acquire new knowledge without overwriting old knowledge. Your brain does this effortlessly. You learned your friend's new phone number this morning without forgetting your own name. If you retrain an LLM on new data, it catastrophically overwrites portions of what it previously knew. This is called catastrophic forgetting, and it remains one of the most significant unsolved problems in machine learning. The brain solved it through architectural decisions — separate memory systems, sparse activation, sleep-based consolidation — that the transformer architecture simply does not have.
A New Way to See the Relationship
Piece it all together and a different picture of both evolution and intelligence emerges. Brain activity shapes body structure within a single lifetime. Repeated behaviour across a population shapes what traits get selected across generations. Epigenetic markers carry the memory of experience forward in time. The brain is not simply what evolution produced — it is an active participant in directing where evolution goes next. And the more honestly we compare it to our best artificial systems, the clearer it becomes that what we have built so far, however impressive, is not on the same architectural page.
The Uncomfortable Conclusion
Scaling language models will continue to produce useful, impressive, commercially valuable systems. But if the brain is genuinely steering evolution — through activity-dependent development, behavioural selection pressure, epigenetic inheritance, and embodied feedback loops — then the path to something that truly rivals it does not run through adding more parameters. It runs through rethinking the architecture entirely. The brain has been placing that order for hundreds of millions of years. We are still reading the menu.
We have always credited evolution to random genetic mutations and natural selection. But what if the brain — the very organ shaped by evolution — has been quietly steering it all along? The more we study neuroplasticity, activity-dependent development, and epigenetics, the harder it becomes to see the brain as just a passive product of evolution. It may be one of its most active drivers. And to truly appreciate how extraordinary this is, it helps to compare it against the most sophisticated learning systems humanity has ever built — Large Language Models.
The Brain That Rewires Itself
Unlike an LLM whose weights freeze after training, the human brain rewires itself continuously throughout life. Every new skill, memory, or habit physically reshapes synaptic connections. This property — neuroplasticity — means the brain is never truly finished. It is a living architecture under constant self-modification, guided by experience rather than a fixed genetic blueprint.
Where LLMs Fundamentally Differ
A Large Language Model trains on vast amounts of data, adjusting billions of numerical weights until patterns stabilise. Then it stops. The weights freeze. Every conversation you have with an LLM after that point runs on the same static snapshot of knowledge it had on the day training ended. It cannot learn from you. It cannot update itself based on new experience. In contrast, your brain is processing, revising, and restructuring itself right now, as you read this sentence. The gap between these two systems is not just technical — it is architectural at the deepest level.
The brain is not a record player replaying fixed tracks. It is a composer, rewriting the score every single day.
Superposition, Interference, and Why Bigger Models Aren't the Answer
When LLMs store knowledge, they compress information into high-dimensional vector spaces. Researchers at Anthropic have formally described a phenomenon called the Superposition Hypothesis — the model tries to store more features than it has dimensions, so features get packed into overlapping directions in that space. These overlapping directions interfere with each other, reducing accuracy and causing the model to confuse related concepts. The industry's primary solution has been to scale up — more parameters means more orthogonal dimensions, which means less interference. It works, but at an enormous cost in processing power and energy consumption.
The human brain solved this problem through a completely different architectural decision — sparse coding. At any given moment, only one to five percent of your neurons are actively firing. The rest are silent. This natural sparsity means your brain's active representations almost never overlap, largely eliminating the interference problem without requiring the skull to physically scale up. It is a solution that took hundreds of millions of years of evolution to arrive at, and it is fundamentally more elegant than adding more matrix dimensions.
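A small numerical sketch makes the trade-off concrete. The dimension counts and feature counts below are arbitrary illustrations of the superposition idea, not figures from the Anthropic work.

```python
import numpy as np

rng = np.random.default_rng(0)

def max_interference(features):
    # Largest |cosine similarity| between any two distinct feature directions.
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = np.abs(normed @ normed.T)
    np.fill_diagonal(sims, 0.0)
    return sims.max()

# Superposition: pack 64 features into only 16 dimensions.
# The directions must overlap, so features interfere with each other.
dense = rng.normal(size=(64, 16))

# Sparse coding sketch: each feature claims its own unit while the rest stay
# silent, so active representations have disjoint support and never collide.
sparse = np.eye(64)

print(f"dense packing, max interference: {max_interference(dense):.2f}")
print(f"sparse coding, max interference: {max_interference(sparse):.2f}")
```

Scaling the dense model up (more columns) shrinks the interference; the sparse code sidesteps it entirely at the cost of many mostly-silent units.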
The Energy Problem No One Talks About
A modern large language model serving inference at scale can, by some estimates, consume as much energy in a day as a small household. Training a frontier model is estimated to consume energy comparable to hundreds of transatlantic flights. Your brain — handling vision, language, memory, emotion, movement, and unconscious regulation simultaneously — runs on approximately 20 watts. That is less power than the bulb in a refrigerator. The reason is sparse activation. The brain never fires everything at once. LLMs, by contrast, activate the majority of their parameters for every single token they process. This is not a minor inefficiency — it is a fundamental architectural mismatch that scaling alone will never fix.
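The arithmetic behind that mismatch is easy to sketch. The parameter count and activation fraction below are illustrative assumptions, not measurements of any specific model.

```python
# Back-of-envelope comparison of dense vs sparse activation per step.
params = 70e9            # hypothetical 70-billion-parameter dense model
active_fraction = 0.02   # ~2% activity, the middle of the 1-5% range above

dense_ops = params * 1.0               # dense transformer: every weight engaged
sparse_ops = params * active_fraction  # same capacity, brain-style sparsity

ratio = dense_ops / sparse_ops
print(f"dense vs sparse work per step: {ratio:.0f}x")
```

Under these toy assumptions the dense system does roughly fifty times the work per step for the same nominal capacity, before any hardware differences are even considered.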
The Newborn That Teaches Us Everything
Consider a newborn. Their legs exist at birth, but they are structurally unready — bones soft, muscles undeveloped, hip sockets shallow. As the child begins to crawl and eventually walk, something remarkable happens in parallel. The brain sends movement signals. The bones respond by increasing density and remodelling their internal structure through a principle called Wolff's Law. The hip socket deepens as the femoral head repeatedly pushes against it. The spine shifts from a C-curve to an S-curve specifically because of upright load-bearing. None of this is pre-scheduled by genetics alone. It is triggered by use — by the brain deciding to move.
An LLM Has No Body, No Feedback, No Stakes
This is where the comparison becomes philosophically uncomfortable. An LLM has never felt the resistance of the ground. It has never experienced the consequence of falling. It has no proprioception, no pain, no hunger that makes certain information feel urgent and worth remembering. The brain's learning is grounded in embodied consequence — information that matters to your survival gets encoded deeply, redundantly, emotionally. Information that doesn't matter fades. An LLM treats a recipe for pasta and the physics of a nuclear reaction with the same flat statistical weight, because nothing has ever been at stake for it. This is not a hardware limitation. It is a missing dimension of existence.
Forgetting as a Feature, Not a Bug
Most people think of forgetting as a failure of memory. Neuroscience suggests the opposite. The brain actively and deliberately forgets. Synaptic pruning deletes weak, unused connections — heavily during adolescence, lightly throughout adulthood. During sleep, a process described by the Synaptic Homeostasis Hypothesis globally weakens synapses, resetting the signal-to-noise ratio so only the strongest memories survive into the next day. Forgetting is compression. It is the brain curating its own model to keep it sharp, relevant, and interference-free.
LLMs have no equivalent of this process. Once a weight is set during training, it does not decay from disuse. The model cannot identify which parts of its knowledge are stale, rarely needed, or actively causing interference with more important knowledge. Techniques like dropout and weight decay during training approximate this idea, but they are applied blindly rather than selectively. The brain prunes with surgical precision based on what actually gets used. LLMs are pruned by an engineer guessing at regularisation hyperparameters.
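The difference between blind regularisation and use-based pruning can be sketched as follows. The usage record here is invented for illustration; real synaptic pruning tracks actual firing history.

```python
import numpy as np

rng = np.random.default_rng(1)
weights = rng.normal(size=100)   # 100 toy connections
usage = rng.random(100)          # hypothetical record of how often each fired

# Weight decay: shrink every connection uniformly, useful or not.
decayed = weights * 0.99

# Use-based pruning sketch: delete connections that rarely fired,
# leaving frequently used ones completely untouched.
pruned = np.where(usage > 0.2, weights, 0.0)

print(f"after decay, connections remaining:   {np.count_nonzero(decayed)}")
print(f"after pruning, connections remaining: {np.count_nonzero(pruned)}")
```

Decay leaves every connection in place, just slightly weaker; pruning actually removes the rarely used ones, which is what frees the representation from interference.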
The brain does not forget because it is weak. It forgets because it is wise enough to know what no longer deserves to be remembered.
Memory Corruption, the Mandela Effect, and Shared Architecture
The brain's active reconstruction of memory has a well-documented side effect — false memories. Every time you recall a memory, you re-save it with slight emotional modifications based on your current state. Over years, a memory can drift significantly from its original form. This is called memory reconsolidation, and it is the neurological engine behind the Mandela Effect — the widespread phenomenon where large groups of people share the same specific false memory.
What makes this fascinating is that the errors are not random. Thousands of people converge on the same wrong answer because they share the same semantic architecture. The word cartoon primes the wrong spelling of Looney Tunes. The word breeze pulls Febreze toward a second E. These are predictable corruption patterns emerging from a shared cognitive operating system — millions of brains running the same inference on the same cultural input and producing the same systematic error. It is less like individual memory failure and more like a distributed bug in a shared model version.
And Some People Are Simply More Resistant
Not everyone experiences the same Mandela Effects. People with stronger synaptic encoding — those who processed the original information with deeper attention, emotional anchoring, or multi-modal context — are more resistant to reconsolidation drift. Their memories have higher signal strength, so the noise from social reinforcement and semantic bias does not reach the threshold needed to overwrite them. This is not a personality trait. It is a direct consequence of how richly a memory was encoded at the moment of formation — which brings us back to embodiment, stakes, and the advantage the brain holds over any statistical model.
The Baldwin Effect — When Learning Steers Evolution
Traditional evolutionary theory treats the brain as a downstream product — shaped by genes, selected by environment. The Baldwin Effect proposes a more interesting relationship. When a species learns a new behaviour, that behaviour reshapes the survival landscape. Over generations, genetic mutations that make that behaviour easier get selected for. What was once laboriously learned becomes gradually hardwired. The brain does not rewrite DNA directly — but by changing what behaviours are possible, it changes what traits evolution rewards.
- A species learns a new survival behaviour through brain-driven adaptation
- That behaviour creates new environmental pressures and advantages
- Genetic mutations that support the behaviour more efficiently are selected
- Over generations, the learned becomes the instinctive
- The brain has effectively guided its own evolutionary redesign
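The loop above can be caricatured in a toy simulation. The fitness function, mutation rate, and population size are all invented for illustration — this is the shape of the Baldwin Effect, not a biological model.

```python
import random

random.seed(42)

POP, GENERATIONS = 200, 100

def fitness(gene):
    # Everyone can learn the behaviour, but a stronger innate
    # predisposition (gene closer to 1) makes it cheaper to acquire.
    return 0.1 + gene

# Start with a population that has almost no innate predisposition.
population = [random.random() * 0.2 for _ in range(POP)]
initial_mean = sum(population) / POP

for _ in range(GENERATIONS):
    # Fitness-proportional selection, then small mutations in the offspring.
    weights = [fitness(g) for g in population]
    parents = random.choices(population, weights=weights, k=POP)
    population = [min(1.0, max(0.0, g + random.gauss(0, 0.02))) for g in parents]

final_mean = sum(population) / POP
print(f"mean predisposition: {initial_mean:.2f} -> {final_mean:.2f}")
```

No individual's genes are rewritten by its behaviour; the behaviour only changes which genomes reproduce, and the predisposition becomes entrenched generation by generation.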
LLMs Cannot Do This — And It May Be the Ceiling of Scaling
No LLM influences the conditions of its own next training run. It cannot notice that a certain type of reasoning causes it to fail, flag that pattern, and ensure the next version is trained to handle it differently. Humans do this instinctively — we identify our own cognitive weaknesses, develop workarounds, teach those workarounds to our children, and over generations those workarounds become cognitive infrastructure. This is exactly why researchers like Yann LeCun and Gary Marcus argue that scaling transformers is architecturally insufficient for AGI. The system has no mechanism for steering its own evolution. It simply waits to be retrained by engineers.
Epigenetics — The Brain's Legacy Beyond One Lifetime
Perhaps the most striking evidence comes from epigenetics. Experience — stress, physical activity, trauma, nutrition — can alter which genes get switched on or off without changing the DNA sequence itself. Some of these changes are heritable. Studies have shown that the effects of a grandmother's famine can measurably influence a grandchild's metabolism. The brain's lived experience leaves a biological legacy that can echo across multiple generations, quietly nudging the evolutionary trajectory of a lineage.
Your DNA is the hardware. But your brain's experiences may be quietly rewriting which parts of that hardware future generations are born running.
What Would It Take for AI to Catch Up?
Researchers working on neuromorphic computing — hardware that mimics sparse, event-driven neural firing rather than brute-force matrix multiplication — are building chips like Intel's Loihi and IBM's NorthPole that begin to close the energy and architecture gap. Work on organoid intelligence, where lab-grown brain cells are used as literal computing substrates, pushes the boundary even further. These are not science fiction projects — they are peer-reviewed research programmes. But they are decades behind where transformer-based LLMs are today in practical usability.
The harder problem is continual learning — the ability to acquire new knowledge without overwriting old knowledge. Your brain does this effortlessly. You learned your friend's new phone number this morning without forgetting your own name. If you retrain an LLM on new data, it catastrophically overwrites portions of what it previously knew. This is called catastrophic forgetting, and it remains one of the most significant unsolved problems in machine learning. The brain solved it through architectural decisions — separate memory systems, sparse activation, sleep-based consolidation — that the transformer architecture simply does not have.
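Catastrophic forgetting is easy to reproduce in miniature. The single-weight "model" and the two linear tasks below are deliberately toy-sized stand-ins for sequential retraining without rehearsal.

```python
# A single shared weight is fit to task A, then retrained on task B.
# Because both tasks compete for the same parameter, task A is overwritten.

def train(w, data, lr=0.1, epochs=200):
    for _ in range(epochs):
        for x, y in data:
            pred = w * x
            w -= lr * (pred - y) * x   # gradient step on squared error
    return w

task_a = [(1.0, 2.0), (2.0, 4.0)]     # behaves like y = 2x
task_b = [(1.0, -3.0), (2.0, -6.0)]   # behaves like y = -3x

w = train(0.0, task_a)
error_a_before = abs(w * 1.0 - 2.0)   # near zero: task A learned

w = train(w, task_b)                  # sequential retraining, no rehearsal
error_a_after = abs(w * 1.0 - 2.0)    # large: task A overwritten

print(f"task A error before retraining: {error_a_before:.4f}")
print(f"task A error after retraining:  {error_a_after:.4f}")
```

The brain's workaround, on this view, is architectural: separate fast and slow memory systems plus sparse, mostly non-overlapping representations, so new learning rarely lands on the same "weights" as old learning.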
A New Way to See the Relationship
Piece it all together and a different picture of both evolution and intelligence emerges. Brain activity shapes body structure within a single lifetime. Repeated behaviour across a population shapes what traits get selected across generations. Epigenetic markers carry the memory of experience forward in time. The brain is not simply what evolution produced — it is an active participant in directing where evolution goes next. And the more honestly we compare it to our best artificial systems, the clearer it becomes that what we have built so far, however impressive, is not on the same architectural page.
The Uncomfortable Conclusion
Scaling language models will continue to produce useful, impressive, commercially valuable systems. But if the brain is genuinely steering evolution — through activity-dependent development, behavioural selection pressure, epigenetic inheritance, and embodied feedback loops — then the path to something that truly rivals it does not run through adding more parameters. It runs through rethinking the architecture entirely. The brain has been placing that order for hundreds of millions of years. We are still reading the menu.