Karma, Vāsanā, and the Brain as a Prediction Machine

"Mind precedes all mental states. Mind is their chief; they are all mind-wrought. If one speaks or acts with an impure mind, suffering follows him like the wheel that follows the foot of the ox."

— Dhammapada, verse 1

Prologue: a new way to read karma

For centuries, karma has been misread as a kind of metaphysical accounting system — what you did in past lives, you pay for in this one. That reading is half-right and wholly incomplete. Read the Buddhist traditions carefully, and you find a more precise concept: vāsanā — the imprinting of repeated actions onto the substrate of mind.

A note on terminology: vāsanā is most systematically developed in the Yogācāra (Mind-Only) school of Mahāyāna Buddhism, by Vasubandhu and Asaṅga. In the Pāli canon (Theravāda), the closest cousins are anusaya (latent tendencies) and saṅkhāra (formations). This essay uses vāsanā because it captures the "residue of habit" that most cleanly parallels what modern neural network models call learned weights, not to flatten the differences between Buddhist schools.

Twenty-five centuries later, modern neuroscience describes nearly the same phenomenon in different language: Hebbian learning, summarized as "neurons that fire together, wire together". Every time a behavioral or emotional pattern is triggered, the relevant synapses are strengthened. Repeat enough times, and the pattern becomes the default mode: it runs automatically, without conscious effort.

Vāsanā is the functional analog of Hebbian weights accumulated over tens of thousands of forward passes.
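As a rough illustration of the Hebbian idea, the sketch below applies the plain rule "change in weight = learning rate × pre-synaptic activity × post-synaptic activity" to a single connection. The function names, the learning rate, and the choice of two fully co-active units are assumptions made for illustration; real synapses also saturate and decay, which this toy rule ignores.

```python
def hebbian_update(weight, pre, post, learning_rate=0.01):
    """Plain Hebbian rule: the weight grows only when pre- and
    post-synaptic activity coincide."""
    return weight + learning_rate * pre * post


def repeat_pattern(repetitions, learning_rate=0.01):
    """Strength of one connection after the same pattern fires repeatedly.
    Both units are assumed fully active (1.0) on every repetition."""
    weight = 0.0
    for _ in range(repetitions):
        weight = hebbian_update(weight, pre=1.0, post=1.0,
                                learning_rate=learning_rate)
    return weight


if __name__ == "__main__":
    # One repetition barely registers; ten thousand dominate.
    for n in (1, 100, 10_000):
        print(n, round(repeat_pattern(n), 2))
```

In this toy model a single repetition leaves almost no trace, while a pattern repeated ten thousand times carries a weight ten thousand times larger. That is the sense in which vāsanā is described here as accumulated weight.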

I. Karma as a machine learning model

1.1. The training loop of consciousness

If we view each human being as a model under continuous training, the loop looks like this:

flowchart TB
    A["Stimulus — Sense object"] --> B["Feeling tone — Vedanā"]
    B --> C["Pattern match — Saññā"]
    C --> D["Reaction — Saṅkhāra"]
    D --> E["Outcome — Vipāka"]
    E -->|Dopamine reward| F["Weight update — Vāsanā"]
    F -.->|Reinforce| C
    F -.->|Strengthen pathway| D

    style F fill:#7c3aed,stroke:#5b21b6,color:#fff
    style A fill:#1e293b,stroke:#475569,color:#fff
    style E fill:#0f766e,stroke:#0d9488,color:#fff

Each loop is a step of gradient descent, but not on the loss function of suffering. The catch: the loss the brain actually optimizes is not long-term well-being but short-term reward, signaled by dopamine. This is where upādāna (clinging) takes root, fed by the craving (taṇhā) that the Four Noble Truths name as the origin of suffering: we are continuously reinforced to grasp at things that, in the long run, produce suffering.
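The mismatch between short-term reward and long-term well-being can be made concrete with a toy chooser. Everything here is hypothetical: the two actions, their payoff numbers, and the `weight_on_future` parameter are invented for illustration and are not measurements of anything.

```python
# Each action maps to (immediate reward, delayed consequence).
# The numbers are made up purely to illustrate the mismatch.
ACTIONS = {
    "scroll feed": (1.0, -3.0),
    "go for a walk": (0.2, 2.0),
}


def choose(actions, weight_on_future):
    """Pick the action with the highest value, where value is
    immediate reward plus a weighted share of the delayed consequence."""
    def value(item):
        immediate, delayed = item[1]
        return immediate + weight_on_future * delayed

    return max(actions.items(), key=value)[0]


if __name__ == "__main__":
    print(choose(ACTIONS, weight_on_future=0.0))  # optimizes the short term only
    print(choose(ACTIONS, weight_on_future=1.0))  # counts the consequence fully
```

With no weight on the future, the chooser picks the action with the larger immediate payoff even though its total outcome is worse. Counting the delayed consequence flips the choice.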

1.2. Suffering is a local minimum

Attachment, in ML terms, is a local minimum that has been finetuned too deeply. The model cannot escape because every nearby gradient leads back to the same point. This is why people know they are destroying themselves — drinking, staying in toxic relationships, scrolling endlessly — and yet cannot stop.
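A minimal sketch of what "stuck in a local minimum" means, assuming a one-dimensional loss with two valleys. The curve, the learning rate, and the starting points are arbitrary choices for illustration, not a model of any real behavior.

```python
def loss(x):
    """A double-well curve: a shallow minimum near x = 0.96
    and a deeper one near x = -1.04."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def gradient(x):
    """Derivative of loss(x)."""
    return 4.0 * x * (x * x - 1.0) + 0.3


def descend(start, learning_rate=0.01, steps=2000):
    """Plain gradient descent: always step downhill from where you are."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


if __name__ == "__main__":
    for start in (1.5, 0.5, -1.5):
        end = descend(start)
        print(f"start {start:+.1f} -> ends at {end:+.3f}, loss {loss(end):+.3f}")
```

Starting anywhere in the right-hand valley, every downhill step leads back to the same shallow minimum, even though a deeper one exists on the other side of the hill. Small local steps cannot cross the ridge. Only a different starting point, or a change to the landscape itself, gets the model out.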

Buddhist concept	Modern equivalent	Mechanism
Vāsanā (Yogācāra: latent imprints)	Trained weights	Hebbian potentiation
Upādāna (clinging)	Local minimum	Dopamine reinforcement loop
Saṃsāra (cyclic existence)	Default inference mode	Default Mode Network
Saṅkhāra (formations)	Activation function	Conditioned neural firing
Nibbāna (liberation)	Global restructure of the loss landscape	Large-scale network rewiring

Reading karma this way does not diminish its sacredness. On the contrary, it shows why the Buddha's teaching is empirical as much as it is metaphysical. The canonical word ehipassiko, often translated as "come and see", is at root an invitation to direct verification.

II. Sīla — Samādhi — Paññā: the three-stage transformation pipeline

The threefold training (tisikkhā) — sīla, samādhi, paññā — is not three separate practices. It is a three-stage pipeline operating on three different layers of the cognitive system.

flowchart TB
    subgraph L3["Paññā — Wisdom — Cognitive Reframing"]
        T1["Recognize impermanence of pattern"]
        T2["Disidentify from pattern"]
        T3["Insight: anattā"]
    end

    subgraph L2["Samādhi — Concentration — Attention Training"]
        D1["Observe without reacting"]
        D2["Insert gap between stimulus and response"]
        D3["Expand metacognitive bandwidth"]
    end

    subgraph L1["Sīla — Ethics — Environment Design"]
        S1["Block triggering inputs"]
        S2["Reduce cue density"]
        S3["Add friction to automatic behavior"]
    end

    L1 ==> L2
    L2 ==> L3
    L3 -.->|Rewire| L1

    style L1 fill:#1e3a8a,stroke:#1e40af,color:#fff
    style L2 fill:#6b21a8,stroke:#7e22ce,color:#fff
    style L3 fill:#065f46,stroke:#047857,color:#fff

2.1. Sīla — environment design

Sīla is not abstract morality. It is environmental engineering to cut off the triggers that activate old patterns. When James Clear writes in Atomic Habits that you should "make it invisible, make it difficult", he is restating sīla in the language of behavioral design.

Without sīla, samādhi has trouble arising — because the brain keeps receiving cues and keeps being yanked into reactive mode.

2.2. Samādhi — attention training

Samādhi is the capacity to hold attention stable long enough to observe without reacting. This is consistent with what several fMRI studies of experienced meditators report: activity in the anterior cingulate cortex (involved in attention regulation) tends to increase, while reactivity in the amygdala (involved in fast threat and emotion responses) tends to decrease.

The gap between stimulus and response, in a line often attributed to Viktor Frankl (the attribution is disputed), is where freedom lives. Samādhi creates that gap.

2.3. Paññā — cognitive reframing

Paññā is not knowledge. Paññā is direct insight into the impermanent, selfless nature of the pattern. When you see anger as merely a current flowing through synapses — not you, not yours, not your essence — the identification dissolves. And when identification dissolves, the pattern loses its fuel.

This is not armchair philosophy. It is close to what Acceptance and Commitment Therapy (ACT) calls cognitive defusion, a technique within a therapy that has a substantial base of clinical trial evidence.

III. Why 'I know but I can't do it'? The brain as a prediction machine

Note: An earlier draft of this essay relied on Paul MacLean's triune brain theory (neocortex / limbic / reptilian). This is a popular metaphor but one that modern neuroscience has rejected (Cesario et al. 2020; Barrett 2020): the human brain did not evolve as discrete layers stacked atop one another, and the "reptilian brain" is not a real comparative-anatomy structure. The section below has been rewritten using three contemporary frameworks with stronger evidence: predictive coding, constructed emotion, and distributed large-scale networks. All three are influential but still actively debated — specific caveats appear inline below.

The human brain is not three layers fighting for control. Under the predictive coding framework associated with Karl Friston and popularized by Andy Clark (one of the most influential paradigms in computational neuroscience today, though far from settled and not without critics), the brain is a prediction machine. It continually generates predictions about the next sensory input based on its prior model, which is the accumulated residue of all past karma. When actual input diverges from prediction, a prediction error arises and must be resolved in one of two ways: update the model, or act on the world so it conforms to the prediction.

A tangible example: you pick up your usual coffee mug, expecting it to weigh X grams. If someone has emptied it without your knowing, your hand jerks sharply upward the moment you lift. That's because your arm pre-programmed its force to match the prediction, not the actual weight. The prediction error is literally in your hand. The whole of mental life works the same way, just more subtly.
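The mug example can be sketched as a simple error-driven update. This covers only the first of the two resolutions (updating the model, not acting on the world). The gram values, the learning rate, and the function name `lift_mug` are hypothetical.

```python
def lift_mug(predicted_grams, actual_grams, learning_rate=0.3):
    """One lift: compare prediction with reality, then nudge the
    prediction a fraction of the way toward what was actually felt."""
    error = actual_grams - predicted_grams
    updated = predicted_grams + learning_rate * error
    return updated, error


if __name__ == "__main__":
    prediction = 350.0   # assumed weight of the usual full mug
    actual = 120.0       # assumed weight of the emptied mug
    for lift in range(1, 6):
        prediction, error = lift_mug(prediction, actual)
        print(f"lift {lift}: error {error:+.1f} g, new prediction {prediction:.1f} g")
```

The first lift produces the largest error, which is the jerk in your hand. Each later lift produces a smaller one as the prediction closes in on the real weight.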

flowchart TB
    subgraph Predict["The brain as a prediction machine — Predictive Coding"]
        direction TB
        Prior["Prior model — accumulated as vāsanā"]
        Predict1["Predicts next sensory input"]
        Sensory["Actual sensory input"]
        Error["Prediction error"]
        Update["Update model or act"]

        Prior --> Predict1
        Predict1 --> Error
        Sensory --> Error
        Error --> Update
        Update -.->|Reinforces priors| Prior
    end

    subgraph Networks["Three distributed brain networks — not three layers"]
        DMN["Default Mode Network (DMN) — narrative, rumination, 'self'"]
        SN["Salience Network (SN) — detects what matters"]
        CEN["Central Executive Network (CEN) — planning, control"]

        SN -->|Switches between| DMN
        SN -->|Switches between| CEN
        DMN <-->|Compete for resources| CEN
    end

    Update --> SN

    style Prior fill:#7c3aed,stroke:#5b21b6,color:#fff
    style Error fill:#dc2626,stroke:#ef4444,color:#fff
    style DMN fill:#9333ea,stroke:#a855f7,color:#fff
    style SN fill:#0369a1,stroke:#0284c7,color:#fff
    style CEN fill:#065f46,stroke:#047857,color:#fff

3.1. Emotions are not in fixed regions — they are constructed

Lisa Feldman Barrett, through extensive meta-analyses, has argued: there is no "fear center" in the amygdala, no "anger center" in the limbic system. In her constructed emotion theory, emotions are not instinctive responses from an ancient brain region. They are constructed in real time by the whole brain, drawing on:

Interoception — signals from the body (heart rate, hormones, blood sugar).
Context — where you are, with whom, when.
Learned emotion concepts — language and culture teach you to label internal states.

Important caveat: this is an influential theory but not consensus. Jaak Panksepp (affective neuroscience) argued that evolutionarily conserved basic emotional systems exist in subcortical circuits, and Joseph LeDoux, who has studied the amygdala and threat responses for decades, holds that conserved survival circuits drive defensive reactions even if the conscious feeling of fear is assembled elsewhere. The most cautious reading: certain physiological responses are fairly stable across cultures, but the labeling and interpretation of those responses into specific emotions is highly constructed. The latter is what this essay leans on.

Example: the same body signal — racing heart, sweaty palms, shallow breath — can be labeled by your brain as many different emotions depending on context:

Before an investor pitch → anxiety ("I'm not ready").

Before a first date → excitement ("I really like this person").

Right after finishing a 5K → exhilaration ("I just did that").

Standing on a 30th-floor balcony → fear of heights.

The body says almost the same thing. The brain does the labeling. And that label — learned from culture, from language, from past labelings — determines what you do next.

If you read Barrett with a grain of salt — keeping the "constructed labeling" layer intact — this view is actually closer to Buddhist thought than MacLean's theory ever was: emotions are saṅkhāra — formations constructed from conditions, not fixed entities. And because they are constructed, they can be reconstructed.

3.2. Behavior emerges from switching between three brain networks

Instead of three stacked layers, modern neuroscience identifies three large-scale brain networks that operate in a distributed manner across the cortex, competing and switching continuously (Menon 2011):

Default Mode Network (DMN) — active when you're not doing anything specific: narrative, recall, future planning, "mind wandering". This is where the "self" is woven.
Salience Network (SN) — detects what matters, what deserves attention. It switches between DMN and CEN.
Central Executive Network (CEN) — planning, willpower, deliberate decision-making.

"Knowing but not doing" doesn't happen because a "reptilian brain" hijacks the system. It happens because:

1. Priors are too strong — the prior model has been reinforced to the point where it forces sensory input to conform to it instead of the other way around.

Example: You broke up with an ex a year ago, and rationally you've "moved on". One day, your phone buzzes — and their name lights up the screen. Heart racing, stomach tight, mouth dry. You "know" you're over them. But the prior "this name = emotional danger" has been reinforced thousands of times over years — your brain fires the entire physiological response before CEN can say "that's old news". Sensory input gets forced to fit the old prior.

A more mundane example: driving your usual route to work. One morning you plan to stop somewhere different — but you turn automatically toward the old office. By the time you snap out of it, you're 2 km off route. The prior "this intersection = turn right" overrides the freshly loaded intention in CEN.

2. The Salience Network has been co-opted by modern attention engineering — TikTok, Instagram, and games are designed to keep SN continually pulling you back to DMN (aimless scrolling) instead of CEN (purposeful work).

Example: You open your phone to check the time. Ninety seconds later you're on the fourth reel about cats and dogs. You didn't decide to open Instagram — your Salience Network did. Every notification, every red badge, every quivering thumbnail has been A/B-tested across billions of users to hit your SN before CEN can say "wait". This is not a willpower issue — it's thousands of well-paid engineers sitting in offices optimizing the job of hacking your brain networks.

Another work example: you open your laptop intending to write a proposal. The Slack tab pulses. You click. Twenty minutes later you're still chatting. SN treats a red notification as more "salient" than the planned task — not because it actually is, but because SN has been trained over years to treat notifications as urgent.

3. CEN is costly to run: sustained deliberate control is effortful and degrades with fatigue. Once it is depleted (ego depletion), control reverts to learned automatic loops. (Caveat: Baumeister's original ego depletion construct, including the claim that willpower literally burns glucose, has serious replication problems; the broader observation that cognitive control feels costly and falters under fatigue is better supported, though its mechanism is still debated.)

Example: 9 AM, you resolve to eat clean, order a salad for lunch, decline the cake a colleague offers. 10 PM, after a day of intense meetings and hard decisions, you're on the couch with half a bag of chips and a cup of instant noodles. You haven't changed your values — you still want to eat healthy. But CEN is depleted. Control falls to whichever prior is strongest — and the prior "stress → snack food" has been reinforced thousands of times in your life.

This is also why people say don't make big decisions when hungry, and why CEOs like Steve Jobs and Mark Zuckerberg famously wore the same outfit every day: on this account, they were offloading trivial choices from CEN to save energy for decisions that actually mattered.
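The first cause above, priors that are too strong, has a standard textbook form: when a Gaussian prior is combined with a Gaussian observation, each is weighted by its precision (inverse variance). The sketch below uses that formula. The belief scale (1.0 for "danger", 0.0 for "harmless") and the precision values are assumptions chosen for illustration.

```python
def posterior_mean(prior_mean, prior_precision, observation, obs_precision):
    """Precision-weighted average of a Gaussian prior and one Gaussian
    observation: whichever is more precise pulls the result harder."""
    total = prior_precision + obs_precision
    return (prior_precision * prior_mean + obs_precision * observation) / total


if __name__ == "__main__":
    danger, harmless = 1.0, 0.0
    # A lightly held belief meets one calm observation.
    print(posterior_mean(danger, 1.0, harmless, 1.0))
    # A belief reinforced a thousand times meets the same observation.
    print(posterior_mean(danger, 1000.0, harmless, 1.0))
```

With equal precision, one calm observation moves the belief halfway. With a prior a thousand times more precise, the same observation barely registers, which is the quantitative shape of "sensory input gets forced to fit the old prior".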

3.3. This is not a bug — it's a feature

The subtlest point: predictive coding explains why the gap between understanding and doing is necessary, not a design flaw.

"Understanding" something new by reading means inscribing a weak conceptual prior into the prefrontal cortex (CEN). But the rest of your priors — about the body, emotions, habits — have been reinforced by tens of thousands of repetitions and are orders of magnitude stronger.

If "knowing" were enough to mean "doing", then:

There would be no mechanism to test commitment through repeated training: only behaviors repeated in real environments earn lasting consolidation through long-term potentiation.
There would be no filter against fleeting dangerous ideas — old priors function as a kind of regularization, keeping behavior stable.
The brain's networks would lose their capacity for temporal smoothing: someone who reads one book and totally changes behavior is the last person you'd want to trust.

The gap between understanding and doing is the gap between updating a conceptual prior and updating an experiential prior. The former takes minutes of reading. The latter takes hundreds or thousands of conscious repetitions in real environments.

One last example: reading Atomic Habits in six hours — you understand environment design. But to actually become someone who wakes up early and exercises, you have to put your shoes by the door, move your alarm clock across the room, go to bed early — and repeat for 60-90 days. The book updates the conceptual prior. The 90 days of repetition update the experiential prior.
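To make the difference in scale concrete, here is a toy model in which each repetition closes a fixed fraction of the remaining gap between current habit strength and full strength. The per-repetition rates and the 0.5 threshold are invented for illustration and are not empirical estimates of how fast habits form.

```python
def repetitions_needed(learning_rate, threshold=0.5):
    """Count repetitions until habit strength (starting at 0, capped at 1)
    reaches the threshold, closing a fixed fraction of the gap each time."""
    if not 0.0 < learning_rate <= 1.0:
        raise ValueError("learning_rate must be in (0, 1]")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be in (0, 1)")
    strength, count = 0.0, 0
    while strength < threshold:
        strength += learning_rate * (1.0 - strength)
        count += 1
    return count


if __name__ == "__main__":
    print(repetitions_needed(0.01))  # small update per repetition
    print(repetitions_needed(0.5))   # large update per repetition
```

Under these assumed numbers, a 1% update per repetition needs 69 repetitions to reach the halfway mark, while a 50% update gets there in one. The model is crude, but it shows why the size of each update, not the clarity of the idea, sets the timescale.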

3.4. Exceptions: when priors update fast

The framing above can give the impression that every prior requires tens of thousands of repetitions. Reality is more nuanced. There are conditions under which a prior can shift almost instantly:

One-trial learning under trauma: a single dog bite at age five can install a "dogs = danger" prior that lasts a lifetime. Conversely, a single well-conducted EMDR session or appropriately structured therapy can sometimes reprocess that prior surprisingly fast.
Insight epiphany: in cognitive therapy, single-session insight events can collapse entire belief structures in minutes. In contemplative traditions, kenshō / sudden seeing-into-nature moments are described similarly.
Altered states of consciousness: research on psilocybin-assisted therapy (Johns Hopkins, Imperial College London) suggests that one or two carefully structured sessions can produce lasting changes in personality traits and core beliefs.

The caveat: these cases typically require special conditions — very high emotional intensity, altered states of consciousness, or a highly structured therapeutic frame. In ordinary days, with ordinary willpower and ordinary environments, slow gradient descent remains the rule. Betting that you'll be the next exception is itself one of the more subtle forms of avoidance.

The Buddha did not call his teaching paṭipadā ("path") by accident. It is a path — to be walked, not read.

IV. Awakening: successful rewiring

Putting it all together:

flowchart TB
    A["Stimulus"] --> B{"Sati — the gap"}
    B -->|No sati| C["Old pattern — automatic"]
    B -->|With sati| D["Observe"]
    D --> E["See impermanence"]
    E --> F["Disidentify"]
    F --> G["Pattern loses fuel"]
    G --> H["Synaptic weakening"]
    H --> I["Rewire to new pathway"]

    style B fill:#7c3aed,stroke:#5b21b6,color:#fff
    style C fill:#991b1b,stroke:#b91c1c,color:#fff
    style I fill:#15803d,stroke:#16a34a,color:#fff

Awakening is not a metaphysical state. It is the state in which:

Old priors have been updated through tens of thousands of conscious experiential repetitions — prediction errors are gradually incorporated into the model rather than ignored.
The Salience Network has been retrained to switch more cleanly between DMN and CEN — practitioners are less swept into "mental scrolling".
The DMN's grip loosens — the "self" becomes more translucent, consistent with the anattā experience meditators describe.

Judson Brewer, Richard Davidson, and other contemplative neuroscience labs have reported, using fMRI and EEG, that experienced meditators show decreased DMN activity and altered connectivity between large-scale networks, and some structural studies report differences in grey matter density in regions such as the PFC and insula.

A caveat about the evidence: meditation research contributes important findings, but the broader picture is messier than popular accounts suggest. A large meta-analysis (Goyal et al. 2014) finds that effect sizes for MBSR and other standardized meditation programs on depression and anxiety are small to moderate, not the life-changing magnitude often suggested in popular media. A critical review (Van Dam et al. 2018) notes that many early studies, neuroimaging included, suffered from small samples, replication problems, and self-selection bias. The cautious reading: meditation does change the brain, but it is best thought of as something like regular physical exercise (helpful, cumulative, and far from miraculous).

V. Conclusion: why you must practice, not just read

If you read this and nod along — nothing changes.

Reading updates your conceptual prior — your internal model of the world grows slightly richer. But it does not touch priors about interoception, automatic behavior, or reward loops — and that is where actual behavior is decided.

To rewire, you must:

Sīla: redesign your environment to reduce the cue density that activates old priors (delete apps, change routes, avoid people who trigger old patterns).
Samādhi: sit down 10-20 minutes a day, observe the breath, observe emotions, observe thoughts — training your Salience Network to make cleaner choices.
Paññā: when a pattern arises, see that it is a construction, not the essence of you.

Repeat. For months. For years.

That is the distance between someone who reads about meditation and someone who has meditated. Between someone who understands karma and someone who has dissolved it. Between a model trained on theoretical data and a model finetuned on lived experience.

"Like a pearl inside an oyster — it takes countless grains of sand, countless wounds, before it takes shape."

Perhaps this is why we are born with this difficult brain: not to suffer, but to have the opportunity to transform. An AI can be retrained in hours. A human takes a lifetime. Whether that slowness is "the substance of meaning" or simply the language we use to describe a biological constraint — the fact that biological neural networks cannot be finetuned as fast as silicon — is a question each person answers for themselves. But the slowness is real, and the path still has to be walked.

If you've made it to this line — try pausing for 30 seconds. Watch your breath. You don't have to do anything else. That is already the first step.