Published

January 2026

How neural populations integrate evidence, anticipate reward, and combine content with context.

Evidence accumulation from experience and observation in the cingulate cortex

evidence accumulation

We constantly update our beliefs about the world through two primary channels: our own direct experiences and our observations of others. To understand how the brain integrates these two distinct evidence streams, the authors designed a volatile two-player game where humans or pairs of monkeys chose between two arenas. Only one arena yielded rewards, and the "correct" arena covertly switched in blocks.

On each trial, one player was randomly assigned as the Actor (directly playing to collect tokens and earn rewards) while the other was the Observer (watching the exact same events unfold but receiving no reward). Both humans and monkeys successfully tracked the hidden state of the game, updating their beliefs based on both actor and observer trials. However, they systematically discounted observational evidence.

To uncover the neural mechanics of this integration, the researchers recorded population activity in the subjects' brains, focusing on the anterior cingulate cortex (ACC), a known hub for decision-making. The ACC was also already known to play a role in “vicarious learning,” that is, learning from others. The authors discovered that actor outcomes and observer outcomes are mapped onto strictly orthogonal dimensions in neural state space. Both outcome dimensions are then projected onto a shared downstream neural axis called the “switch evidence” dimension, which ultimately serves as the decision axis.

The ACC was found to keep the Actor Outcome dimension geometrically closer to the Switch Evidence dimension than the Observer Outcome dimension is. Because of this alignment, an unrewarded trial as an actor projects a large update onto the decision axis, whereas the exact same visual outcome as an observer projects a much smaller, discounted update.
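The geometric argument above can be illustrated with a toy projection. This is a minimal sketch, not the authors' analysis: the three-dimensional state space, the axis vectors, and the helper names `unit` and `switch_update` are all hypothetical values chosen only to show how alignment with the decision axis sets the size of the update.

```python
import numpy as np

def unit(v):
    """Return v scaled to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

# Hypothetical 3-D neural state space. These axes are illustrative,
# not fitted to any recorded data.
actor_axis = unit([1.0, 0.0, 0.0])
observer_axis = unit([0.0, 1.0, 0.0])   # orthogonal to the actor axis

# Assumed switch-evidence axis, tilted closer to the actor axis
# than to the observer axis.
switch_axis = unit([0.8, 0.3, 0.5])

def switch_update(outcome_axis, magnitude=1.0):
    """Project an outcome of a given magnitude onto the switch-evidence axis."""
    return magnitude * float(outcome_axis @ switch_axis)

actor_update = switch_update(actor_axis)
observer_update = switch_update(observer_axis)

print(f"actor update:    {actor_update:.3f}")
print(f"observer update: {observer_update:.3f}")
print(f"discount factor: {observer_update / actor_update:.3f}")
```

In this toy setup the same unrewarded outcome produces a smaller update when it arrives along the observer axis, simply because that axis has a smaller cosine with the switch-evidence axis.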




Predictive coding of reward in the hippocampus

hippocampus-rl

The hippocampus is known to serve as a cognitive map as well as a spatial one. To understand how this map changes with experience, researchers investigated how hippocampal representations evolve over extended periods of learning. They demonstrate that hippocampal CA1 acts as a predictive reward map that dynamically reorganizes itself over weeks of experience to forecast future rewards. Using miniscope calcium imaging, the authors tracked hundreds of dorsal CA1 neurons in mice over several weeks.

The core empirical breakthrough of the paper is the discovery of a structured, backward-shifting reorganization of neural activity:

  • Early Learning (Reward-Driven): Initially, the CA1 population strongly represents the reward itself.
  • Late Learning (Prediction-Driven): Over weeks of experience, the population-level encoding of the reward decreases. Concurrently, the neural representation of the cues and behaviors that precede the reward strongly increases.

To explain how and why this reorganization occurs, the authors modeled the place fields using temporal-difference (TD) learning. As the animal learns the task structure, the TD prediction error propagates backward to earlier, predictive states. Simulations using this TD model successfully recapitulated the backward neural shifts observed in the in vivo calcium data. The paper demonstrates that, much like the classic reward prediction error signals seen in dopaminergic neurons, hippocampal representations undergo a pronounced backward temporal shift over learning.
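The backward propagation described above can be sketched with tabular TD(0) on a linear track. This is a generic textbook setup, not the authors' place-field model: the function name `run_td`, the track length, the learning rate `alpha`, and the discount `gamma` are all assumptions chosen for illustration.

```python
import numpy as np

def run_td(n_states=10, n_episodes=500, alpha=0.1, gamma=0.9):
    """Tabular TD(0) on a one-way track with a reward on leaving the last state.

    Returns the value estimates after every episode and the TD error
    measured at the rewarded step in every episode.
    """
    V = np.zeros(n_states + 1)      # final entry is the terminal state (value 0)
    value_history = []
    reward_errors = []
    for _ in range(n_episodes):
        for s in range(n_states):
            r = 1.0 if s == n_states - 1 else 0.0
            delta = r + gamma * V[s + 1] - V[s]   # TD prediction error
            V[s] += alpha * delta
            if s == n_states - 1:
                reward_errors.append(delta)
        value_history.append(V[:n_states].copy())
    return np.array(value_history), np.array(reward_errors)

values, reward_errors = run_td()

print("TD error at reward, first vs last episode:",
      reward_errors[0], round(float(reward_errors[-1]), 6))
print("Value at track start, first vs last episode:",
      values[0, 0], round(float(values[-1, 0]), 3))
```

In this sketch the prediction error at the reward shrinks toward zero with training, while the value estimate at the start of the track, which is zero at first, grows toward `gamma ** (n_states - 1)`. That is the sense in which reward information migrates to earlier, predictive states.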


Distinct neuronal populations in the human brain combine content and context

combinatorial-coding

How does the human brain construct episodic memories without needing a unique neuron for every possible combination of an object and a situation? To answer this, researchers investigated how the medial temporal lobe (MTL)—the brain's primary memory center—integrates content (the object being perceived) with context (the cognitive task or rule giving that object meaning). By recording from 3,109 single neurons in patients performing a flexible picture comparison task, the authors mapped how these two streams of information interact at both the single-cell and population levels.

A prevailing assumption in memory research is that the hippocampus relies heavily on non-linearly mixed, "conjunctive" neurons (cells that fire only for a specific image during a specific task). However, this study revealed the opposite architecture in the human MTL:

  • Distinct Neural Populations: The MTL overwhelmingly relies on functionally segregated populations. "Stimulus neurons" encode the identity of the picture regardless of the cognitive context, while "context neurons" encode the overarching question or rule being applied, independent of the picture shown.
  • Orthogonal Subspaces: Conjunctive neurons were surprisingly rare. Population-level analyses demonstrated that the MTL keeps these two variables geometrically isolated in orthogonal dimensions, co-activating them simultaneously without entangling them.

These findings reveal that human episodic memory relies on a combinatorial code rather than a mixed one. By representing the “what” (content) and “where” (context) on independent axes, the brain avoids the combinatorial explosion of needing a new neuron for every possible life event.
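The scaling argument can be made concrete with a toy comparison of the two coding schemes. This is a deliberately simplified sketch: the one-hot units, the counts `n_stimuli` and `n_contexts`, and the helper functions are hypothetical and are not the population analyses used in the study.

```python
import numpy as np

n_stimuli, n_contexts = 4, 3   # hypothetical counts for illustration

def conjunctive_code(s, c):
    """One dedicated unit per (stimulus, context) pair."""
    v = np.zeros(n_stimuli * n_contexts)
    v[s * n_contexts + c] = 1.0
    return v

def combinatorial_code(s, c):
    """Separate stimulus and context units in non-overlapping (orthogonal) subspaces."""
    v = np.zeros(n_stimuli + n_contexts)
    v[s] = 1.0                  # stimulus subspace
    v[n_stimuli + c] = 1.0      # context subspace
    return v

def decode(v):
    """Read out stimulus and context independently from a combinatorial code."""
    return int(np.argmax(v[:n_stimuli])), int(np.argmax(v[n_stimuli:]))

print("units needed, conjunctive:  ", conjunctive_code(0, 0).size)
print("units needed, combinatorial:", combinatorial_code(0, 0).size)
print("decoded (stimulus, context):", decode(combinatorial_code(2, 1)))
```

The conjunctive scheme needs a number of units that grows with the product of stimuli and contexts, whereas the combinatorial scheme grows with their sum, and each variable can still be read out without reference to the other.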


