Title: Temporal Reward Prediction in the Visual Corticostriatal Circuit
Predicting future reward is critical for decision-making, and a key component of this process is accurately judging when to expect a reward. However, how the brain learns and tracks time in response to a predictive cue remains poorly understood. Temporal Difference Reinforcement Learning (TDRL) has been proposed to describe the neural computations underlying time-related decision-making, yet a TDRL agent does not maximize reward rate, nor does it accurately describe animal behavior. PARSUIT theory is a modified version of TDRL that corrects the bias in state value estimation and thus reduces the discrepancy between the performance of standard TDRL and animal behavior.
The visual corticostriatal circuit (VC -> DS) is an appealing system in which to investigate the neural implementation of temporal reward encoding and to test which algorithm the brain uses. Previous work has demonstrated that “reward timing” and “action timing” signals are generated in the visual cortex (VC). VC also directly innervates the dorsal striatum (DS), which is known for its role in the timing of motor actions. The goal of this project is to use PARSUIT theory as a computational model to study how temporal reward information propagates through, and is represented in, the VC -> DS circuit to inform time investment decisions.