A Corticothalamic Circuit Trades off Speed for Safety during Decision-Making under Motivational Conflict

Decisions to act while pursuing goals in the presence of danger must be made quickly but safely. Premature decisions risk injury or death, whereas postponing decisions risks goal loss. Here we show how mice resolve these competing demands. Using microstructural behavioral analyses, we identified the spatiotemporal dynamics of approach–avoidance decisions under motivational conflict in male mice. Then we used cognitive modeling to show that these dynamics reflect the speeded decision-making mechanisms used by humans and nonhuman primates, with mice trading off decision speed for safety of choice when danger loomed. Using calcium imaging in paraventricular thalamus and optogenetic inhibition of the prelimbic cortex to paraventricular thalamus pathway, we show that this speed-safety trade-off occurs because increases in paraventricular thalamus activity increase decision caution, thereby increasing approach–avoid decision times in the presence of danger. Our findings demonstrate that a discrete brain circuit involving the paraventricular thalamus and its prefrontal input adjusts decision caution during motivational conflict, trading off decision speed for decision safety when danger is close. We identify the corticothalamic pathway as central to cognitive control during decision-making under conflict.

Significance Statement
Foraging animals balance the need to seek food and energy against the conflicting needs to avoid injury and predation. This competition is fundamental to survival but rarely has a stable, correct solution. Here we show that approach–avoid decisions under motivational conflict involve strategic adjustments in decision caution controlled via a top-down corticothalamic pathway from the prelimbic cortex to the paraventricular thalamus. We identify a novel corticothalamic mechanism for cognitive control that is applicable across a range of motivated behaviors and mark paraventricular thalamus and its prefrontal cortical input as targets to remediate the deficits in decision caution characteristic of unsafe and impulsive choices.


Introduction
Foraging animals balance the need to seek food and energy against the conflicting needs to avoid injury and predation. Resolving this conflict is fundamental to survival, but there is rarely a stable, single solution. Instead, appropriate solutions vary across space and time (diurnal, seasonal) as well as internal states (hunger, thirst). These decisions thus involve compiling sensory information about the world with knowledge about reward or danger to support adaptive behavior. These decisions must be made quickly and safely. They require balancing the competing demands of speed in decision-making with safety of choice. Premature decisions risk injury or death, whereas failures to decide in a timely manner risk goal loss. Although much is known about the brain mechanisms of danger and reward (Mobbs et al., 2020), the mechanisms for this decision-making under motivational conflict are poorly understood.
Nonetheless, there are at least three outstanding questions about the roles of these regions in motivational conflict. First, although the behavioral designs used in many of these studies clearly identify roles in approach-avoidance conflict, these designs typically do not isolate the discrete psychological mechanisms of approach-avoid decisions. How these regions relate to specific approach-avoid decision-making mechanics is therefore poorly understood. Second, although both the paraventricular thalamus (PVT) and the ventral subiculum (vSub) are implicated in behavioral responses under approach-avoid conflict, the different roles of these regions are not known because they have not been directly compared. Finally, the circuit mechanisms for these roles in approach-avoidance conflict are poorly understood. For example, the prelimbic cortex (PL) is a major source of excitatory glutamatergic inputs driving PVT neuronal activity (Vertes, 2002; Li and Kirouac, 2012; Otis et al., 2019), but the role of this corticothalamic pathway in motivational conflict is unknown.
Here we used a well-established approach-avoidance task to address these issues. We trained mice to approach a goal with conflicting values of reward and punishment and studied the microstructure of behavior under this conflict. Then we used formal cognitive modeling to identify the latent mechanisms of choice under conflict. We next examined how the spatiotemporal activity dynamics of two brain regions (PVT and vSub) related to these approach-avoidance choice mechanics. Finally, we used circuit-specific optogenetic inhibition to establish a causal role for the PVT and its PL input in decision-making under motivational conflict. Our findings show that a discrete brain circuit involving the PVT and its prefrontal input dynamically adjusts decision caution when making choices under motivational conflict, trading off decision speed for decision safety when danger looms.

Materials and Methods
Subjects. Male C57BL/6J mice (Australian Resources Center) 8-10 weeks of age were used. They were housed in ventilated racks, in groups of 2-4, on corn cob bedding in a climate-controlled colony room maintained on 12:12 h light/dark cycle (0700 lights on). They had free access to food (Gordon's Mice Chow) and water until 2 d before commencement of behavioral training when they received 30 min of access to water each day for the remainder of the experiment. The animals were in good health. Mice will learn the task when not thirsty, but we adopted this manipulation because rodents often forage (and hence face and solve approach-avoid conflict) in deprivation states.
Experiments were approved by the UNSW Animal Care and Ethics Committee and performed in accordance with the Animal Research Act 1985 (NSW), under both ARRIVE guidelines and the National Health and Medical Research Council Code for the Care and Use of Animals for Scientific Purposes in Australia (2013).
Surgeries and viral injections. Mice were deeply anesthetized with 5% isoflurane in oxygen-enriched air after subcutaneous injection of 5 mg/kg carprofen (Rimadyl, Zoetis) and then fixed into a stereotaxic alignment instrument (Model 1900, Kopf Instruments). During surgery, mice were maintained on 1%-2.5% isoflurane. Before the scalp incision, a local injection of 0.1 ml Marcaine (0.5%) was made subcutaneously at the incision site. Ophthalmic gel (Viscotears, Alcon) was applied to avoid eye drying. Mice received injections of antibiotic (Duplocillin, 0.15 ml/kg of body weight subcutaneously) immediately after surgery.
Behavioral procedures. Experiment 1 (N = 8) studied the microstructure of behavior under motivational conflict. Mice were placed into a linear track (120 cm [l] × 15 cm [w] × 40 cm [h]) constructed of Perspex. The first 22 cm was a Start box, constructed from gray and white Perspex walls and a Perspex floor. The Start box was separated from the remainder of the track by a removable Perspex door. The next 88 cm was the track proper constructed from white Perspex walls and a white Perspex floor. The Goal box was the last 10 cm and was constructed from gray Perspex walls and a stainless-steel grid floor. The Goal box contained a stainless-steel receptacle extending 3 cm from the end wall for delivery of liquid reward. A rail ran above the track to which all fiber optic patch cables were attached.
For training, there were four trials a day for 4 d. Each trial commenced with removal of the door between the Start box and the track; 10 ml of 8% sucrose was available from the receptacle in the Goal box. The trial ended after 2 min or after mice had consumed the sucrose. At the end of the trial, mice were returned to the Start box for 30 s until the start of the next trial.
For conflict, the same procedures were used but the grid floor in the Goal box was electrified using a 0.05, 0.075, or 0.1 mA current. Mice received 1 d of training at 0.05 mA, 2 d at 0.075 mA, and 2 d at 0.1 mA.
Mice were tracked (30 fps) via webcam (Logitech C920, 1080p) connected to a computer running EthoVision XT 10 (Noldus Information Technology). EthoVision tracked the x and y coordinates of the animal's center. From these coordinates, the following variables were computed: time spent in Start box, linear track, Goal box; distance from the goal; and velocity of the center-point of the animal. These data were imported into MATLAB R2018b (The MathWorks) for further analysis.
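As an illustration of how such variables could be derived from tracked coordinates, the following Python sketch (not the MATLAB code used in the study) computes per-frame distance from the goal and center-point speed, and assigns each position to the Start box, track, or Goal box. It assumes that x runs along the long axis of the apparatus from 0 cm at the Start box end to 120 cm at the goal end wall; this coordinate convention and the function names are hypothetical.

```python
import math

FPS = 30.0          # tracking frame rate reported in the text
GOAL_X_CM = 120.0   # assumption: goal end wall lies at x = 120 cm

def kinematics(xs, ys, fps=FPS, goal_x=GOAL_X_CM):
    """Return per-frame distance from the goal (cm) and speed (cm/s)."""
    dist = [goal_x - x for x in xs]
    speed = [0.0]  # no displacement is defined for the first frame
    for i in range(1, len(xs)):
        step = math.hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1])
        speed.append(step * fps)
    return dist, speed

def zone(x, start_len=22.0, track_len=88.0):
    """Classify an x position using the apparatus dimensions in the text."""
    if x < start_len:
        return "start_box"
    if x < start_len + track_len:
        return "track"
    return "goal_box"
```

Time spent in each compartment then follows from counting frames per zone and dividing by the frame rate.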
Fiber photometry. Experiment 2 (N = 12) used fiber photometry to study PVT and vSub during conflict. We expressed an AAV encoding gCaMP7f in the PVT or vSub and implanted a fiber optic cannula above the expression site. After reward training (see above), mice received 1 d of conflict training with 0.05 mA footshock. They were tested the following day with 0.05 mA footshock. During this test, mice were tethered to a single fiber optic patch cable attached to a rail that ran above the linear track, providing unhindered motion. Recordings were performed using Fiber Photometry Systems from Doric Lenses and Tucker Davis Technologies (RZ5P, Synapse). Excitation lights (465 nm Ca²⁺-dependent and 405 nm isosbestic control signal) emitted from LEDs (LEDC1-B_FC, LEDC1-405_FC; Doric Lenses), controlled via dual-channel programmable LED drivers (LEDD_4, Doric Lenses), were channeled into 0.39 NA, Ø400 µm core multimode prebleached patch cables via a Doric Dual Fluorescence Mini Cube (FMC2, Doric Lenses). Light intensity at the tip of the patch was maintained at 10-30 mW across sessions. Ca²⁺ and isosbestic fluorescence were measured using femtowatt photoreceivers (Newport, 2151). Synapse software controlled and modulated excitation lights (465 nm: 209 Hz; 405 nm: 331 Hz), and demodulated and low-pass filtered (3 Hz) transduced fluorescence signals in real time via the RZ5P. Synapse/RZ5P also received timestamping TTL signals from EthoVision.
Electrophysiology. Experiment 3 (N = 6) used electrophysiology. We expressed an AAV encoding the eNpHR3.0 in the PL using the procedures described above and made whole-cell recordings from PVT neurons while electrically evoking activity in the PL→PVT pathway.
Electrical stimulation was provided by a Constant Voltage Isolated Stimulator (Digitimer), delivered through a borosilicate glass pipette filled with standard aCSF and with the tip broken to allow greater electrical access. The stimulating electrode was positioned above projecting fluorescent PL→PVT fibers around the PVT. Stimulator and cell locations were determined from live slices with the aid of a wide-field microscope (Zeiss Axio Examiner D1) equipped with 2.5× (0.075 NA) and 5× (0.16 NA) objectives. eYFP was visualized with a 470/40 excitation filter, 525/50 emission filter, and 495 dichroic filter. Optogenetic inhibition of these fibers was simultaneously delivered through the 20× objective. Electrophysiological recordings were amplified using a Multiclamp 700B amplifier (Molecular Devices) filtered at 6 kHz and digitized at 20 kHz with a Digidata1440A (Molecular Devices). Recordings were controlled and analyzed offline using Axograph (Axograph). The locations of all recorded cells were mapped according to the Mouse Brain Atlas (Paxinos and Watson, 2019). The liquid junction potential (~9 mV) was not compensated for.
Electrically evoked currents were investigated with 99 V pulses (between 120 and 160 ms) delivered in 10 s intervals while holding the cell at −70 mV. Optogenetic inhibition occurred every second episode, to allow comparison between the inhibited and the noninhibited electrically stimulated postsynaptic currents. The protocol was repeated 100 times.
Optogenetics. Experiment 4 (N = 12) used optogenetics to study the causal role of the PL→PVT pathway in behavior under conflict. We expressed an AAV encoding the eNpHR3.0 or eYFP in the PL using the procedures described above and implanted a fiber optic cannula above the PVT to inhibit the PL→PVT pathway. After reward training (see above), mice received 1 d of conflict training with 0.05 mA footshock (see above). They were tested the following 2 d under conflict (0.05 mA). During both tests, mice were tethered to a single fiber optic patch cable attached to the rail that ran above the linear track, providing unhindered motion, and which connected to 625 nm LEDs (Doric Lenses) controlled by EthoVision. During one test (Off), there was no optical stimulation. During a second test (On), continuous 625 nm optical stimulation (10-12 mW/mm² measured at the tip of an unimplanted fiber) was delivered only when mice were located within 8 cm of the goal.
Fiber placements and AAV expression were determined via immunohistochemistry and native fluorescence. An eGFP antibody was used to detect AAV-expressing cells. Sections were washed with PB solution (0.1 M PB, pH 7.4), 50% ethanol, 50% ethanol with 3% hydrogen peroxide, and then 5% normal horse serum (NHS) in PB for 30 min each. The sections were then incubated for 48 h in chicken antiserum against eGFP (1:2000; Invitrogen, catalog #A10262, RRID:AB_2534023) in 2% NHS-PBTx (0.2% Triton X-100 in PB) with 0.1% sodium azide at room temperature. After washing in PB, sections were incubated in biotinylated donkey anti-chicken (1:2000; Jackson ImmunoResearch Laboratories; catalog #703-035-155, RRID:AB_10015283; 24 h at room temperature) in 2% NHS-PBTx. The sections were washed and incubated in avidin-biotinylated HRP complex (Vector Elite kit: avidin and biotin, each 6 µl/ml; Vector Laboratories) in PBTx. Then, the sections were washed in PB and 0.1 M acetate buffer (pH 6.0) and incubated (15 min) in a DAB solution containing 0.1% 3,3-diaminobenzidine, 0.8% D-glucose, and 0.016% ammonium chloride. Immunoreactivity was catalyzed by the addition of 0.2 µl/ml glucose oxidase (24 mg/ml, 307 U/mg; Sigma-Aldrich). Tissue was washed with PB and mounted onto gelatinized slides. Slides were left to dry and then cover-slipped. AAV expression and cannula placements were verified using light and fluorescent microscopy. Animals were excluded from analyses if fiber tip and AAV expression could not be confirmed as colocalized.
Linear ballistic accumulator (LBA) modeling. The LBA is an exemplar accumulation model of decision-making and the simplest, complete model of choice with an analytical solution. Choice in the LBA depends on five parameters (v, s, A, b, t_0), where v is the accumulation rate for each response option (sampled on each trial from a normal distribution with mean v_i and SD s_i), s is the between-trial variation in v, A is the maximum starting point of the accumulation process (the starting point is sampled on each trial from a uniform distribution between 0 and A), b is the amount of evidence needed to make a response, and t_0 is the nondecision time (perceptual and motor processing) (Brown and Heathcote, 2008). Inferences about decision-making processes from LBA are similar to other sequential sampling models with the key advantage that the LBA can scale to any number of response options.
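To make the generative process concrete, the following Python sketch simulates single LBA trials from the five parameters described above. It is illustrative only and is not the fitting code used in the study. The function name is hypothetical, and redrawing a trial when no accumulator has a positive rate is an assumption adopted here for simplicity.

```python
import math
import random

def simulate_lba_trial(v, b, A, t0, s=1.0, rng=random):
    """Simulate one LBA trial; returns (index of winning response, RT)."""
    while True:
        times = []
        for v_i in v:
            start = rng.uniform(0.0, A)   # starting evidence ~ U(0, A)
            drift = rng.gauss(v_i, s)     # trial-specific accumulation rate
            # An accumulator with a non-positive rate never reaches b.
            times.append((b - start) / drift if drift > 0 else math.inf)
        fastest = min(times)
        if fastest < math.inf:
            return times.index(fastest), t0 + fastest
        # Assumption: redraw the trial if no accumulator can finish.
```

For example, with `v = [3.0, 1.0]` (approach, avoid), the first accumulator wins on most simulated trials, and raising `b` lengthens response times for both outcomes, which is the sense in which a higher threshold corresponds to greater caution.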
Following Annis et al. (2017), we set s to a constant value (1) and assumed that accumulation rate priors for each response were truncated normal distributions (mean = 2, SD = 1), with a uniform prior on nondecision time (0, 1), a truncated normal prior on the maximum starting evidence A (mean = 0.5, SD = 1), and a relative threshold, k, drawn from a truncated normal distribution (mean = 0.5, SD = 1), from which we could derive b as k + A. Response caution was then defined as b − A/2. We used a Hamiltonian Markov Chain Monte Carlo (HMCMC) algorithm (warmup = 1000; iteration = 2000; thinning = 1; δ = 0.8) to obtain posterior distributions. Three chains were run to evaluate convergence with a Gelman-Rubin criterion of R̂ < 1.1 (Gelman and Rubin, 1992) and an effective sample size (Neff) > 100 (Gelman et al., 2013). All analyses were performed using R (version 4.0.2) (R Development Core Team, 2017) via R Studio (version 1.3.959) and the RStan package (Hoffman and Gelman, 2014; Stan Development Team, 2015).
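The derived quantities and the convergence check can be sketched as follows. This Python example is a minimal illustration, not the RStan code used in the study: it computes the threshold b and response caution from k and A, and implements the classic Gelman-Rubin statistic for a single parameter across chains. Current Stan releases report a split-chain variant, so values may differ slightly from this version; the function names are hypothetical.

```python
import statistics

def threshold_and_caution(k, A):
    """Derive the threshold b = k + A and response caution b - A/2."""
    b = k + A
    return b, b - A / 2.0

def gelman_rubin(chains):
    """Classic R-hat for one parameter; chains are equal-length lists."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5
```

Chains that have mixed well give values close to 1, whereas chains stuck in different regions of the posterior inflate the between-chain term and push the statistic above the 1.1 criterion.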
Experimental design and statistical analyses. Data in figures are represented as individual data points overlaid with mean ± SEM unless otherwise stated. The only criteria for inclusion in final analyses were correct AAV and/or fiber placements determined after histology. Group numbers for each experiment are indicated in Results. Inferential statistics were based on within-subjects t tests or Wilcoxon signed-rank tests, ANOVA, curve-fitting, and multiple regression. All analyses were conducted using GraphPad Prism version 8.4.2, MATLAB, SPSS 25, and the Psy statistical package.
Before locomotor behavior was analyzed, frames with no or poor tracking were identified and replaced using linear interpolation. Then, for each frame, a custom script used the x-center coordinates to determine whether the mouse was moving (≥1.5 cm uninterrupted) toward the Start box, the Goal box, or paused (defined as no movement for ≥90 ms followed by movement of ≥1.5 cm). Only pauses that occurred on the track (i.e., outside the start or Goal box) were considered for analysis and modeling.
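A simplified version of this procedure is sketched below in Python. It is not the custom script used in the study. It assumes that lost frames are marked as `None`, that the first and last frames are tracked, that x increases toward the goal, and that "no movement" means a frame-to-frame displacement below a small tolerance (`eps_cm`, a hypothetical parameter). For brevity it scores the net displacement of the movement run that follows each pause and does not check for direction reversals within that run.

```python
import math

def interpolate_gaps(xs):
    """Linearly interpolate None entries between tracked frames."""
    xs = list(xs)
    known = [i for i, x in enumerate(xs) if x is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            xs[i] = xs[a] + (i - a) / (b - a) * (xs[b] - xs[a])
    return xs

def find_pauses(xs, fps=30.0, min_pause_s=0.09, min_move_cm=1.5, eps_cm=0.1):
    """Return (onset_frame, offset_frame, outcome) for each pause."""
    min_steps = math.ceil(min_pause_s * fps)
    # still[j] is True when the animal did not move between frames j and j+1
    still = [abs(xs[j + 1] - xs[j]) < eps_cm for j in range(len(xs) - 1)]
    pauses, j = [], 0
    while j < len(still):
        if not still[j]:
            j += 1
            continue
        onset = j
        while j < len(still) and still[j]:
            j += 1
        offset = j                      # last stationary frame
        k = j
        while k < len(still) and not still[k]:
            k += 1                      # end of the movement run
        moved = xs[k] - xs[offset]
        if offset - onset >= min_steps and abs(moved) >= min_move_cm:
            pauses.append((onset, offset, "approach" if moved > 0 else "avoid"))
    return pauses
```

The response time for each decision is then `(offset - onset) / fps`, and pauses can be filtered to those whose position lies on the track proper.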
Electrophysiological analyses were performed using Axograph (Axograph). Data were excluded from analysis if >500 pA was required to maintain the neuron at -65 mV. Passive properties, such as input resistance and membrane capacitance, were calculated from injection of small, hyperpolarizing pulses (-5 mV) in voltage clamp at -65 mV. Membrane time constant was determined by fitting an exponential to the voltage deflection caused by a small hyperpolarizing current (-5 to -20 pA; 600 ms).
For fiber photometry, Ca²⁺-dependent (465 nm-related) and isosbestic (405 nm-related) signals and event timestamps were extracted into MATLAB. The isosbestic signal was linearly regressed onto the Ca²⁺-dependent signal to create a fitted isosbestic signal, and a normalized fluorescence change score (dF/F) was calculated using the standard formula: (Ca²⁺-dependent signal − fitted isosbestic)/fitted isosbestic. Phasic neural activity change around pauses was determined by collating dF/F around pause events (−5 s to 5 s around pause onset or offset, baselined to −5 s to −2.5 s before event). To determine significance of activity change, 95% CIs around activity kernels were derived via bootstrapping (Jean Richard dit Bressel et al., 2020). Bootstrapped means were obtained by randomly resampling from subject mean waveforms with replacement (1000 iterations). CI limits were derived from 2.5 and 97.5 percentiles of bootstrap distribution, expanded by a factor of √(n/(n − 1)). A significant transient was identified as a period in which CI limits did not contain 0 (baseline) for at least 1/3 s (low-pass filter window). To assess the relationship between neural activity and LBA parameters, mean dF/F during pauses at the three runway locations were calculated per subject and correlated against estimates of LBA parameters per subject for these locations.
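The two core computations can be sketched in Python as follows. This is an illustration rather than the MATLAB pipeline used in the study. The percentile limits use a simple nearest-rank rule, and the expansion factor is applied to the distance of each limit from the sample mean; both choices are assumptions, as are the function names.

```python
import math
import random
import statistics

def fit_isosbestic(iso, sig):
    """Least-squares line mapping the 405 nm signal onto the 465 nm signal."""
    mi, ms = statistics.fmean(iso), statistics.fmean(sig)
    sxx = sum((x - mi) ** 2 for x in iso)
    sxy = sum((x - mi) * (y - ms) for x, y in zip(iso, sig))
    slope = sxy / sxx
    return [slope * (x - mi) + ms for x in iso]

def delta_f_over_f(iso, sig):
    """dF/F = (signal - fitted isosbestic) / fitted isosbestic."""
    fitted = fit_isosbestic(iso, sig)
    return [(s - f) / f for s, f in zip(sig, fitted)]

def bootstrap_ci(waveforms, n_boot=1000, seed=0):
    """95% CI per time point from subject mean waveforms (one list each)."""
    rng = random.Random(seed)
    n, n_t = len(waveforms), len(waveforms[0])
    expand = math.sqrt(n / (n - 1))
    boots = []
    for _ in range(n_boot):
        sample = rng.choices(waveforms, k=n)   # resample subjects
        boots.append([sum(w[t] for w in sample) / n for t in range(n_t)])
    lower, upper = [], []
    for t in range(n_t):
        col = sorted(b[t] for b in boots)
        mean = sum(w[t] for w in waveforms) / n
        lo = col[int(0.025 * (n_boot - 1))]
        hi = col[int(0.975 * (n_boot - 1))]
        lower.append(mean + (lo - mean) * expand)
        upper.append(mean + (hi - mean) * expand)
    return lower, upper
```

A time point would then count toward a significant transient when `lower[t] > 0` or `upper[t] < 0`, with the additional requirement that such points form a consecutive run lasting at least 1/3 s.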

Microstructure of behavior under conflict
In Experiment 1, we trained thirsty mice (n = 8) to run a linear track to obtain liquid sugar reward from a Goal box; and as expected, we found that latencies to retrieve the reward decreased across training (F(3,21) = 11.56, p = 0.002) (Fig. 1a,b). Conflict was then introduced by electrifying the floor of the Goal box, with current intensity increasing across days (0.05, 0.075, 0.1 mA). Under these conditions, mice learn reward and danger (Hull, 1938; Miller, 1944), which are retrieved via hippocampal awake replay (Wu et al., 2017) to support approach or avoidance decisions across the track. As expected, running speeds decreased (F(3,21) = 8.431, p = 0.0001), whereas time taken to obtain reward increased (F(3,21) = 75.21, p < 0.0001) across days of conflict training.
Visualization of individual trajectories showed that, in the absence of conflict, mice would proceed directly along the track to the goal and consume the reward (Fig. 2b). However, under conflict, mice exhibited bistable oscillations between the start and the end of the track (Fig. 2b). These oscillations were interrupted by pauses, ranging from milliseconds to seconds in duration. During these pauses, mice would rear, show lateral head scanning movements, and investigative sniffing (Wu et al., 2017; Thompson et al., 2018). These pauses were observed across the track, but they peaked at the start of the track and at the end of the track just before the goal (Fig. 2c), consistent with pauses reflecting active sampling of the environment to prompt individual approach or avoid decisions (Redish, 2016; Bach and Dayan, 2017; Mobbs et al., 2020). Pauses increased as shock intensity increased (F(1,7) = 47.957, p = 0.0001) (Fig. 2c, inset).
By identifying each pause, we could ask how each approach-avoid decision was resolved. There were notable spatial biases in decision outcomes. Approach decisions dominated across most of the track but were replaced by avoidance decisions closer to the goal (Fig. 2c). In the absence of danger, there were few decisions, and these were resolved in favor of approach. In the presence of danger, the frequency of decision-making increased and the approach bias was lost (Fig. 2c, inset: 0 mA vs shock, F(1,17) = 10.265, p = 0.015).
Response times (RTs) to choose approach or avoid after pause onset were measured because RTs are the most robust and widely used measure of decision-making efficiency (Laming, 1968). RTs were positively skewed, log-normal functions (R² = 0.99, F(1,122) = 1312, p < 0.0001) with linear coefficients of variation (R² = 0.93, F(1,6) = 80.59, p < 0.0001) (Fig. 2d). k-means clustering identified distinct patterns of decision-making across the track. Pauses within 5 cm bins were collated, and median location, total number of pauses, median pause duration, and % approach versus avoid decisions were used as inputs. Silhouette values for three clusters were positive (mean = 0.694, minimum = 0.243), indicating location thresholds at 5 cm from goal and 15 cm from start within the linear track, generating three zones of approach-avoid decision-making on the track itself (i.e., excluding start and Goal box): start zone (0-15 cm), mid zone (15-83 cm), and goal zone (83-88 cm). Approach-avoid decisions at the goal zone were most difficult, taking the longest time (main effect location: F(2,14) = 24.86, p = 0.0014; start vs goal: t(7) = 4.564, p = 0.026; start vs mid: t(7) = 5.555, p = 0.0009) (Fig. 2e). There was also a trade-off between the speed of decision-making and the safety of choice. Avoid decisions leading to safety were slower (i.e., more difficult) than approach decisions (Fig. 2f) (t(7) = 6.395, p = 0.0004).

Cognitive modeling of choice under conflict
These decision-making dynamics (log-normal RT distributions, RTs with linear coefficient of variation, and trade-offs between decision speed and decision outcome) are shared with speeded choice in human and nonhuman primates (Shadlen and Shohamy, 2016). They can be explained by sequential sampling models that decompose choice into its latent cognitive mechanisms (Wagenmakers and Brown, 2007; Brown and Heathcote, 2008; Forstmann et al., 2016; Ratcliff et al., 2016). Here, learned sources of reward and punishment are sampled from the environment and memory until an evidence threshold is reached and an approach or avoid choice is made.
The LBA (Brown and Heathcote, 2008) is one such sequential sampling model that permits a complete analytic solution for choices between any number of alternatives (Fig. 2g). We used Bayesian estimation via HMCMC to fit the LBA to each animal's approach-avoid decisions from Experiment 1 and derive estimates of three decision-making parameters for each mouse: the rate of evidence accumulation for an approach decision (i.e., v_1, salience of reward), the rate of evidence accumulation for an avoid decision (i.e., v_2, salience of danger), and the threshold of evidence required to reach a decision (i.e., response or decision caution).
The LBA fit the behavioral data well, explaining both RT distributions and choice outcomes (Extended Data Fig. 2-1). LBA parameter estimation for each mouse across the three zones (start, mid, goal) showed that evidence accumulation and decision caution varied dynamically across the track. There was a 3 (zone) × 2 (reward vs danger salience) interaction (F(1,7) = 84.159, p = 0.00001). Reward salience increased across the track (Mid: t(7) = 8.134, p < 0.0001 vs avoid) but decreased significantly at the goal zone (t(7) = −5.555, p = 0.0005 vs avoid) (Fig. 2h), explaining the switch from approach to avoid decisions at the goal zone. There was also a significant increase in decision caution as mice approached the goal zone (F(1,7) = 35.544, p = 0.001) (Fig. 2i), showing that more evidence was required to reach a decision as danger loomed, explaining the increased RTs near the goal.

PVT but not vSub tracks approach-avoidance decisions
Two brain regions, the vSub, located in the temporal lobe (Gray, 1982; Gray and McNaughton, 2000; Ito and Lee, 2016; Marchant et al., 2016; Çavdaroğlu et al., 2021), and the PVT, located in the midline thalamus (Choi and McNally, 2017; Zhu et al., 2018; Choi et al., 2019; Engelke et al., 2021; Ma et al., 2021), have been implicated in motivational conflict, and so could contribute to these decision-making dynamics. Yet how activity in these brain regions relates to approach-avoidance decision-making remains poorly understood. To address this, in Experiment 2, we infused an AAV encoding the calcium (Ca²⁺) sensor gCaMP7 (Dana et al., 2019) into the vSub or PVT and implanted a fiber optic cannula above these regions. We then used fiber photometry to record vSub (n = 5) or PVT (n = 6) population Ca²⁺ signals of mice during decision-making under conflict (Fig. 3a). We assessed the spatial profile of these Ca²⁺ signals across the track and the relationship between these signals and individual approach-avoid decisions. Fiber tips were located in PVT or vSub, but gCaMP expression extended beyond these regions. Nonetheless, given the presence of strong fluorescence immediately ventral to the fiber tips, we expect the PVT and vSub to have the greatest contribution to the signals recorded.
In contrast, PVT ΔF/F showed a spatial bias, with significant increases in ΔF/F only at the goal zone (Fig. 3b,c) (repeated-measures ANOVA, F(2,15) = 40.62, p < 0.0001; vs 0% ΔF/F: start: t(5) = 1.645, p = 0.1608; mid: t(5) = 1.397, p = 0.2212; goal: t(5) = 9.722, p = 0.002). PVT ΔF/F was unrelated to individual approach-avoid decisions if spatial location was ignored (Fig. 3d; 95% CIs). Unlike vSub, PVT ΔF/F after choices in the start zone did not differentiate between approach decisions to stay on the track versus avoid decisions to return to the safety of the Start box (Fig. 3e; 95% CIs). Instead, PVT ΔF/F selectively increased during approach-avoid decisions at the goal zone (Fig. 3f; 95% CIs) but not elsewhere on the track, and there were significant reductions in ΔF/F during approach-avoid decisions at the start zone.
How do vSub and PVT neural dynamics relate to approach-avoid decision-making dynamics? To answer this, we first used HMCMC to derive LBA decision-making parameters (reward salience, danger salience, decision caution) for each mouse during approach-avoid decisions across the three track zones (start zone, mid zone, goal zone), and then we correlated these LBA decision-making parameters for each mouse with their respective vSub and PVT ΔF/F during approach-avoid decisions. We found that vSub ΔF/F was unrelated to the components of approach-avoid decisions (Fig. 3g) (all R² < 0.20, all p > 0.05). In contrast, there were strong fits between PVT ΔF/F and each LBA decision-making parameter (Fig. 3g) (approach R² = 0.68, F(1,16) = 33.59, p = 0.0001; avoid R² = 0.28, F(1,16) = 5.99, p = 0.0263; caution R² = 0.45, F(1,16) = 13.32, p = 0.0022). Changes in PVT Ca²⁺ dynamics were most strongly associated with the increases in decision caution and reductions in reward salience as animals approached the goal. Moreover, these LBA decision-making dynamics could be applied within a regression model to accurately predict PVT ΔF/F across the track (Fig. 3h).

A prelimbic → PVT pathway controls decision caution
These findings show that PVT Ca²⁺ dynamics closely track approach-avoidance decisions during motivational conflict but not how PVT contributes to these decisions. PVT ΔF/F covaried most strongly with dynamic reductions in reward salience and with increases in decision caution as mice approached the dangerous goal. So, PVT could contribute to these changes in reward salience (Zhu et al., 2018; Campus et al., 2019), response caution, or both. The PL is critical for approach-avoid decision-making (Verharen et al., 2019; Kyriazi et al., 2020; Fernandez-Leon et al., 2022) and a major source of excitatory glutamatergic inputs driving PVT neuronal activity (Vertes, 2002; Li and Kirouac, 2012; Otis et al., 2019). We hypothesized that the role of PVT likely depended on this prefrontal input.
First, we confirmed that we could photoinhibit the PL→PVT pathway. In Experiment 3, we expressed an AAV encoding the inhibitory opsin halorhodopsin (eNpHR3.0) in the PL (Fig. 4a) and made whole-cell recordings from PVT neurons (N = 26) while evoking EPSCs using a stimulating electrode positioned above fluorescent PL fibers located around the anterior edge of the PVT. Electrical stimulation is not selective for PL fibers, and the PL is one of several brain regions providing excitatory input to PVT neurons, but we could evoke EPSCs in PVT neurons. Photostimulation reduced these EPSCs (amplitude [pA], Off vs On: t(25) = 4.990, p < 0.0001; percent change in amplitude, single mean t test vs 0%: t(25) = 4.270, p = 0.0002), confirming the efficacy of photoinhibition of PL→PVT terminals (Fig. 4b). The magnitude of the reduction was modest but consistent with the presumed fractional contribution of PL fibers to the nonselectively electrically evoked current in PVT neurons.
Next, in Experiment 4, we infused an AAV encoding eNpHR3.0 (n = 6) or the control eYFP (n = 6) into PL (Fig. 4a). We implanted an optic fiber above PVT, allowing us to photoinhibit the PL→PVT pathway. After reward training and a single day of conflict training, mice were tested twice under conflict, once in the absence of photoinhibition (Off) and once when the PL→PVT pathway was inhibited via 625 nm light (On) (Fig. 4c). We photoinhibited the PL→PVT pathway only when mice were at the goal zone, not elsewhere on the track, because fiber photometry had shown that the goal zone was the only location with significant increases in PVT Ca²⁺ transients. The 625 nm light was triggered by mouse entry to the goal zone, and the light remained on for the duration of the visit. If increases in PVT activity are important for dynamic reductions in reward salience, then inhibition should prevent these reductions and bias approach-avoid decisions toward approach at the goal zone.
PVT and vSub dynamics during conflict. a, AAV encoding gCaMP7F targeted to PVT or vSub. Fiber optic cannula implanted above injection site. Representative gCaMP7 expression and all fiber optic tip locations (white bars) in vSub and PVT. Distances in millimeters from bregma. Illustration from www.biorender.com. b, Mean %ΔF/F across the track in 5 cm blocks during conflict (PVT n = 6, vSub n = 5). c, Mean and SEM %ΔF/F for start, mid, and goal zone locations (excluding Start and Goal boxes). d, Mean and SEM %ΔF/F at pause onset (0 s) across the track. Colored bars represent periods when 95% CI does not include 0% ΔF/F. e, Mean and SEM %ΔF/F at pause offset (0 s) by decision outcome at Start zone. f, PVT mean and SEM %ΔF/F at pause onsets by location. Colored bars represent periods when 95% CI does not include 0% ΔF/F. Gray bar represents significant difference between approach versus avoid via 95% CI. g, Relationship between LBA parameters for approach (reward salience), avoid (danger salience), response caution, and mean %ΔF/F during pauses across track zones fitted by individual mouse (colored points and lines) and overall (black dotted line) with overall R². Right, Box plots of individual mouse correlation coefficients between LBA parameters and ΔF/F in vSub and PVT. h, Linear model of PVT Ca²⁺ transients from LBA parameters. *p < 0.05.
In contrast, if increases in PVT activity are important for dynamic increases in decision caution as danger looms, then inhibition should prevent this increase in caution and reduce decision times at the goal zone.
Instead, inhibiting the PL→PVT pathway hastened decision speeds in the goal zone [...] safety when mice were making approach-avoid decisions (Fig. 4g) (Light Off, Approach vs Avoid: Wilcoxon T = −2.023, p = 0.043; Light On: Wilcoxon T = −0.943, p = 0.345). This was specific to the track location where photoinhibition occurred because there was no effect on approach versus avoid decision speeds in the mid zone (Light Off, Approach vs Avoid: Wilcoxon T = 0.677, p = 0.498; Light On: Wilcoxon T = −0.105, p = 0.917).
A corticothalamic pathway controls response caution. a, AAV encoding the inhibitory halorhodopsin (eNpHR3.0 n = 6) or eYFP (n = 6) was targeted to PL and a fiber optic cannula implanted above PVT. AAV expression (each mouse is shown at 15% opacity) in PFC and fiber optic tip location in PVT. Distances in millimeters from bregma. Illustration from www.biorender.com. b, Electrical stimulation of the PL→PVT pathway evoked postsynaptic currents in PVT neurons that were significantly reduced by photoinhibition of PL terminals in the PVT. c, Mice were tested under conflict in the absence (Off) or presence of photoinhibition (On), with photoinhibition delivered only at the goal zone. d, Mean and SEM number of visits to goal zone on the track and duration of stay per visit. e, Mean and SEM decision frequency and outcome at goal zone on tests with (On) and without (Off) photoinhibition. f, Mean and SEM RTs at goal and mid zones. g, Mean and SEM RTs at goal and mid zones for eNpHR3.0 mice on test without (Off) and with (On) photoinhibition. *p < 0.05.

Discussion
Decisions about whether to approach or avoid a food source while under the threat of danger and predation require balancing the competing demands of speed in decision-making with safety of choice. Premature decisions risk injury or death, whereas failures to decide in a timely manner risk goal loss. Here we studied how mice make these approach-avoid decisions. We show dynamic changes in approach-avoidance decisions that depend on both the proximity of danger and choice outcome. Most importantly, we show that mice trade off decision speed for decision safety when making approach-avoid decisions. We show that this trade-off between decision speed and decision safety is linked to a corticothalamic pathway from the PL to the PVT that dynamically adjusts decision caution as danger nears.

Approach-avoid decision-making under motivational conflict
Our findings show that decision-making under fundamental survival conditions in mice shares the same lawful features as speeded decision-making in human and nonhuman primates (Shadlen and Shohamy, 2016). Sequential sampling is a highly efficient, domain-general solution to the problem of fast, accurate decision-making that uses common mechanisms to explain decision outcome and decision time (Gold and Shadlen, 2002; Gluth et al., 2015; Forstmann et al., 2016; Shadlen and Shohamy, 2016; Bakkour et al., 2019). We show that decision-making during motivational conflict in mice has the hallmarks of a sequential sampling mechanism that is shared by human decision-making during word versus nonword recognition, visual hallucinations (Pearson and Brascamp, 2008), and in nonhuman primate motion perception (Shadlen and Newsome, 2001), among others.
Specifically, the spatial and temporal dynamics of behavior were well explained by the LBA model, an exemplar sequential sampling model (Donkin et al., 2011a, b). Formal cognitive modeling showed that there were dynamic changes in the salience of reward and danger across space, with reductions in reward salience as danger loomed. There were also strategic adjustments in response caution across space. Most caution was exercised closest to the goal, leading to longer decision times at the goal. Together, these dynamic changes in decision-making generated a bistable phenotype with mice oscillating between the relative safety of the Start box and the relative danger of the goal, and trading off speed in decision-making for safety of choice.
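The link between response caution and decision time can be illustrated with a minimal simulation of a two-accumulator LBA race. This is a sketch under stated assumptions, not the fitting procedure used here: the drift rates stand in for reward (approach) and danger (avoid) salience, the threshold stands in for response caution, and all parameter values, function names, and the handling of nonpositive rates are illustrative choices.

```python
import random
import statistics


def lba_trial(drifts, threshold, start_max, drift_sd, t0, rng):
    """Simulate one trial; return (choice, rt), or None if no accumulator finishes."""
    winner, fastest = None, float("inf")
    for choice, mean_drift in drifts.items():
        start = rng.uniform(0.0, start_max)     # random starting evidence
        rate = rng.gauss(mean_drift, drift_sd)  # trial-specific accumulation rate
        if rate <= 0:
            continue                            # this accumulator never reaches threshold
        finish = (threshold - start) / rate     # linear rise, no within-trial noise
        if finish < fastest:
            winner, fastest = choice, finish
    if winner is None:
        return None
    return winner, t0 + fastest                 # add non-decision time


def simulate(drifts, threshold, n_trials=2000, seed=1,
             start_max=1.0, drift_sd=0.3, t0=0.2):
    """Collect RTs per choice over n_trials completed trials."""
    rng = random.Random(seed)
    rts = {choice: [] for choice in drifts}
    done = 0
    while done < n_trials:
        result = lba_trial(drifts, threshold, start_max, drift_sd, t0, rng)
        if result is None:
            continue                            # resample trials with no finisher
        choice, rt = result
        rts[choice].append(rt)
        done += 1
    return rts


def mean_rt(rts):
    """Mean RT pooled over both choices."""
    return statistics.fmean([rt for values in rts.values() for rt in values])


# Hypothetical salience values; only the threshold differs between the two runs.
drifts = {"approach": 1.0, "avoid": 1.2}
low_caution = simulate(drifts, threshold=1.5)
high_caution = simulate(drifts, threshold=3.0)
print("mean RT, low threshold: ", round(mean_rt(low_caution), 3))
print("mean RT, high threshold:", round(mean_rt(high_caution), 3))
```

Because the threshold is the only parameter that changes, any difference in mean RT between the two runs reflects the amount of evidence required before a choice is made, which is the sense in which greater caution lengthens decision times.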
These findings underscore the utility of computational process models of choice, such as the LBA, for understanding motivated behavior. These models have been used to study neural correlates of sensory decision-making in nonhuman primates (e.g., Shadlen and Newsome, 1996; Gold and Shadlen, 2007; Brody and Hanks, 2016; Shadlen and Shohamy, 2016) but are yet to see widespread application to problems in motivation (Walters et al., 2019; McNally, 2021). Key advantages of this approach are that it allows a rich examination of behavior because of joint modeling of RTs as well as choice outcome and that it makes specific predictions about the structure of underlying neural processes (Johnson and Ratcliff, 2018). However, although our findings highlight the utility of these models for explaining behavior under motivational conflict, much remains to be learned. For example, we studied only male mice, so the nature and role of sex differences in approach-avoidance decision-making remain to be determined. The modeling approach taken here may be useful in identifying sex differences in underlying choice mechanics not otherwise obvious from behavior. In addition, although we show that these models can successfully describe behavior under relatively simple and predictable conditions, it remains unclear whether these models can explain behavior under more complex, less predictable conditions. Under complex conditions, other more sophisticated deliberation mechanisms involving memory exploration and information search may supplement or replace the processes described here (Redish, 2016; Walters et al., 2019; Hunt et al., 2021).
Regardless, many behavioral tasks assessing choice in the laboratory under notionally Pavlovian or instrumental conditions (e.g., Pavlovian to instrumental transfer; sign tracking vs goal tracking; natural vs drug reward; social vs drug reward) are conducted under conditions similar to those used here. The focus in these tasks is often on how environmental or neural manipulations affect choice outcome. However, as shown here, choice behavior even under simple conditions is richer than simply what is chosen. It can involve biases, dynamic changes in decision speed, and trade-offs between decision speed and decision outcome. The mechanisms for such choices could differ across tasks (e.g., Pavlovian vs instrumental), but the application of formal cognitive models to behavior in these tasks to jointly model choice outcome and RTs may provide a useful addition to traditional associative approaches in identifying computational similarities in information processing and their underlying circuit mechanisms across these distinct tasks (Redish et al., 2022).

Role of the corticothalamic pathway and vSub in motivational conflict
A top-down corticothalamic pathway controls strategic adjustments in decision caution during decision-making under motivational conflict. Strategic adjustments in decision or response caution are among the most elementary forms of cognitive control. We show here that increases in PVT activity, as inferred from increases in PVT Ca²⁺ transients, were selectively observed during approach-avoid decisions at the goal. There were no increases in PVT Ca²⁺ transients during the same behaviors at other locations on the track. Cognitive modeling indicated that these increases in PVT Ca²⁺ transients reflected an increase in the amount of evidence required to reach a decision, thereby increasing the time taken to choose between approaching or avoiding the dangerous goal. This increased evidence threshold drives a trade-off between the speed of choice and the safety of choice. Consistent with these model predictions, inhibiting the PL→PVT pathway prevented the increase in decision times at the dangerous goal and abolished the speed-safety trade-off.
In humans, neuroimaging studies link the trade-off between speed and accuracy in perceptual decision-making to corticostriatal circuits (Forstmann et al., 2008; Bogacz et al., 2010; Winkel et al., 2016), but ensemble recordings in rodent striatum have not consistently identified similar signals (Stott and Redish, 2014; Brody and Hanks, 2016). Instead, corticostriatal circuits have well-established roles in integrating stimulus value with action information to control value-based choice (Balleine and O'Doherty, 2010; Hannah and Aron, 2021; Weglage et al., 2021; Tang et al., 2022). The role we identify for PVT in decision caution is complementary to these corticostriatal choice mechanisms. PVT neurons have extensive, highly collateralized projections to ventral corticostriatal circuits (Dong et al., 2017). So, one possibility is that PVT broadcasts an evidence threshold across ventral corticostriatal value networks to control the speed of value-based decision-making.
The role of vSub in these processes is less clear. vSub is implicated in behavior under motivational conflict (Ito and Lee, 2016; Marchant et al., 2016; Çavdaroğlu et al., 2021). However, the role of vSub in conflict behavior here was distinct from that of PVT. Whereas PVT showed spatially and behaviorally selective increases in Ca²⁺ ΔF/F, vSub Ca²⁺ ΔF/F was elevated nonselectively across the track, including modest elevations during choices. Unlike PVT, there was no significant relationship between vSub Ca²⁺ ΔF/F and individual decision-making parameters. So, we consider it unlikely that vSub contributes directly to approach or avoid decisions. Instead, vSub may be linked to anxiety and arousal in this task. Gray and McNaughton (Gray, 1982; Gray and McNaughton, 2000) argued that approach-avoidance conflict recruits a behavioral inhibition system localized to the septohippocampal system. The behavioral inhibition system supports changes in arousal and attention. It remains recruited while conflict persists but not when avoidance behavior successfully removes the animal from the conflict situation. The profile of vSub ΔF/F reported here, increasing with proximity to the goal and decreasing after choices to avoid to the safety of home, is consistent with this (Gray, 1982; Gray and McNaughton, 2000).
In conclusion, our findings demonstrate that a discrete brain circuit involving the PVT and its prefrontal cortical input dynamically adjusts decision caution during motivational conflict, trading off decision speed for decision safety when danger is close. They identify the corticothalamic pathway as central to cognitive control during decision-making under conflict. PVT has been implicated in a variety of motivated behaviors, but a general account of its function remains elusive (Kirouac, 2015). Many PVT-dependent tasks involve choices between different, incompatible behaviors (McNally, 2021). These include choices between approach and avoid (Choi and McNally, 2017; Choi et al., 2019; Engelke et al., 2021), between different defensive behaviors (e.g., fight vs flight) (Ma et al., 2021), between approaching different sources of reward (e.g., sign vs goal tracking, Pavlovian to instrumental transfer) (Campus et al., 2019), and between persisting with or ceasing a behavior that no longer yields reward (Hamlin et al., 2009; Otis et al., 2017). Like the approach-avoidance choices studied here, these choices necessitate a trade-off between the speed and outcome of decision-making. Our finding that PVT controls this trade-off by determining the amount of caution exercised in making a choice provides a mechanism for cognitive control applicable across a range of behaviors and tasks. Moreover, it identifies PVT and its prefrontal cortical input as targets to understand and remediate the deficits in decision caution characteristic of unsafe or impulsive choices.