In predictable environments organisms often use habitual forms of behavior as a means to conserve cognitive resources (Bouton, 2021, Linnebank et al., 2018, Wood et al., 2002). When stressed, organisms also sacrifice costly forms of decision-making for habits to liberate cognitive resources for escaping perceived threats (Hermans et al., 2014). The dichotomy between habitual and goal-directed modes of decision-making is formalized in dual process theories of instrumental behavior, which impose a tension between a habitual system driven by stimulus–response associations, and a goal-directed system governed by action-outcome associations (Adams and Dickinson, 1981, Dickinson and Balleine, 1994). Stress induces a bias towards habitual control in instrumental conditioning experiments (Dias-Ferreira et al., 2009, Dougherty et al., 2024, Schwabe and Wolf, 2009, Schwabe and Wolf, 2010). However, emerging research has increasingly advanced the idea that goal-directed and habitual systems operate hierarchically and may both be involved in goal selection, planning, and response performance (Ballard et al., 2024, Balleine and Dezfouli, 2019, Du et al., 2022, Favila et al., 2024, Ferguson et al., 2024, Frölich et al., 2023, Morris and Cushman, 2019). These theories suggest that rather than directly competing for control over behavior, responses may be organized through collaborative integration of goal-directed and habitual processes.
Hierarchical theories suggest that habits are the result of sequentially performed actions becoming “chunked” into a functional unit with repeated performance (Balleine and Dezfouli, 2019, Dezfouli et al., 2014, Smith and Graybiel, 2013, Smith and Graybiel, 2016). Action chunking has been identified in various settings including motor skill learning, free-operant procedures, and spatial navigation (Dezfouli and Balleine, 2013, Dezfouli and Balleine, 2019, Halbout et al., 2019, Keele, 1968, Nissen and Bullemer, 1987, Pew, 1966, Smith and Graybiel, 2013, van Elzelingen et al., 2022). By unitizing actions that are executed together into a sequence, a course of action can be evaluated based on the expected value of the sequence as a whole (e.g., ∑R1 + R2 + R3) (Dezfouli & Balleine, 2012). Once initiated, the action sequence is performed to completion and hence becomes insensitive to changes within the sequence. Functionally, this reduces time spent in deliberation over the outcome of each action within the sequence. Such a process captures characteristics often attributed to habits, such as reduced latency and more efficient performance (Garr & Delamater, 2019). A hierarchical controller selects between performing an action sequence versus using goal-directed evaluation of individual actions. Once selected, each action in a chunked action sequence serves as the eliciting stimulus for its successor. Thus, whether actions occur as discrete (goal-directed) responses or as a (habitual) chunked sequence can be assessed by introducing test trials in which accurate performance requires re-evaluation of the response after initiating the sequence (Daw et al., 2011, Dezfouli and Balleine, 2013, Dezfouli and Balleine, 2019).
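The idea that a chunked sequence is evaluated once, by the summed expected value of its component actions, can be illustrated with a minimal sketch. The function names, the option labels, and the reward numbers below are hypothetical and chosen only for illustration; they are not taken from the cited models or from the present experiment.

```python
from typing import Dict, List


def chunk_value(expected_rewards: List[float]) -> float:
    """Value of a chunked sequence: the sum of the expected rewards of its steps."""
    return sum(expected_rewards)


def choose_option(options: Dict[str, List[float]]) -> str:
    """Pick the option (single action or chunked sequence) with the highest summed value.

    Each option is evaluated once as a whole; the steps inside a chunk are
    not re-evaluated individually (illustrative assumption).
    """
    return max(options, key=lambda name: chunk_value(options[name]))


# Hypothetical expected rewards for each step of two candidate sequences.
options = {
    "R1->R2": [0.0, 0.8],
    "R2->R1": [0.0, 0.2],
}
best = choose_option(options)
```

Because the whole sequence is scored at the point of selection, nothing inside the chunk is reconsidered once it has been initiated, which is the property the probe-trial logic exploits.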
Hierarchical theories of habitual behavior suggest that chunked action sequences may underlie habitual behaviors and provide a parsimonious account of how habits can be integrated within overarching goal-directed control (Balleine and Dezfouli, 2019, Dezfouli and Balleine, 2012, Dezfouli and Balleine, 2013). However, little is known about how stress may affect hierarchical control over goal-direction and habit in instrumental behavior. Here, we extend this theoretical approach to investigate how acute stress influences the tendency toward goal-directed and habitual action sequences.
Stress can influence several types of computations involved in serial decision-making (Cremer et al., 2021, Otto et al., 2013, Otto et al., 2013, Radenbach et al., 2015, Raio et al., 2020), but no studies have tested whether stress promotes action chunking. The goal of this experiment was to test whether stress influences rats’ performance of habitual action sequences during serial decision-making. To accomplish this, we evaluated, within subjects, the effect of acute (60 min) restraint stress on rats’ performance in a two-stage decision-making task (Dezfouli & Balleine, 2019). The procedure is diagrammed in Fig. 1. Stage 1 consists of a choice between two response alternatives (R1 and R2) that initiate a sequence (R1 → R2 or R2 → R1). Rats learn to choose in stage 1 based on the outcome of recent trials (rewarded or non-rewarded) and select the stage 1 action likely to earn reward. In stage 2, rats must select the appropriate subsequent action that earns the reinforcer (i.e., R1 → R2 or R2 → R1). Rats can learn to make a correct stage 2 response according to either its regularity of reinforcement following the stage 1 choice, or its association with reward based on the stage 2 discriminative stimulus (Fig. 1A). The task can include test trials (probe trials) to evaluate whether action sequences are performed using an action chunking strategy or by evaluating each individual action. On these trials, the discriminative stimulus for the stage 2 action is switched to assess whether the rat completes the action sequence initiated in stage 1 or uses the discriminative stimulus to select the appropriate stage 2 action to earn a reward. Thus, the probe trials can distinguish between habitual action sequences, in which stage 1 and 2 actions are executed as a chunked unit without separate evaluation of consequences, and discrete successive choices, in which outcomes or discriminative stimuli separately guide action selection in each stage.
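The probe-trial logic described above can be sketched as a simple classification rule. This is a minimal illustration under stated assumptions, not the analysis code used in the experiment: the function names and the labels "chunked" and "discrete" are hypothetical, and the sketch assumes that on a probe trial the switched discriminative stimulus cues the stage 2 action that does not complete the sequence initiated in stage 1.

```python
RESPONSES = ("R1", "R2")


def sequence_completion(stage1: str) -> str:
    """Stage 2 action that completes the sequence started in stage 1 (R1 -> R2 or R2 -> R1)."""
    if stage1 not in RESPONSES:
        raise ValueError(f"unknown response: {stage1!r}")
    return "R2" if stage1 == "R1" else "R1"


def classify_probe_response(stage1: str, cued: str, stage2: str) -> str:
    """Label a probe-trial stage 2 response (hypothetical labels).

    "chunked":  the rat completed the initiated sequence and ignored the cue.
    "discrete": the rat followed the switched discriminative stimulus.
    """
    if cued not in RESPONSES or stage2 not in RESPONSES:
        raise ValueError("cued and stage2 must each be 'R1' or 'R2'")
    completion = sequence_completion(stage1)
    if cued == completion:
        # Assumption: a probe trial is one where the cue conflicts with the sequence.
        raise ValueError("not a probe trial: the cue agrees with the initiated sequence")
    return "discrete" if stage2 == cued else "chunked"


# Example: the rat starts with R1, the switched cue signals R1, and the rat presses R2.
label = classify_probe_response(stage1="R1", cued="R1", stage2="R2")
```

With only two response alternatives, every probe-trial response falls into one of the two categories, so the proportion of "chunked" responses across probe trials would give a simple index of sequence completion.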
We hypothesized that stress would increase the hierarchical goal-directed selection of habitual action sequences, and that stress would interfere with rats’ ability to flexibly select discrete actions after initiating a sequence.