During a decision task, such as selecting an item to purchase from among several available items, a person may invoke cognitive processes to identify a potential response choice. During such deliberation, the person periodically monitors their progress and plans further steps accordingly to optimize their subjective likelihood of success. This ability to monitor and control one’s cognitive processes is called Metacognition. In other words, Metacognition is a dynamic model of the ongoing cognitive processes with the goal of optimizing success likelihood given the cognitive constraints (Flavell, 1979, Nelson, 1990). To predict success likelihood, metacognitive processes construct a dynamic model utilizing ongoing cues such as the consistency of evidence supporting a given response choice (Ackerman, 2019). Then, utilizing the success likelihood prediction from the dynamic model along with other personal and task-specific objectives (e.g., prioritizing speed over accuracy), metacognitive processes indicate the cognitive resource allocation adequate to achieve current objectives (Efklides, 2006, Koriat et al., 2006). Studying this dynamic metacognitive process with process models could provide further insights into the underlying principles of how metacognition impacts response choice and response time.
Meta-reasoning processes are metacognitive processes associated with Reasoning tasks expected to take Longer Deliberation Time (RLDT) (e.g., see Box 2 in Ackerman & Thompson, 2017 for a list of such reasoning tasks). Meta-reasoning has been widely studied in both Cognitive Psychology and Artificial Intelligence (AI). In Cognitive Psychology, Meta-reasoning studies examine how metacognition evolves within a trial and how it informs metacognitive control strategies (Ackerman, 2014, Ackerman and Thompson, 2017, Dotan et al., 2018, Koriat et al., 2006, Metcalfe and Wiebe, 1987, Payne and Duggan, 2011, Rakefet Ackerman, 2017, Thompson and Johnson, 2014, Thompson et al., 2011). In AI, Meta-reasoning studies examine possible solutions to guide (multiple) agents to be rational about using available computational resources to (collectively) achieve a gain in the odds of success (Cox et al., 2022, Cox and Raja, 2011, Griffiths et al., 2019, Herrmann, 2023, Richardson and Ball, 2024). In Cognitive Psychology, theoretical Meta-reasoning frameworks, such as Metacognitive Reasoning Theory (Thompson, 2009, Thompson et al., 2011), the Diminishing Criterion Model (Ackerman, 2014), and the Cognizant Confidence hypothesis (Double & Birney, 2017), have been proposed to evaluate the impact of ongoing (or online) metacognition on response choice accuracy and time. However, further work is needed to identify computational guidelines that may facilitate discussion, comparison, explanation, and evaluation of process models describing empirically observed Meta-reasoning phenomena. Such computational guidelines could also benefit AI reasoning models in evaluating the effectiveness of their reasoning strategy selection mechanism (e.g., Gao et al., 2024).
Given a mathematical state space, a random walk process transitions from a current state to a next state based on a probability function. The probability function is typically utilized to model a hypothesized underlying dynamic of an empirically observed process outcome. A decision-making random walk model utilizes the random process to calculate the likelihood of reaching specific states designated as response choices. The steps taken to reach a designated state are utilized to model response time given a response choice. Random walk models have been widely utilized to study computational underpinnings of decision processes, e.g., Decision Field Theory (Busemeyer & Johnson, 2008), Markov Random-walk formulation of Drift Diffusion Model (MR-DDM) (Busemeyer and Bruza, 2012, Busemeyer et al., 2006), Quantum Random-Walk Models (QRM) (Busemeyer and Bruza, 2012, Busemeyer et al., 2006, Chen et al., 2022), and hybrid models combining diffusion and quantum dynamics (Busemeyer, Zhang et al., 2020, Epping et al., 2023, Fuss and Navarro, 2013, Huang et al., 2025, Kvam et al., 2021).
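The mechanics described above can be illustrated with a minimal sketch. It assumes a one-dimensional chain of evidence states whose two end states are absorbing and stand for the two response choices; the function name `absorbing_random_walk` and the parameters `n_states`, `p_up`, and `start` are illustrative choices, not taken from any of the cited models. The probability distribution over states is propagated exactly rather than sampled.

```python
def absorbing_random_walk(n_states=7, p_up=0.6, start=None,
                          tol=1e-12, max_steps=100_000):
    """Propagate a nearest-neighbour random walk until it is absorbed.

    State 0 ("lower") and state n_states - 1 ("upper") are the designated
    response-choice states. From every interior state the walk steps up with
    probability p_up and down with probability 1 - p_up (an assumed,
    state-independent probability function). `start` must be an interior state.
    Returns choice probabilities and mean steps conditional on each choice.
    """
    if start is None:
        start = n_states // 2
    dist = [0.0] * n_states          # probability mass over states
    dist[start] = 1.0
    prob = {"upper": 0.0, "lower": 0.0}
    weighted_steps = {"upper": 0.0, "lower": 0.0}

    for step in range(1, max_steps + 1):
        nxt = [0.0] * n_states
        for s in range(1, n_states - 1):      # only interior states move
            nxt[s + 1] += dist[s] * p_up
            nxt[s - 1] += dist[s] * (1.0 - p_up)
        # Mass arriving at an end state is absorbed: record it and remove it.
        for name, idx in (("lower", 0), ("upper", n_states - 1)):
            prob[name] += nxt[idx]
            weighted_steps[name] += step * nxt[idx]
            nxt[idx] = 0.0
        dist = nxt
        if sum(dist) < tol:                   # essentially all mass absorbed
            break

    mean_steps = {k: (weighted_steps[k] / prob[k]) if prob[k] > 0 else float("nan")
                  for k in prob}
    return prob, mean_steps


prob, mean_steps = absorbing_random_walk(n_states=7, p_up=0.6)
print(prob, mean_steps)
```

With seven states, a start in the middle, and `p_up = 0.6`, the probability of the upper choice matches the classical gambler's-ruin value of 27/35 (about 0.771). The conditional mean step counts play the role of response time given a response choice; mapping steps to seconds would require an additional time-scale assumption.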
Meta-reasoning studies usually instruct participants to respond based on their subjective decision-stopping criterion. Decision random walk models can implement either an external stopping procedure (a response triggered by an external cue, e.g., a beep) or an internal one (a response triggered by an internal cue, e.g., a subjective confidence threshold) (see Chapter 8 in Busemeyer & Bruza, 2012). Hence, an internally controlled random walk model could be utilized to describe empirically observed Meta-reasoning phenomena. Random walk models have also been utilized to track the expected confidence state at intermediate time intervals (Busemeyer & Bruza, 2012). Such a confidence trajectory could enable researchers to analyze the utility of subjective confidence evolution patterns as a decision-stopping criterion. For example, researchers could evaluate whether participants continue deliberating for a certain time period even with minimal improvement in confidence state (e.g., the Diminishing Criterion Model proposed in the meta-reasoning study of Ackerman, 2014), or continue deliberating until they experience a sudden spike in confidence (e.g., the sudden spike in empirical warmth ratings observed immediately before a response in an insight task; Metcalfe & Wiebe, 1987). Hence, these models could potentially enable meta-reasoning researchers to gain deeper insights into the impact of metacognition/confidence evolution patterns on response accuracy, especially in studies where explicit intermediate judgments were not feasible or desirable.
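The difference between the external and internal stopping procedures can be sketched with a single simulated trial. This is a simplified illustration under stated assumptions: evidence moves in unit steps, the external procedure is modeled as a fixed step deadline with no evidence bounds, and the raw evidence trajectory stands in for a confidence trajectory. The names `simulate_trial`, `drift`, `threshold`, and `deadline` are hypothetical.

```python
import random


def simulate_trial(drift=0.1, threshold=5, deadline=None,
                   max_steps=100_000, rng=None):
    """Simulate one trial of a +/-1 evidence random walk.

    deadline=None -> internal stopping: respond as soon as |evidence|
                     reaches `threshold` (a stand-in for a subjective
                     confidence criterion).
    deadline=k    -> external stopping: respond at step k (e.g., a beep),
                     whatever the evidence state is at that moment.
    Returns (choice, steps_taken, evidence_trajectory).
    """
    rng = rng or random.Random()
    p_up = 0.5 * (1.0 + drift)        # assumed mapping from drift to step bias
    evidence = 0
    trajectory = [evidence]
    limit = deadline if deadline is not None else max_steps

    for _ in range(limit):
        evidence += 1 if rng.random() < p_up else -1
        trajectory.append(evidence)
        if deadline is None and abs(evidence) >= threshold:
            break                     # internal criterion met

    if evidence > 0:
        choice = "A"
    elif evidence < 0:
        choice = "B"
    else:
        choice = rng.choice("AB")     # tie at an external deadline: guess
    return choice, len(trajectory) - 1, trajectory


rng = random.Random(1)
print(simulate_trial(threshold=5, rng=rng)[:2])   # internally controlled
print(simulate_trial(deadline=20, rng=rng)[:2])   # externally controlled
```

Under internal stopping the response time varies from trial to trial and the final evidence always sits at the threshold; under external stopping the response time is fixed and the final evidence varies. A time-dependent threshold could be substituted for the constant one to explore stopping rules that relax over the course of a trial, though that would be a further modeling assumption.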
MR-DDM suggests that once a stimulus is presented, a cognitive process begins to collect evidence for each of the provided response choices. The evidence is gathered to separate the signal (correct) response option from the noise (incorrect) response option. A response is then generated when a satisfactory amount of evidence has been collected for a given response choice. MR-DDM has also been extended to account for post-decisional reports of confidence judgment, as in the two-stage Dynamic Signal Detection model (Pleskac and Busemeyer, 2010, Yu et al., 2015) and the collapsing confidence boundary model (Moran et al., 2015). Recently, MR-DDMs have also been utilized to model RLDT response choice accuracy and time (Diederich & Oswald, 2016). In addition, Malaiya's (2025) PhD dissertation demonstrated the effectiveness of MR-DDM in estimating the expected confidence state in reasoning and planning trials with longer deliberation response times. Hence, MR-DDM can be further utilized to evaluate empirical Meta-reasoning phenomena in RLDT, especially in studies without explicit intermediate confidence judgments.
MR-DDM assumes that while evidence is being accumulated towards a choice, a participant is aware of the state of evidence, and the confidence judgments are readouts of the evidence status. In contrast, QRM suggests that a response (or state of evidence) is realized only at the time when a participant decides to elicit a response. To model response choice at a given response time, a measurement operation is performed on the Quantum state of the random walk at the response time (Busemeyer et al., 2006). To model intermediate responses, such as intermediate confidence judgments, a measurement operation is performed at every intermediate response time. The intermediate measurement mechanism of QRM has been successfully utilized to model the impact of intermediate response choices on a final response choice (Busemeyer et al., 2019, Kvam et al., 2015). QRM has also been successful in modeling response choice accuracy, corresponding intermediate and post-decisional confidence judgments, and response times of fast two-choice decision tasks (Busemeyer and Bruza, 2012, Busemeyer et al., 2006, Chen et al., 2022, Kvam et al., 2015, Pothos and Busemeyer, 2022), as well as causal reasoning tasks (Mistry et al., 2018, Trueblood and Busemeyer, 2012, Trueblood et al., 2017; also see Bruza et al., 2015, Pothos and Busemeyer, 2022 for a review). In addition, Malaiya's (2025) PhD dissertation demonstrated the effectiveness of QRM-estimated confidence states in explaining response patterns in conflict-based and high-difficulty reasoning trials. Hence, QRM could be further utilized to evaluate meta-reasoning phenomena in complex RLDT, especially in studies where explicit intermediate confidence judgments were not recorded.
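The measurement operation described above can be written compactly in generic quantum-probability notation. The symbols are assumptions introduced here for illustration and are not taken from the cited papers: \psi(t) is the amplitude vector over evidence states, H is a Hermitian Hamiltonian governing the walk, and P_R is the projector onto the states assigned to response R. The specific form of H is model-dependent and is left unspecified.

```latex
% State evolution, response probability at time t, and post-measurement state
\psi(t) = e^{-iHt}\,\psi(0), \qquad
\Pr(R \text{ at } t) = \lVert P_R\,\psi(t) \rVert^{2}, \qquad
\psi(t) \;\longmapsto\; \frac{P_R\,\psi(t)}{\lVert P_R\,\psi(t) \rVert}

% Final response R_2 at t_2 with and without an intermediate measurement at t_1,
% where U = e^{-iH(t_2 - t_1)}
\Pr(R_2 \mid \text{measured at } t_1)
  = \sum_{R_1} \lVert P_{R_2}\, U\, P_{R_1}\, \psi(t_1) \rVert^{2},
\qquad
\Pr(R_2 \mid \text{no measurement})
  = \Bigl\lVert \sum_{R_1} P_{R_2}\, U\, P_{R_1}\, \psi(t_1) \Bigr\rVert^{2}
```

The two expressions in the second line generally differ by cross (interference) terms, which is one way to see how an intermediate judgment can alter the final response distribution in a quantum walk, whereas a Markov random walk satisfies the law of total probability and predicts no such difference.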