Negative self-referent memory bias (the preferential memory for negative self-referent information) is a well-known symptom of depression and a risk factor for its development, maintenance, and recurrence. Evidence shows its potential as an add-on tool in clinical practice. However, it is unclear which self-referent memory bias measure(s) could be clinically relevant. Here, as a first step, we investigate which measures best differentiate current depression status and track depressive symptom severity most closely. The total sample (N = 956) from three (naturalistic) psychiatric cohorts with matched controls was divided into a current depression, remitted depression, and non-disordered control group. Self-referent memory bias task measures were calculated and the drift diffusion model (DDM) was applied to assess underlying components of the cognitive self-referent decision making process. Measures were compared between groups and linear regression models were applied to assess their association with depressive symptom severity. The number of negative endorsed words differentiated best between depression status while a combination of the number of positive endorsed words, self-referent negative memory bias, and positive drift rate was most strongly associated with depressive symptom severity. Our results give direction to the clinical implementation of this task. Its value in assessing, monitoring, and predicting depressive state and trait in clinical settings requires further investigation.
Introduction
Negative self-referent memory bias refers to the better and more frequent memory for self-referent negatively valenced information compared to neutral or positive information. It is a well-known aspect of and risk factor for depression (Gotlib & Joormann, 2010; Beck & Bredemeier, 2016; Marchetti et al., 2018; LeMoult & Gotlib, 2019) that extends beyond current depressive episodes (Joormann & Arditte, 2015; Everaert et al., 2022). This makes it a characteristic of both state and trait depression and a valuable depression marker or predictor. Interestingly, it may be considered a potential cognitive marker for depressive symptoms across the psychopathological spectrum (Duyser et al., 2020), as well as a predictor of diverse psychiatric problems (Fleurkens et al., 2025).
Self-referent memory bias is generally assessed with a computer task where positive and negative words are presented and participants have to indicate how well those words describe them. During this endorsement phase the material is encoded and during the recall phase that follows, participants are asked to retrieve this information. This brief and easy to implement task delivers a large number of possible outcome measures. While this illustrates the flexibility of the task, it also complicates the compatibility of findings and potential translation to clinical practice as there is currently no standard or consensus in research about which outcome measures are used.
For example, the number or proportion of positive and/or negative words endorsed as self-referent is frequently used as a measure of depressotypic self-schema, because negative words like ‘worthless’ activate the negative dysfunctional beliefs depressed individuals often have about themselves (Dozois & Dobson, 2001; Moulds et al., 2007; Romero et al., 2014). The number or proportion of positive and/or negative words recalled is used to measure affective memory bias without the self-referent aspect (Vrijsen et al., 2015; Hakamata et al., 2022). Most commonly, however, a self-referent memory bias index is calculated by combining information from the endorsement and recall phases, which can be done in different ways. The number of positive or negative endorsed and subsequently recalled words can be divided by the total number of endorsed words (Goldstein et al., 2015; Allison et al., 2021), which has the advantage that it controls for individual differences in endorsement rates. To correct for differences in recall rates, a division by the total number of recalled words is also possible. Alternatively, the number of positive or negative endorsed and recalled words can be divided by the total number of endorsed and recalled words (Bradley & Mathews, 1983; Gotlib et al., 2004).
The reaction times (RTs) during the endorsement phase can also be an insightful outcome measure to infer information about the decision-making process. RTs are currently mostly ignored as a source of clinically relevant information while they could be easily implemented in e-health tools. The faster endorsement of negative words as self-referent has been proposed as a characteristic of depression (McDonald & Kuiper, 1985), although the findings have been inconsistent (Bradley & Mathews, 1983; Dozois & Dobson, 2001; Gotlib et al., 2004). Extensively studied in psychology, the dynamics of decision making can be examined with the drift diffusion model (DDM; Ratcliff & Rouder, 1998). The DDM decomposes task responses, reaction times (RTs), and their distribution into distinct components of decision making and information processing. These can be used to draw conclusions about the cognitive processes underlying self-referent decision making. The DDM parameters are therefore considered more implicit, mechanistic measures in comparison to the measures indicating how many positive or negative words were considered self-referent. The model and its parameters, including a schematic representation, are further explained in the Methods section.
One specific DDM parameter, the drift rate, reflects the rate of information accumulation towards one of the decision options. In case of a self-referent memory bias task, it indicates how quickly and easily someone decides to endorse or not endorse a word as self-referent. It has therefore been proposed as a proxy for self-schema activation (Dainer-Best et al., 2018; Allison et al., 2021; Parker & Adleman, 2021). Drift rate has excellent convergent validity with endorsement (Disner et al., 2017). It has been shown that the number of words endorsed as self-referent and negative drift rate were most strongly associated with subclinical depressive symptom severity in healthy samples (Dainer-Best et al., 2018). In remitted depressed individuals, drift rate has also been associated with negative attention bias (Nagrodzki et al., 2023). Importantly, as drift rate incorporates information about the speed of the decision process, it may be a more sensitive and implicit measure compared to the relatively explicit measure of the number of positive or negative words endorsed as self-referent. Despite this converging evidence, it is still unknown if different DDM parameters can also distinguish current depression status or how they are related to depressive symptom severity in clinical samples.
There is growing evidence for self-referent memory bias as an important depression marker. In addition, within-subject changes in self-referent memory bias might be an early marker for (pharmacological) treatment effects (Harmer et al., 2009, 2017; Terpstra et al., 2023) and thereby a predictor of the course of depression and depressive symptom severity. For example, stronger negative self-referent memory bias in individuals with remitted depression predicted the onset of new depressive episodes within the next three years (LeMoult et al., 2017). Relatedly, in depressed individuals, stronger positive self-referent memory bias was associated with greater symptomatic improvement nearly nine months later (Johnson et al., 2007). Recently a study in a large naturalistic psychiatric sample showed that more negative memory bias predicted more psychiatric problems three and four years later, even when baseline psychiatric problems and depression were controlled for (Fleurkens et al., 2025). Combining this evidence with the need for more objective, mechanism-based diagnostic strategies to complement the current subjective self-report diagnostic tools in psychiatry (Rosenberg, 2006; Insel et al., 2010), now seems the time to push for a translation of these research insights into clinical practice. For example, the self-referent memory bias task could function as an easy add-on diagnostic tool to assess depression (vulnerability) or monitor and predict depression symptom severity. It could also help target interventions such as Cognitive Bias Modification, where negative memory biases are modified with the goal to alleviate depressive symptoms (Arditte et al., 2018; Vrijsen et al., 2018).
With the current study, we aim to contribute to this future implementation of self-referent memory bias in healthcare, especially as an add-on diagnostic tool. This requires an understanding of the different possible outcome measures and how they relate to depression diagnosis and symptoms. We therefore set out to investigate (1) how well the different self-referent memory bias outcome measures differentiate between current depression status (i.e., current depression, remitted depression, and no depression) and (2) how strongly these outcome measures are associated with depressive symptom severity. Because psychiatric multimorbidity is more the rule than the exception (Kessler et al., 2006; Plana-Ripoll et al., 2020; Ten Have et al., 2023) and research findings from naturalistic psychiatric samples are more useful for clinical reality, we used a large, accumulated dataset of N = 956, including two naturalistic cohorts with different types of psychiatric multimorbidity and non-disordered controls. We thereby aim to increase the generalisability of the findings.
Materials and Methods
Participants
This study uses data from 956 participants that were originally collected as part of three cohorts: two naturalistic psychiatric cohorts called ‘MIND-Set’ (n = 402; Van Eijndhoven et al., 2021) and ‘MATCH’ (n = 143; Koekkoek et al., 2016) from which we included the individuals with current and remitted depression, and a cohort of individuals with remitted depression named ‘Info in Genes’ (n = 411; Vrijsen et al., 2014). We also included the healthy controls from MIND-Set and Info in Genes (MATCH did not include healthy controls). The data were pooled and participants were divided into three groups: a current depression group (CD), a remitted depression group (RD), and a control group of individuals without current or past depression or other psychiatric diagnoses (no depression; ND). Due to the naturalistic nature of the MIND-Set and MATCH cohorts, one or more current comorbid psychiatric diagnoses were possible (see Table 1). A more detailed description and overview of each cohort can be found in the supplementary methods and Table S1, but specific information regarding e.g., the number of depressive episodes or onset was not available.
Table 1
Demographic information and current comorbid psychiatric diagnoses of the total sample and the depression subgroups (CD = current depression, RD = remitted depression, ND = no depression, i.e., non-disordered controls). SUD = substance abuse disorder, ADHD = attention-deficit/hyperactivity disorder, ASD = autism spectrum disorder. The group comparisons column shows the results of one-way ANOVA and chi-square tests comparing between CD, RD and ND
| | Total | CD | RD | ND | Group comparison |
|---|---|---|---|---|---|
| n | 956 | 236 | 534 | 186 | |
| From MIND-Set | 402 | 173 | 125 | 104 | |
| From MATCH | 143 | 63 | 80 | - | |
| From Info in Genes | 411 | - | 329 | 82 | |
| Age (mean, SD) | 42.5 (13.6) | 40.4 (13.9) | 43.9 (13.0) | 40.8 (14.3) | F(2,928) = 6.949, p < .001 |
| Gender (% female) | 59.8 | 55.5 | 61.6 | 60.2 | X2(2) = 2.550, p = .279 |
| Education level (a) | | | | | X2(2) = 86.425, p < .001 |
| Low (%) | 18.1 | 31.4 | 16.8 | 4.9 | |
| Middle (%) | 31.0 | 36. | 31.9 | 21.1 | |
| High (%) | 50.9 | 31.8 | 51.3 | 74.1 | |
| Comorbid anxiety (%) | 19.4 | 40.7 | 16.7 | - | |
| Comorbid SUD (%) | 10.9 | 26.3 | 7.9 | - | |
| Comorbid ADHD (%) | 10.8 | 16.9 | 11.8 | - | |
| Comorbid ASD (%) | 8.5 | 14.0 | 9.0 | - | |
a Education level is the highest level of education completed with a diploma and was calculated according to Stronks et al. (2013)
Diagnoses were determined by trained clinicians using validated and reliable guided diagnostic interviews that are frequently used in Dutch clinical practice. The Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-I; First et al., 2002; Lobbestael et al., 2011) was used to diagnose depression and anxiety in the MIND-Set and Info in Genes cohorts. The Measurements in the Addictions for Triage and Evaluation and Criminality (MATE-Crimi for DSM-IV; Schippers et al., 2011), the Diagnostic Interview for Adult ADHD (DIVA for DSM-IV; Kooij, 2010; Ramos-Quiroga et al., 2019), and the Dutch Interview for Diagnosing Autism Spectrum Disorders (NIDA for DSM-V; Vuijk, 2016; Vuijk et al., 2022) were used in the MIND-Set cohort to diagnose substance use disorder, ADHD, and ASD, respectively. The Mini International Neuropsychiatric Interview Plus (MINI Plus; Sheehan et al., 1997) was used for diagnosing the MATCH cohort.
Self-Referent Encoding Task
Task Description
In each cohort, self-referent memory bias was measured with an implicit learning computer task (Derry & Kuiper, 1981; Dobson & Shaw, 1987) consisting of three parts: (1) an endorsement phase, (2) a distraction task (Raven matrices or a symbol substitution task), and (3) a recall phase, see Fig. 1. During the endorsement phase, participants had to indicate on a five-point scale (in the MIND-Set study) or with ‘yes’ or ‘no’ (in the MATCH study) if each word described them. In the Info in Genes study, participants were asked to vividly imagine themselves in a scene with each word and then had to indicate on a five-point scale how well they were able to do so. Words yielding a ‘yes’ response or a score of 4 or 5 on the five-point scale were considered to be endorsed as self-referent. These are small implementation differences that still allow comparison between outcome measures. Positive and negative words had the same average length and the level of Dutch was similar for all words.
Fig. 1
Schematic representation of the Self-Referent Encoding Task across the different cohorts. During the self-referent endorsement phase, participants had to indicate for each word if (MATCH) or how well (MIND-Set) the word described them or they had to indicate how well they were able to vividly imagine themselves in a scene with that word (Info in Genes). The words were presented in Dutch, but English translations are provided here. ‘Gezellig’ is a typically Dutch word that cannot be translated into English, but is closest to ‘cosy’.
After the two-minute non-verbal distraction task, the recall phase immediately started. Participants were given three minutes to type in all the words they remembered from the endorsement phase. Guessing was encouraged and typographical errors were allowed as long as the intended word could unambiguously be recognised (e.g., “healthy” and “haelthy”). To account for primacy and recency effects, the first two and last two words were not used in the calculation of the outcomes that included the recall phase (following e.g., Gerritsen et al., 2011; Van Oostrom et al., 2012; Vrijsen et al., 2017).
Task Outcomes
For this study, we calculated and compared a broad selection of outcome measures from the self-referent memory bias task based on those used in literature, described in the Introduction. They were divided into outcome measures using data from the endorsement phase only (i.e., the number or proportion of positive or negative words endorsed as self-referential), the recall phase only (i.e., the number or proportion of positive or negative words recalled), or both phases (i.e., self-referent memory bias scores), see Table 2.
Table 2
The name, description, and range of the different self-referent memory bias task outcome measures
| Phase | Variable | Description | Range |
|---|---|---|---|
| Endorsement phase | Positive words endorsed | The number of positive words that were endorsed (rated 4 or 5) | 0–12 |
| Endorsement phase | Negative words endorsed | The number of negative words that were endorsed (rated 4 or 5) | 0–12 |
| Endorsement phase | Negative endorsement score (a) | The number of negative words endorsed divided by the total number of words endorsed | 0–1 |
| Recall phase | Positive words recalled | The number of positive recalled words | 0–10 (b) |
| Recall phase | Negative words recalled | The number of negative recalled words | 0–10 (b) |
| Recall phase | Negative recall score (a) | The number of negative words recalled divided by the total number of words recalled | 0–1 |
| Self-referent negative memory bias | Self-referent negative memory bias variation 1 (a) | The number of negative words endorsed and recalled divided by the total number of words endorsed and recalled | 0–1 |
| Self-referent negative memory bias | Self-referent negative memory bias variation 2 | The number of negative words endorsed and recalled divided by the total number of words endorsed | 0–1 |
| Self-referent negative memory bias | Self-referent negative memory bias variation 3 | The number of negative words endorsed and recalled divided by the total number of words recalled | 0–1 |
a The positive and negative scores are complementary and sum to 1, so a negative score of 0.33 automatically results in a positive score of 0.67. We therefore only included the negative scores.
b The first two and last two words were excluded in order to prevent primacy and recency effects, resulting in a range of 0–10 rather than 0–12
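To make the definitions in Table 2 concrete, the following is a minimal Python sketch of how the nine outcome measures could be computed for one participant. The `Trial` structure, the `buffer` flag for the excluded first and last words, and all function names are hypothetical and not taken from the study's analysis scripts. The sketch also assumes that the denominators of the three memory bias variations are counted over the non-buffer words only, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trial:
    word: str
    valence: str          # "positive" or "negative"
    endorsed: bool        # endorsed as self-referent in the endorsement phase
    recalled: bool        # typed in during the recall phase
    buffer: bool = False  # first two / last two words (primacy and recency)


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the score is undefined."""
    return numerator / denominator if denominator > 0 else None


def outcome_measures(trials: list[Trial]) -> dict:
    """Compute the Table 2 measures for one participant (illustrative only)."""
    # Endorsement-phase measures use all presented words.
    pos_end = sum(t.endorsed for t in trials if t.valence == "positive")
    neg_end = sum(t.endorsed for t in trials if t.valence == "negative")

    # Measures that involve recall skip the buffer words (assumption: this
    # also applies to the denominators of the memory bias variations).
    scored = [t for t in trials if not t.buffer]
    pos_rec = sum(t.recalled for t in scored if t.valence == "positive")
    neg_rec = sum(t.recalled for t in scored if t.valence == "negative")
    neg_both = sum(t.endorsed and t.recalled
                   for t in scored if t.valence == "negative")
    all_both = sum(t.endorsed and t.recalled for t in scored)
    all_end = sum(t.endorsed for t in scored)
    all_rec = sum(t.recalled for t in scored)

    return {
        "positive_endorsed": pos_end,
        "negative_endorsed": neg_end,
        "negative_endorsement_score": ratio(neg_end, pos_end + neg_end),
        "positive_recalled": pos_rec,
        "negative_recalled": neg_rec,
        "negative_recall_score": ratio(neg_rec, pos_rec + neg_rec),
        "memory_bias_1": ratio(neg_both, all_both),
        "memory_bias_2": ratio(neg_both, all_end),
        "memory_bias_3": ratio(neg_both, all_rec),
    }
```

Returning `None` for an empty denominator keeps undefined scores distinct from true zeroes, which matters for the proportion-based measures when a participant endorses or recalls very few words.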
Additionally, when a self-referent memory bias task has non-binary answer options, such as the five-point scales used in the MIND-Set and Info in Genes cohorts (Fig. 1), different cut-offs for when words are endorsed as self-referential can be (and are) used. Responses of 4 and 5 generally count as endorsed, but responses of 2 and higher have also been used to indicate endorsement (Stalmeier et al., 2021). For completeness and to explore different cut-offs, we also calculated more liberal measures where responses of 2 or higher were considered as endorsed as well as stricter measures where only a 5 counted as endorsed, see supplementary Table S2.
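The effect of the endorsement cut-off can be illustrated with a short Python sketch. The function name `is_endorsed`, the `CUTOFFS` labels, and the example ratings are hypothetical. The only elements taken from the text are the thresholds themselves (2 or higher, 4 or higher, and 5 only) and the yes/no response format.

```python
def is_endorsed(response, cutoff=4):
    """Map a raw response to endorsed (True) or not endorsed (False).

    response: "yes"/"no" for the binary format, or an integer from 1 to 5
    for the five-point scale. cutoff: lowest rating that counts as endorsed.
    """
    if isinstance(response, str):
        return response.strip().lower() == "yes"
    return response >= cutoff


# Hypothetical labels for the three cut-offs described in the text.
CUTOFFS = {"liberal": 2, "standard": 4, "strict": 5}

# Illustrative ratings for six words from one participant.
ratings = [1, 2, 3, 4, 5, 5]

# Number of endorsed words under each cut-off.
endorsed_counts = {
    name: sum(is_endorsed(r, cutoff) for r in ratings)
    for name, cutoff in CUTOFFS.items()
}
```

The same six ratings yield a different number of endorsed words under each cut-off, so every downstream measure that depends on endorsement shifts with this choice.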
Depression Symptom Severity
Depression symptom severity data were available from the MIND-Set and Info in Genes cohorts. The Inventory of Depressive Symptomatology – Self Rating questionnaire (IDS-SR; Rush et al., 1996) and the Beck Depression Inventory (BDI-II; Beck et al., 1996) were used, respectively. Total scores were transformed into z-scores in order to conduct analyses across cohorts.
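As an illustration of the z-score transformation, the sketch below standardises scores within each cohort, so that scores from different questionnaires end up on a common scale. Standardising per cohort and using the sample standard deviation are assumptions of this sketch, since the text does not state how the z-scores were computed. The function name and the example scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_within_cohort(records):
    """Standardise scores separately per cohort.

    records: list of (cohort, score) tuples.
    Returns a list of z-scores in the same order as the input.
    Each cohort needs at least two scores that are not all identical.
    """
    by_cohort = defaultdict(list)
    for cohort, score in records:
        by_cohort[cohort].append(score)

    # Mean and sample standard deviation per cohort.
    stats = {c: (mean(v), stdev(v)) for c, v in by_cohort.items()}

    return [(score - stats[c][0]) / stats[c][1] for c, score in records]


# Illustrative (made-up) total scores on two different questionnaires.
example = [
    ("MIND-Set", 10), ("MIND-Set", 20), ("MIND-Set", 30),
    ("Info in Genes", 5), ("Info in Genes", 15),
]
z_scores = zscore_within_cohort(example)
```

After this step a z-score of 1 means one standard deviation above the participant's own cohort mean, regardless of which questionnaire was used.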
Drift Diffusion Model Parameters
The drift diffusion model (DDM) is a commonly used computational model that breaks down the dynamics of decision making into different parameters (Ratcliff & Rouder, 1998). It uses task responses, reaction times (RTs), and RT distributions as input and delivers several distinct parameters (listed below) that capture the cognitive process of information processing and decision making. The DDM is increasingly often applied to the self-referent memory bias task to provide a more mechanistic measure of self-referent decision making as opposed to the rather explicit measure of the number of positive or negative words endorsed as self-referent. Figure 2 schematically illustrates the DDM and its parameters.
Beginning at the relative starting point (zr), evidence is accumulated until one of the two decision boundaries, in this case endorsing or not endorsing a word as self-referent, is met. The relative starting point represents an initial bias towards one decision over the other. The drift rate (v) reflects how quickly and strongly a decision is made, and therefore how easy it is to endorse a word as self-referent or not. It is considered a proxy of self-schema activation (Dainer-Best et al., 2018; Disner et al., 2017). The threshold separation (a) is the distance between the two boundaries and indicates the amount of evidence that is required before making a decision. The response time constant (t0) is the time used for all non-decisional processes like response execution, and the response time difference (d) is the difference in how fast the response was executed. The relative starting point (zr) and drift rate (v) were computed for each valence (positive and negative) separately, because negative and positive bias represent related but distinct clinically-relevant outcomes (Everaert et al., 2022).
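The roles of these parameters can be made tangible with a small simulation of a single decision. This is a generic textbook-style sketch of a diffusion process, not the estimation procedure used in the study. The step size `dt`, the diffusion constant `noise_sd` of 1, and the time limit `max_t` are assumptions, and the response time difference (d) is left out for simplicity.

```python
import random


def simulate_trial(v, a, zr, t0, dt=0.001, noise_sd=1.0, max_t=10.0, rng=random):
    """Simulate one decision of a basic drift diffusion process.

    v: drift rate, a: threshold separation, zr: relative starting point
    (between 0 and 1), t0: non-decision time in seconds.
    Returns (choice, rt): choice is "upper" (e.g. endorse) or "lower"
    (e.g. not endorse), or (None, None) if no boundary is reached in max_t.
    """
    x = zr * a                       # absolute starting point between 0 and a
    t = 0.0
    step_sd = noise_sd * dt ** 0.5   # noise scales with the square root of dt
    while t < max_t:
        # Evidence changes by a systematic part (drift) and a random part.
        x += v * dt + rng.gauss(0.0, step_sd)
        t += dt
        if x >= a:
            return "upper", t0 + t
        if x <= 0.0:
            return "lower", t0 + t
    return None, None


# Usage: 200 simulated decisions with a strong drift towards the upper boundary.
rng = random.Random(1)
sim = [simulate_trial(v=3.0, a=1.0, zr=0.5, t0=0.3, rng=rng) for _ in range(200)]
share_upper = sum(1 for choice, _ in sim if choice == "upper") / len(sim)
```

In this sketch a larger absolute drift rate produces faster and more consistent decisions, a starting point closer to one boundary favours that response, and a wider threshold separation makes decisions slower but less affected by noise.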
All parameters were computed with fast-dm (Voss & Voss, 2007) using the words, responses (endorsed or not endorsed as self-referential), RTs (time between start of response window and entering response), and valences as input. Following Voss et al. (2004), RTs below 300 ms and above three times the interquartile range were discarded. Since the mean RTs of the first two presented words were significantly higher than all other words, most likely due to participants familiarising themselves with the task, they were also discarded, resulting in a total of 22 trials (11 of each valence). The parameters were transformed to z-scores to conduct analyses across cohorts.
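The RT exclusion rule can be sketched as follows. The lower bound of 300 ms comes from the text. The upper bound is ambiguous as written, so this sketch assumes one common reading: RTs more than three interquartile ranges above the third quartile are discarded. Computing the quartiles over all of a participant's RTs, the function name, and the example values are also assumptions.

```python
from statistics import quantiles


def filter_rts(rts_ms, lower_ms=300.0, iqr_factor=3.0):
    """Discard implausibly fast and extremely slow reaction times.

    rts_ms: reaction times in milliseconds for one participant.
    Assumed rule: keep an RT if lower_ms <= RT <= Q3 + iqr_factor * IQR.
    """
    q1, _median, q3 = quantiles(rts_ms, n=4)  # quartile cut points
    upper_ms = q3 + iqr_factor * (q3 - q1)
    return [rt for rt in rts_ms if lower_ms <= rt <= upper_ms]


# Illustrative (made-up) reaction times with one fast and one slow outlier.
example_rts = [250, 400, 500, 600, 700, 800, 900, 5000]
kept = filter_rts(example_rts)
```

Because fast-dm needed at least 10 trials per valence, a filter of this kind directly determines for how many participants the DDM parameters can be estimated.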
Fig. 2
A schematic representation of the drift diffusion model (DDM). The decision process starts at the relative starting point (zr), which represents the initial bias towards either decision option. Evidence is then accumulated at drift rate (v) until one of the decision boundaries is reached. The threshold separation (a) is the distance between the decision boundaries, the response time constant (t0) is the time spent on all non-decisional processes, and (d) is the difference in how fast the response was executed
Data Analyses
Data were analysed using IBM SPSS Statistics version 27 and RStudio 1.1.463. We conducted our analyses in several subsamples because some outcome measures were not available for the full sample. Supplementary Table S1 provides an overview of each subsample.
For the question ‘how well do the different self-referent memory bias task outcome measures differentiate depression status?’, the whole dataset (N = 956) was used. Differences in outcome measures between depression status were assessed with ANCOVAs using gender, age, education level, and cohort as covariates. Post-hoc Tukey tests were performed in case of a significant group comparison. The same analyses were performed on a set of more exploratory outcome measures that could only be calculated on the data from the MIND-Set and Info in Genes cohorts (n = 811).
For the next question, ‘how well do the DDM parameters differentiate depression status?’, differences in DDM parameters were assessed with ANCOVAs, also using gender, age, education level, and cohort as covariates. Post-hoc Tukey tests were performed again when the group comparisons were significant. A subsample of 629 participants was used for these analyses. The fast-dm programme required at least 10 trials of each valence. Because some trials were discarded and there were 11 usable trials of each valence to start with, the parameters could be computed for 630 participants. One participant was excluded because of poor model fit (p <.05).
The question ‘how are the different self-referent memory bias task outcome measures related to depression symptom severity?’ was assessed in the same subsample of 811 participants from the MIND-Set and Info in Genes cohorts as the exploratory analyses for the first question, because depression symptom severity was not assessed in the MATCH cohort. Linear regression models including gender, age, education level, and current depression status as covariates were used to assess the relationship between each self-referent memory bias task outcome measure and depression symptom severity.
For the fourth question, ‘how do the different DDM parameters relate to depressive symptom severity?’, we used a subsample of 545 participants. These were the participants from the subsample of 629 participants from the second question, excluding those from the MATCH cohort because no depression symptom severity data were available from that cohort. We assessed the relationships between the DDM parameters and depression symptom severity with linear regression models that included gender, age, education level, and current depression status.
Finally, in order to draw conclusions about which outcome measure or combination of measures is best able to distinguish between depression status and/or has the best predictive value for depression symptom severity, we ran a multivariate ANCOVA and a hierarchical linear regression analysis with the outcomes from the previous analyses that showed the largest explained variance. Age, gender, education level, and cohort were again included as covariates. The hierarchical linear regression model also included depression status.
Results
How Well Do Different Self-Referent Memory Bias Task Outcome Measures Differentiate Current Depression Status?
The results from the group- and pairwise comparisons are presented in the first two columns of Table 3 as well as visually in Fig. 3. The current depression, remitted depression, and healthy control group differed significantly on all outcome measures. The pairwise comparisons showed that all three outcome measures from the endorsement phase differed significantly between all groups. The number of negative endorsed words explained most variance (i.e., large effect size ƞ2 = 0.237).
Fig. 3
The mean, standard deviation, minimum and maximum values, and probability density of the raw data points of the different self-referent encoding task outcome measures, including comparisons of the never depressed healthy controls (ND), currently depressed (CD), and remitted depressed (RD) groups. *** = p
Table 3
The group comparisons column shows the results from the ANCOVAs testing for differences in self-referent memory bias task outcome measures (see Table 2 for their descriptions) between the never-depressed healthy controls (ND), current depressed (CD), and remitted depressed (RD) individuals. Gender, age, education level, and cohort (MIND-Set, MATCH, or Info in Genes) were included as covariates. The pairwise comparisons column shows the results of the post-hoc Tukey tests. These analyses were performed in the total sample (N = 956). The linear regression columns present the results from the linear regression analyses testing the associations with depression symptom severity, which included gender, age, education level, and current depression status as covariates in the models. These analyses were performed in a subsample of the MIND-Set and Info in Genes cohorts (n = 811)
Columns 2 to 4: group comparisons. Columns 5 to 7: pairwise comparisons (p values). Columns 8 to 11: linear regression.

| Outcome measure | F | p | ƞ2 | ND/CD | ND/RD | CD/RD | R2 | β | t | p |
|---|---|---|---|---|---|---|---|---|---|---|
| Positive endorsed | 78.427 | < 0.001 | 0.146 | < 0.001 | < 0.001 | < 0.001 | 0.383 | -0.470 | -15.638 | < 0.001 |
| Negative endorsed | 142.070 | < 0.001 | 0.237 | < 0.001 | < 0.001 | < 0.001 | 0.223 | 0.215 | 5.962 | < 0.001 |
| Negative endorsement score | 101.817 | < 0.001 | 0.185 | < 0.001 | < 0.001 | < 0.001 | 0.332 | 0.429 | 12.962 | < 0.001 |
| Positive recalled | 4.874 | 0.008 | 0.011 | < 0.001 | < 0.001 | 0.466 | 0.209 | -0.162 | -4.729 | < 0.001 |
| Negative recalled | 5.343 | 0.005 | 0.012 | 0.073 | 0.350 | 0.428 | 0.188 | -0.43 | -1.262 | 0.207 |
| Negative recall score | 12.090 | < 0.001 | 0.026 | < 0.001 | < 0.001 | 0.287 | 0.183 | 0.80 | 2.407 | 0.016 |
| Self-referent negative memory bias 1 | 60.255 | < 0.001 | 0.125 | < 0.001 | < 0.001 | 0.544 | 0.254 | 0.273 | 7.755 | < 0.001 |
| Self-referent negative memory bias 2 | 20.859 | < 0.001 | 0.044 | 0.524 | < 0.001 | < 0.001 | 0.205 | 0.144 | 4.320 | < 0.001 |
| Self-referent negative memory bias 3 | 23.161 | < 0.001 | 0.049 | < 0.001 | < 0.001 | 0.005 | 0.193 | 0.137 | 3.875 | < 0.001 |
Not all pairwise group comparisons were significant for the three recall phase outcome measures. The number of positive recalled words and the negative recall score were able to distinguish current and remitted depressed individuals from healthy controls. This suggests that decreased recall of positive information, rather than preferential recall of negative information, is related to depression (vulnerability). The three groups did not differ significantly in the total number of recalled words, F(2,917) = 0.010, p =.990, ƞ2 < 0.001, meaning that differences in recall or self-referent memory bias were not due to differences in general memory performance.
The three variations of negative self-referent memory bias (Table 2) each appear to have their own strength. The most commonly used first variation (i.e., the number of endorsed and recalled negative words divided by the total number of endorsed and recalled words) explained most variance, as shown by the medium-to-large effect size (ƞ2 = 0.125). It only distinguished current and remitted depressed individuals from healthy controls, indicating that it could represent a depression trait measure. The second and third variations (i.e., the number of endorsed and recalled negative words divided by the total number of endorsed or recalled words, respectively) had small effect sizes. Variation 2 differentiated remitted depression from both current depression and healthy controls, whereas variation 3 differentiated between all three groups. Since these three variations perform differently in differentiating depression status, the choice depends on the intended use and the specificity it requires. However, two potential pitfalls associated with these measures should be mentioned.
First, as shown by for example Fig. 3F, the majority of the never-depressed healthy controls and more than a third of the currently and remitted depressed individuals either had a negative self-referent memory bias index of zero or the score could not be calculated at all. This was due to a low number of negative endorsed and subsequently recalled words. This zero-inflatedness (i.e., an overly large proportion of zeroes) in especially the never-depressed healthy controls means that there is (too) low variability in healthy samples. Still, as shown, this index remains useful to distinguish individuals without depression or other psychiatric diagnoses from remitted and currently depressed individuals.
Second, interestingly, none of the groups showed an absolute negative self-referent memory bias on average. Given the relative nature of the index (i.e., a negative self-referent memory bias index of 0.33 automatically means a positive self-referent memory bias index of 0.67), only scores higher than 0.50 indicate a relative negative bias. However, none of the mean self-referent memory bias scores exceeded this value. This indicates that a lack of memory processing of positive self-referent information could be a characteristic of depression (vulnerability) instead of an increased memory processing of negative self-referent information, which is in line with an extensive and recent meta-analysis by Everaert and colleagues (2022).
How Well Do the DDM Parameters Differentiate Current Depression Status?
Every DDM parameter except the difference in response execution (d) and the response time constant (t0) differed significantly between the three groups, Table 4. This indicates that depression state is related to the self-referent decision-making process itself and not the non-decision-making components of the task such as how long it took to press the buttons to answer. All effect sizes were small-to-medium and the positive drift rate (vpositive) explained most variance. Post-hoc pairwise comparisons showed that the parameters were able to distinguish currently and remitted depressed individuals from never-depressed healthy controls.
Table 4
The group comparisons column shows the results from the ANCOVAs testing for differences in drift diffusion model parameters between the groups with gender, age, education level, and cohort (MIND-Set, MATCH, or Info in Genes) as covariates. The pairwise comparisons column shows the results of the post-hoc Tukey tests for significant group comparisons. These analyses were performed in a subsample of 629 participants. The linear regression columns present the results from the linear regression analyses testing the associations with depression symptom severity, which included gender, age, education level, and current depression status in the models. These analyses were performed in a subsample of 545 participants
Group comparisons
Pairwise comparisons
Linear regression
F
p
ƞ2
NC/CD
ND/RD
CD/RD
R2
β
t
p
vpositive
16.383
< 0.001
0.052
< 0.001
< 0.001
0.561
0.296
− 0.188
-4.944
< 0.001
vnegative
10.406
< 0.001
0.034
< 0.001
< 0.001
0.200
0.278
0.129
3.377
0.001
zrpositive
8.547
< 0.001
0.028
0.001
< 0.001
0.926
0.271
− 0.097
-2.526
0.012
zrnegative
6.475
0.002
0.021
0.001
0.019
0.259
0.267
0.070
1.835
0.067
a
4.886
0.008
0.016
0.017
0.034
0.695
0.267
0.067
1.755
0.080
d
0.037
0.963
< 0.001
-
-
-
0.263
0.032
0.850
0.396
t0
1.636
0.196
0.005
-
-
-
0.262
0.007
0.190
0.849
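To make the roles of the DDM parameters in Table 4 more concrete, the following is a minimal simulation sketch of a single two-boundary diffusion trial. It is not the fast-dm estimation procedure used in this study; the function name, the Euler step size, the noise scale of 1, and all parameter values are assumptions chosen for illustration, and the sketch omits the response execution difference (d) and any between-trial variability.

```python
import random


def simulate_ddm_trial(v, a, zr, t0, dt=0.001, s=1.0, rng=None, max_time=10.0):
    """Simulate one diffusion trial (illustrative sketch, not fast-dm).

    v  : drift rate (positive values push towards the upper, 'endorse' boundary)
    a  : boundary separation
    zr : relative starting point between 0 and 1
    t0 : non-decision time in seconds
    Returns (endorsed, response_time).
    """
    rng = rng or random.Random()
    x = zr * a          # absolute starting position between the boundaries
    t = 0.0
    sqrt_dt = dt ** 0.5
    while 0.0 < x < a and t < max_time:
        # Euler step: deterministic drift plus Gaussian noise scaled by sqrt(dt).
        x += v * dt + s * sqrt_dt * rng.gauss(0.0, 1.0)
        t += dt
    return x >= a, t0 + t


def endorsement_rate(v, n_trials=1000, seed=42):
    """Proportion of simulated trials ending at the upper boundary."""
    rng = random.Random(seed)
    hits = sum(simulate_ddm_trial(v, a=1.0, zr=0.5, t0=0.3, rng=rng)[0]
               for _ in range(n_trials))
    return hits / n_trials


# Hypothetical drift rates: a higher drift rate yields more endorsements.
p_high = endorsement_rate(v=1.0)
p_low = endorsement_rate(v=-1.0)
print(p_high, p_low)
```

In this sketch the drift rate shifts both the probability of endorsing a word and the response time distribution, which is one way to see why drift rate and the number of endorsed words are related without being interchangeable.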
Partial correlation analyses controlling for gender, age, education level, and cohort showed moderate, significant correlations between the positive drift rate and the number of positive endorsed words, r = .308, p < .001, as well as between the negative drift rate (vnegative) and the number of negative endorsed words, r = .285, p < .001. The DDM uses the number of positive and negative endorsed words as direct input to compute the drift rates. However, the moderate size of these correlations indicates that drift rate captures a process distinct from endorsement. Rather than relatively explicitly measuring how positively and negatively someone thinks about themselves, i.e., by actively endorsing positive and negative self-descriptive words, drift rate may implicitly capture the underlying positive and negative self-schemas (Dainer-Best et al., 2018; Disner et al., 2017). There were no significant partial correlations between the number of positive recalled words and the positive drift rate, r = .012, p = .779, or between the number of negative recalled words and the negative drift rate, r = .073, p = .096.
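A partial correlation of this kind can be sketched by residualising both variables on the covariates and correlating the residuals. The example below is a minimal illustration with NumPy on synthetic data; the function name and the simulated variables are hypothetical, and categorical covariates such as cohort would need to be dummy-coded before being passed in.

```python
import numpy as np


def partial_correlation(x, y, covariates):
    """Correlate x and y after removing the linear effect of the covariates."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.asarray(covariates, dtype=float)
    if z.ndim == 1:
        z = z[:, None]
    # Design matrix with an intercept column plus the covariates.
    design = np.column_stack([np.ones(len(x)), z])
    # Residuals of x and y after ordinary least squares on the covariates.
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


# Synthetic data: x and y are related only through a shared covariate.
rng = np.random.default_rng(0)
covariate = rng.normal(size=2000)
x = covariate + rng.normal(size=2000)
y = covariate + rng.normal(size=2000)

raw_r = float(np.corrcoef(x, y)[0, 1])
partial_r = partial_correlation(x, y, covariate)
print(round(raw_r, 3), round(partial_r, 3))
```

In the synthetic example the raw correlation is substantial while the partial correlation is close to zero, which illustrates what controlling for the covariates removes.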
How are the Different Self-Referent Memory Bias Task Outcome Measures Related to Depression Symptom Severity?
As is evident from Table 3, all outcome measures from the endorsement phase and the negative self-referent memory bias indices were significantly associated with depression symptom severity. Because depression status was included as a covariate in these regression models, the associations existed independent of current depression status. Interestingly, whereas the number of negative endorsed words most strongly differentiated the three diagnostic groups (see above), most variance in depression symptom severity was explained by the number of positive endorsed words. From the recall phase, both the number of positive recalled words and the negative recall score were significantly associated with depression symptom severity, while the number of negative recalled words was not.
How Do the Different DDM Parameters Relate to Depression Symptom Severity?
The positive and negative drift rates and the positive relative starting point (zrpositive) were significantly associated with depression symptom severity, independent of depression status. The positive drift rate explained most variance (R2 = 0.296), which is in line with the results in Sect. 3.2. It also matches the finding of Dainer-Best and colleagues (2018), who showed in three large, healthy samples that, of all DDM parameters, drift rate was most robustly associated with depression symptoms.
The Optimal (Combination of) Outcome Measure(s) to Differentiate Depression Status and Possibly Predict Depression Symptom Severity
As a final step, and to facilitate outcome measure selection within the clinical setting, we examined which outcome measure(s) is/are best suited to distinguish between depression status. The measures with the highest explained variance in their own category (i.e., the endorsement phase, the bias indices, and the DDM (see 3.1 and 3.2)) were selected. This resulted in the ‘classic’ outcome measures, i.e., the number of negative endorsed words and the negative self-referent memory bias index (variation 1), and the positive drift rate, for which we tested whether it had any additional value. MANCOVAs with combinations of these three outcome measures were performed with gender, age, education level, and cohort (MIND-Set, MATCH, and Info in Genes) as covariates. The N = 629 subsample from the DDM analyses was used.
In Table 5, the models are ranked by their explained variance. The model only including the number of negative endorsed words explained the highest variance in differences between depression status, with a large effect size, and the bias index or DDM parameter did not have additional value. The number of negative endorsed words therefore seems the most promising outcome measure to investigate further in a clinical setting.
Table 5
Results from the MANCOVAs testing for differences in depression status (no depression, i.e., healthy controls, current depression, or remitted depression) including gender, age, education level, and cohort (MIND-Set, MATCH, and Info in genes) as covariates. These analyses were performed in a subsample of 629
To test which (combination of) outcome measure(s) was most strongly associated with depression symptom severity, we fitted a hierarchical linear regression model using the measures that showed the largest explained variance in the analyses in 3.3 and 3.4 from the same three categories (endorsement phase, bias index, and DDM). These were the number of positive endorsed words, the negative self-referent memory bias index (variation 1), and the positive drift rate. Gender, age, education level, and depression status were included as covariates. The N = 545 subsample from the DDM analyses was used, which excluded the MATCH cohort because it lacked a depression symptom severity measure.
The model including all three measures explained most variance, F(7,466) = 36.873, p < .001, R2 = 0.356. The change statistics showed that every additional measure contributed significantly to the model: Fchange(1,467) = 14.344, p < .001, R2change = 0.020 for the bias index and Fchange(1,466) = 10.192, p = .002, R2change = 0.014 for the positive drift rate. Adding the number of positive recalled words, to ensure that all outcome measure categories from the task were included, increased the explained variance further, F(8,465) = 33.203, p < .001, R2 = 0.364, which was a significant change, Fchange(1,465) = 5.192, p = .023, R2change = 0.007. Even though some of these measures use similar input (e.g., the number of positive endorsed words and the positive drift rate, or the number of positive recalled words and the bias index), there was no multicollinearity (VIF < 1.9 for all). When using the self-referent encoding task to index depression symptom severity, combining the information from all outcome measure categories seems most informative and would be worth investigating further in clinical practice.
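The relation between the reported R2 values and the F statistics can be checked with the standard formulas for the overall F test and the F change test in hierarchical regression. The sketch below uses only the rounded values reported above; the function names are hypothetical, and small deviations from the reported statistics are expected because R2 and R2change are rounded to three decimals.

```python
def f_from_r2(r2, df_model, df_resid):
    """Overall F statistic of a regression model from its R2."""
    return (r2 / df_model) / ((1.0 - r2) / df_resid)


def f_change(r2_full, r2_change, df_added, df_resid_full):
    """F statistic for the increase in R2 when predictors are added."""
    return (r2_change / df_added) / ((1.0 - r2_full) / df_resid_full)


# Three-measure model: R2 = 0.356 with 7 predictors and 466 residual df.
f_model = f_from_r2(0.356, df_model=7, df_resid=466)
# Adding the positive drift rate: R2change = 0.014 with 1 added predictor.
f_drift = f_change(0.356, r2_change=0.014, df_added=1, df_resid_full=466)
print(round(f_model, 2), round(f_drift, 2))
```

The recomputed values are roughly 36.8 and 10.1, close to the reported F(7,466) = 36.873 and Fchange(1,466) = 10.192.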
Discussion
We set out to investigate (1) how well different commonly used self-referent memory bias outcome measures differentiate current depression status (i.e., currently depressed and remitted depressed individuals, and never-depressed healthy controls) and (2) how strongly these outcome measures are associated with depression symptom severity. We further tested which (combination of) outcome measure(s) best distinguished between depression status and had the largest statistically predictive value for depression symptom severity. Our findings, further discussed below, give direction to the clinical implementation of self-referent memory bias. We also discuss its possible roles in assessing, monitoring, and predicting depressive state and trait, but these require further investigation in clinical, longitudinal studies.
We found that the number of negative endorsed words was best able to differentiate all three depression groups while the number of positive endorsed words showed the strongest (negative) association with depressive symptom severity. When combining the best outcome measures from three categories (i.e., the endorsement phase only, self-referent memory bias index, and the DDM), the number of negative endorsed words remained the single best measure to distinguish between depression status. A combination of the number of positive endorsed words, the negative self-referent memory bias index, and the positive drift rate showed the strongest association with depressive symptom severity (i.e., explained most variance). Selecting the optimal (combination of) outcome measure(s) thus depends on the aim and sample, i.e., measuring state and/or trait depression.
When the aim is to differentiate currently depressed individuals, individuals with remitted depression, and never-depressed healthy controls, self-referent processing was most useful, specifically the number of negative endorsed words. This is in line with a wealth of research showing that depressed individuals are more likely than non-depressed individuals to endorse negative words as self-referent (e.g., Dobson & Shaw, 1987; Gotlib et al., 2004; Everaert et al., 2022). The variations of the self-referent memory bias index differed in their ability to differentiate between depression status and had varying effect sizes. As we described earlier, the choice for one over the other depends on the sample and intended goal. A relevant issue for all indices was the low number of negative endorsed and recalled words. Although we based the selection of 12 words of each valence on previous work, more recent studies, especially those also applying the DDM, use a minimum of 24 words of each valence (Hsu et al., 2020; Beevers et al., 2023; Castagna et al., 2023; Terpstra et al., 2023). We recommend carefully considering the number of words when setting up a self-referent memory bias task; a shorter task might be more practical, while a longer task is more suitable when applying the DDM. While the positive and negative drift rates and relative starting points were able to differentiate (remitted) depressed individuals from never-depressed healthy controls, they did not have additional value on top of the number of negative endorsed words. This simple task holds promise as an add-on diagnostic tool in clinical settings to assess current depression and depression trait in a more objective way than self-report questionnaires. This more objective measure is also useful in situations where individuals have difficulty adequately voicing or recognising their symptoms in diagnostic interviews or on self-report questionnaires.
When the aim is to monitor or predict depressive symptom severity, for example during pharmacological or cognitive interventions, a broader set of outcome measures, i.e., the number of positive endorsed words, the negative self-referent memory bias index, and the positive drift rate, showed most promise to be investigated further in clinical settings with a longitudinal study set-up. Interestingly, while depression (vulnerability) is often associated with more negativity across different domains (e.g., attention, interpretation, memory), we here found that less positivity was most strongly related to depressive symptom severity. This is in line with the broader literature on depression and positive and negative affect. While depression and anxiety are both characterised by high negative affect, the lack of positive affect is uniquely related to depression. This is seen in symptoms like anhedonia (the reduced motivation or ability to experience pleasure) and reflected by less positive expectations about the future (Beck et al., 2006; Miranda et al., 2008). Depressed individuals are also less capable of using positive memories to improve negative mood (Joormann & Siemer, 2004; Joormann et al., 2007; Silton et al., 2020). In addition, positivity has a protective effect; being able to use positive emotional words to describe a sad memory predicted improved depressive symptom severity six months later and was also related to a shorter depression recovery time (Brockmeyer et al., 2015). Positivity also works as a buffer to protect individuals from the impact of stress and other negative experiences (Riskind et al., 2013; Speer & Delgado, 2017; Egan et al., 2024), which is only the case for depression and not for anxiety. Self-referent positive information processing could therefore be a valuable clinical marker to assess and monitor changes in depression symptom severity.
In addition, self-referent positive information processing makes a promising intervention target. Specifically, there is growing evidence that therapies augmenting the ability to focus on positive appraisal and positive memories, such as cognitive bias modification (CBM), can indeed function as an add-on treatment or prevention tool (Becker et al., 2015; Bovy et al., 2022; Dalgleish & Werner-Seidler, 2014; Vrijsen et al., 2018), although there are also concerns about CBM’s effectiveness (Cristea et al., 2015; Fodor et al., 2020) and further research is necessary.
There are some considerations to be made related to the feasibility of using self-referent memory bias in clinical practice. Our findings match and expand upon previous work (Dainer-Best et al., 2018; Hitchcock et al., 2023). The positive and negative drift rates (vpositive and vnegative) and the positive relative starting point (zrpositive) were related to depressive symptom severity and able to differentiate current depression status. The positive drift rate was the only DDM parameter included in the optimal combination of outcome measures for the association with depressive symptom severity. It is important to recognise that while the ‘classic’ outcome measures from the self-referent encoding task are easy to calculate (Table 2), extracting the DDM parameters requires a separate computer programme with specifically formatted input, making it less user-friendly and harder to implement. Easy-to-use clinical add-on tools for electronic health monitoring and prediction, utilising computational modelling and artificial intelligence, are among the goals for the near future to enhance personalised mental health care. In the meantime, one should evaluate whether the time and effort needed to calculate the DDM parameters are worth the relatively low additional value in statistically predicting symptom severity.
Relatedly, although theoretically and conceptually the endorsement measures and the DDM parameters are expected to tap into underlying dysfunctional memory schemas, these operationalisations do not directly represent a memory test, unlike the self-referent memory bias indices, which also include recall. Both from a theoretical stance (e.g., Beck’s generic cognitive model; Beck & Bredemeier, 2016) and based on empirical data showing that self-referent memory bias is predictive of depression status and symptom severity (Johnson et al., 2007; LeMoult et al., 2017), the combination of self-referent endorsement and recall in the parameter holds promise for clinical application. Our findings, as well as other recent findings (e.g., Fleurkens et al., 2025), further support this.
This study has certain strengths and limitations. A strength is the large, accumulated dataset of a sample of remitted depressed individuals and two naturalistic psychiatric patient samples, reflecting the diverse clinical reality in which psychiatric multimorbidity is common (Kessler et al., 2006; Plana-Ripoll et al., 2020; Ten Have et al., 2023). Diagnoses were determined using validated diagnostic interviews by trained clinicians and information about comorbid anxiety, substance use disorder, ADHD, and ASD was available, although details are missing. Other strengths are the use of a large number and variety of self-referent memory bias task outcome measures and the application of the DDM, whose use to represent self-referent processing has recently risen in popularity because it provides a more objective, mechanistic measure of self-referent decision making during the endorsement of positive and negative words. A limitation is that the binary DDM was applied to multialternative decision-making processes in the MIND-Set and Info in Genes versions of the self-referent memory bias task by categorising the responses into endorsed versus not endorsed as self-referent. However, the DDM model fit as applied with fast-dm was good (all p > .05 except for one participant, who was excluded) and the results are in line with previous findings, indicating that the DDM could be successfully applied to different versions of the self-referent memory bias task. Mathematically, multialternative DDMs have been developed (Roxin, 2019), but accessible tools will first have to be created before such new models can be practically applied. In the meantime, it seems that the DDM provides additional useful clinical information on the mechanisms underlying emotional processing (Nagrodzki et al., 2023).
Another important limitation is that due to the cross-sectional study design of the available datasets, we were only able to investigate how the different outcome measures differentiated depression status and how they were associated with depressive symptom severity. We were not able to, for example, look at which outcome measure had the best predictive value for depression relapse or symptom improvement. This requires further research in similar large, naturalistic psychiatric samples.
Acknowledgements
N/A
Declarations
Ethics Approval and Consent for Participation and Publication
The authors have no relevant financial or non-financial interests or other conflicts of interest to disclose.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.