Introduction
Worry is a form of repetitive negative thinking (McEvoy et al.,
2019a) that is conceptualised as a mental process involving attempts to plan and prepare a favourable solution in the face of uncertain and potentially negative outcomes (Borkovec,
1994; Fresco et al.,
2002). Though worry is experienced to some extent by everyone, excessive, multi-focal, and difficult-to-control worry is a cardinal feature of generalised anxiety disorder (GAD). In GAD, worry is typified by statements that imply catastrophising or inflexible rule-bound interpretations (Dugas et al.,
1998; Molina et al.,
1998). Worry content centres around life domains such as family, interpersonal relationships, finances, personal health, health of loved ones, work, education, everyday tasks, the world and society, and is relatively stable over time (Constans et al.,
2002). In particular, worry themes tend to be related to areas of high personal value (Boehnke et al.,
1998).
The Penn State Worry Questionnaire (PSWQ) is a 16-item self-report questionnaire that is one of the most widely used measures of excessive worry (McEvoy et al.,
2019b). The PSWQ has demonstrated good construct validity, internal consistency, and test-retest reliability (Brown et al.,
1992; Dear et al.,
2011; Hazlett-Stevens et al.,
2004; Meyer et al.,
1990). Despite these favourable psychometric properties, there have been inconsistent findings in relation to its factor structure. The PSWQ was originally found to have a unidimensional factor structure in both undergraduates (Meyer et al., 1990) and in a heterogeneous anxiety disorder sample (Brown et al.,
1992). In contrast, Fresco et al. (
2002) found that a two-factor solution provided superior fit when compared to a unidimensional solution in undergraduate students, with these two factors comprising (1) worry engagement and (2) absence of worry. This same study found that higher order and lower order factors, general worry and worry engagement respectively, explained the majority of variance in symptom measures (Fresco et al., 2002). Thus, the second factor, absence of worry, appeared to represent methodological variance, as the items loading onto this factor were negatively worded (e.g., ‘I do not tend to worry about things’) and thus reverse scored, rather than representing a conceptually orthogonal construct. Further investigation in a mixed sample of people with anxiety and mood disorders found that a unidimensional model accounting for method effects (i.e., covariance among errors of reverse-scored items) provided significantly better fit compared to the two-factor solution (Brown,
2003). Therefore, it appears that the negatively phrased items may have an aberrant function producing an artificial factor, creating a limitation of the 16-item PSWQ. These findings also parallel literature that discourages the intermixing of negatively worded items in scales as they create psychometric artefacts, rather than reducing respondent bias (Chyung et al.,
2018; Roszkowski & Soven,
2010), with evidence indicating that reverse-scored items hinder psychometric performance in other anxiety disorder questionnaires (Rodebaugh et al.,
2007).
In light of the inconsistent factor structure of the full PSWQ, the 8-item PSWQ-Abbreviated (PSWQ-A) was constructed by Hopko et al. (
2003). The PSWQ-A was developed from only the positively worded items of the full PSWQ (e.g., ‘I worry all the time’) to address factorial redundancy as well as difficulties previously reported by older adults in answering negatively worded questions due to increased cognitive load. Hopko et al. (
2003) demonstrated that the PSWQ-A has strong fit indices, good convergent validity, high internal consistency, and adequate test–retest reliability in a sample of older adults with a principal or co-principal diagnosis of GAD. Wuthrich et al. (
2014) went on to reproduce this methodology and compare the PSWQ to the PSWQ-A with a clinical sample of older adults with depression and an anxiety disorder. Wuthrich et al. (
2014) again found poor fit across absolute and incremental fit indices for both the one-factor and two-factor models for the 16-item PSWQ, whereas the unidimensional PSWQ-A was found to have good fit across all indices. Further, the PSWQ-A was found to have good construct validity and internal consistency, with adequate test-retest reliability. In addition to the PSWQ-A, a 3-item version of the PSWQ has also been developed by Berle and colleagues (
2011). The PSWQ-3 was created to incorporate the essential features of pathological worry (i.e., high frequency, high uncontrollability, multiple worry domains) as well as to exclude any reverse-scored items (Berle et al.,
2011). Psychometric properties of the PSWQ-3 were explored with a mixed sample of adults with a principal anxiety or related disorder, and the results indicated similar psychometric properties for the 16-item PSWQ and PSWQ-3. Berle et al. (
2011) found comparable construct validity, as well as sensitivity to treatment for individuals with GAD with equivalent large effect sizes following individual cognitive-behaviour therapy (CBT). Participants with GAD also scored higher than participants with another principal anxiety disorder on both the PSWQ and PSWQ-3.
Kertz et al. (
2014) aimed to compare all three versions of the PSWQ in a heterogeneous clinical sample (i.e., mood, anxiety, trauma-related, and psychotic disorders) of adults presenting for treatment in a partial hospital setting. The underlying factor structure of the 16-item PSWQ was best represented by a two-factor model, although the PSWQ-A exhibited the best fit of all versions across all indices (i.e., SB χ², RMSEA, CFI, GFI and SRMR). In addition, all three versions demonstrated similar psychometric properties, with good construct validity for the briefer versions and excellent internal consistency for the full PSWQ. All three versions showed comparable sensitivity to treatment, demonstrating medium effects following an assorted program of group and individual therapy informed by cognitive behaviour therapy. In adults with GAD, only Dear et al. (
2011) have compared four model iterations of the 16-item PSWQ (i.e., unidimensional, two-factor, one-factor with method effects, and a three-factor model) and investigated its psychometric properties. Interestingly, Dear et al. (
2011) found that a three-factor solution, with all items loading onto one general factor as well as two separate method factors (i.e., absence of worry and worry engagement), provided the best fit to the data. The path diagram of the three-factor model from Dear et al. (
2011, p. 20) demonstrated a similar structure to a bifactor model. Bifactor modelling has an advantage over the three-factor solution, as bifactor indices can determine the extent to which a unidimensional (or multidimensional) interpretation is supported by the data. In addition, research has yet to compare the 16-item PSWQ to the two shortened versions, PSWQ-A and PSWQ-3, in a clinical sample of adults with a principal diagnosis of GAD.
The current study aimed to extend previous research by comparing the psychometric properties of the 16-item PSWQ, 8-item PSWQ-A, and 3-item PSWQ-3 in a sample of adults with a principal diagnosis of GAD. It was hypothesised that the shorter PSWQ-A unidimensional model would provide the best fit for the data, when compared to the 16-item PSWQ and 3-item PSWQ-3 (Kertz et al.,
2014). In addition, we hypothesised that both the PSWQ-A and PSWQ-3 would demonstrate comparable psychometric properties to the longer 16-item PSWQ. Specifically, we predicted that all three versions would demonstrate no floor or ceiling effects. All three versions were predicted to demonstrate good construct validity through significant moderate positive correlations with a GAD symptom measure (i.e., physiological tension) and with processes hypothesised to maintain pathological worry in cognitive-behavioural models of GAD (i.e., intolerance of uncertainty and negative metacognitive beliefs about worry). Significant moderate positive correlations were predicted for these variables in line with previous research in mixed clinical and undergraduate samples (Kertz et al.,
2014; Wells & Cartwright-Hatton,
2004). We also predicted significant low correlations with distinct, yet overlapping psychopathology constructs, including depression and autonomic anxiety, as well as a range of metacognitive beliefs (i.e., positive beliefs about worry, lack of cognitive confidence, cognitive self-consciousness, and need for control of thoughts) in line with previous research (Kertz et al.,
2014; Wells & Cartwright-Hatton,
2004). Further, all three versions were predicted to demonstrate good internal consistency as well as reproducibility through adequate test-retest reliability over a 12-week period. Finally, we also predicted that all three versions would significantly discriminate participants with GAD from non-clinical participants, with high sensitivity and specificity.
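To make two of these predictions concrete, the following is a minimal sketch of how internal consistency and floor or ceiling effects could be computed. It assumes that internal consistency is indexed by Cronbach's alpha and that a floor or ceiling effect is flagged when more than 15% of respondents obtain the lowest or highest possible score, which is the threshold commonly attributed to the Terwee et al. (2007) quality criteria. The function names and the data layout (one list of item responses per respondent) are illustrative assumptions and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Hypothetical data layout: rows[r][i] is respondent r's answer to item i.
    """
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the total (summed) score.
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def floor_ceiling(totals, min_score, max_score):
    """Proportion of respondents at the lowest and highest possible total score."""
    n = len(totals)
    at_floor = sum(t == min_score for t in totals) / n
    at_ceiling = sum(t == max_score for t in totals) / n
    return at_floor, at_ceiling


def has_floor_or_ceiling(totals, min_score, max_score, threshold=0.15):
    """Assumed decision rule: flag an effect if either proportion exceeds 15%."""
    at_floor, at_ceiling = floor_ceiling(totals, min_score, max_score)
    return at_floor > threshold, at_ceiling > threshold
```

For a 3-item form scored 1 to 5 per item, `min_score` would be 3 and `max_score` would be 15, so a short form has far fewer distinct totals than the 16-item version and is more exposed to clustering at either extreme.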
Discussion
The current study compared the psychometric properties of the 16-item PSWQ, 8-item PSWQ-A and 3-item PSWQ-3 in a clinical sample of adults with GAD. Findings demonstrated that all three versions of the PSWQ possess good psychometric properties across a range of indicators. Despite their brevity, the 8-item PSWQ-A and PSWQ-3 uphold their psychometric properties in comparison to the longer form and can be endorsed for use in both clinical and research settings for adults with GAD.
Initially, six different confirmatory factor models were fit to the data: four models incorporating the 16 items of the PSWQ (i.e., unidimensional, bifactor, two-factor, and one-factor with method effects), a unidimensional model for the 8-item PSWQ-A, and a unidimensional model for the PSWQ-3. The bifactor model appeared to fit the 16-item PSWQ better than the other PSWQ model iterations, though there was some room for improvement in absolute (i.e., χ² test) and incremental (i.e., TLI) fit indices. This finding is consistent with previous research in a clinical GAD sample that compared different models of the 16-item PSWQ and showed that the three-factor solution (i.e., one general factor and two method factors) provided the best fit to the data, mirroring the current findings indicating that the longer form can be further improved (Dear et al.,
2011). Regarding the bifactor indices from the current study, the general PSWQ factor represented the dominant source of variance in the total PSWQ score (ωH), with the H value indicating that the general factor was a well-defined latent variable. In addition, the ECV was greater than 0.70, suggesting that the general worry factor is sufficiently unidimensional to be treated as a latent variable; thus, using a total score for the PSWQ is recommended. Together, these results provide support for a strong general PSWQ factor, and for unidimensionality (over multidimensionality) of the PSWQ. The unidimensional model for the 8-item PSWQ-A also demonstrated the best fit across all absolute and incremental indices. This is perhaps unsurprising, as the PSWQ-A was developed by removing the negatively worded items. Comparable analyses could not be conducted on the PSWQ-3 because the model was saturated (a one-factor model with three indicators is just-identified, leaving zero degrees of freedom with which to test fit).
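The bifactor indices referred to above can be illustrated with a short sketch. It assumes the standard formulas for omega hierarchical (ωH), explained common variance (ECV), and construct replicability (H) computed from standardised loadings of an orthogonal bifactor model. The function name, the input layout, and the loadings used in the usage note are hypothetical and are not the estimates from the current study.

```python
def bifactor_indices(general, specific):
    """Return (omega_h, ecv, h) from standardised bifactor loadings.

    general:  general-factor loading for each item.
    specific: one list per group/method factor; specific[s][i] is item i's
              loading on factor s, or 0.0 if the item does not load on it.
    Assumes orthogonal factors and standardised items (unit variance).
    """
    n_items = len(general)
    # Communality of each item: variance explained by all factors together.
    communalities = [
        general[i] ** 2 + sum(factor[i] ** 2 for factor in specific)
        for i in range(n_items)
    ]
    unique_variance = sum(1.0 - c for c in communalities)

    # Total-score variance attributable to the general and specific factors.
    general_variance = sum(general) ** 2
    specific_variance = sum(sum(factor) ** 2 for factor in specific)

    # Omega hierarchical: share of total-score variance due to the general factor.
    omega_h = general_variance / (general_variance + specific_variance + unique_variance)
    # ECV: share of common variance due to the general factor.
    ecv = sum(g ** 2 for g in general) / sum(communalities)
    # H: construct replicability of the general factor.
    h = 1.0 / (1.0 + 1.0 / sum(g ** 2 / (1.0 - g ** 2) for g in general))
    return omega_h, ecv, h
```

As a usage example with made-up values, four items that each load 0.6 on the general factor and 0.3 on one of two method factors (`bifactor_indices([0.6] * 4, [[0.3, 0.3, 0.0, 0.0], [0.0, 0.0, 0.3, 0.3]])`) give an ECV of 0.80, which would exceed the 0.70 benchmark mentioned above.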
The psychometric properties of the 16-item PSWQ, 8-item PSWQ-A, and 3-item PSWQ-3 were largely comparable in their results and supported most hypotheses. Of note, all three versions showed moderate positive correlations with physiological stress/tension. This demonstrated construct validity, as physiological tension and vigilance symptoms are part of the diagnostic criteria for GAD. All three measures also demonstrated moderate positive relationships with (1) negative metacognitive beliefs and (2) intolerance of uncertainty. These processes are of particular importance as they relate to two dominant models of GAD (Freeston,
2023). The metacognitive model proposes that excessive, pathological worry is primarily the result of negative metacognitive beliefs about worrying (i.e., that worry is uncontrollable and dangerous), in combination with positive beliefs about worry and subsequent ineffective mental control strategies (Wells,
2010). In contrast, the intolerance of uncertainty model of GAD suggests that uncertainty is a natural trigger for worry, and therefore individuals who hold negative beliefs about uncertainty (i.e., intolerance of uncertainty) are more likely to experience excessive and difficult-to-control worry (Dugas et al.,
1998; Hebert & Dugas,
2019). The hypotheses also predicted, in keeping with previous research (Kertz et al.,
2014; Wells & Cartwright-Hatton,
2004), that less salient components of the metacognitive model, as well as distinct yet overlapping psychopathology constructs (i.e., depression and autonomic anxiety), would show positive relationships with excessive worry. Together, these findings are supported by the broader literature (Freeston,
2023), and highlight the importance for clinicians to target negative metacognitive beliefs about worry, as well as intolerance of uncertainty in psychological treatment, given that these processes were most strongly related to worry in the present study.
In terms of equivalence, all three versions demonstrated moderate test-retest reliability, excellent criterion validity, as well as the ability to distinguish adults with GAD from those with no mental health conditions in the non-clinical group. Criterion validity was assessed using a split-half sample analysis and demonstrated a strong correlation between the PSWQ and each of the two briefer versions (PSWQ-A and PSWQ-3). Though this result is not surprising, it strongly indicates that all three versions are measuring the same construct of excessive worry. One discrepancy between the three versions related to internal consistency, with the PSWQ-3 demonstrating adequate internal consistency, which was slightly lower than that for the PSWQ-A and PSWQ. This may be partly accounted for by the PSWQ-3’s substantial reduction in items, while still falling within an acceptable range (i.e.,
r > 0.70) to meet quality criteria for measurement properties set out by Terwee et al. (
2007). All three versions showed a statistically significant difference between the clinical GAD group and the non-clinical group. The PSWQ and PSWQ-A demonstrated no floor or ceiling effects; however, the PSWQ-3 demonstrated a propensity for floor effects, suggesting a lack of specificity. Though floor effects were found for the PSWQ-3, ROC curve analysis demonstrated that all three versions showed an AUC close to 1.0, indicating high accuracy for each of the respective cut-off scores that maximise both sensitivity and specificity to distinguish adults with GAD from non-clinical adults. It is noteworthy that despite the substantial reduction in items, the psychometric properties of the briefer versions are largely equivalent to those of the 16-item PSWQ.
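The ROC logic described above can be sketched as follows. The AUC is computed as the probability that a randomly chosen clinical participant scores higher than a randomly chosen non-clinical participant (ties counted as one half), which is equivalent to the area under the empirical ROC curve. The cut-off is chosen with Youden's J (sensitivity + specificity − 1); this selection rule is an assumption made for illustration, since the exact rule is not stated in this section, and the function names and scores are hypothetical.

```python
def auc(clinical, nonclinical):
    """Area under the ROC curve via pairwise comparison of the two groups."""
    wins = 0.0
    for c in clinical:
        for n in nonclinical:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(clinical) * len(nonclinical))


def best_cutoff(clinical, nonclinical):
    """Cut-off maximising Youden's J; scores >= cut-off are classed as GAD.

    Returns (cutoff, sensitivity, specificity). If several cut-offs tie,
    the lowest one is returned.
    """
    best = None
    for cutoff in sorted(set(clinical) | set(nonclinical)):
        sensitivity = sum(s >= cutoff for s in clinical) / len(clinical)
        specificity = sum(s < cutoff for s in nonclinical) / len(nonclinical)
        j = sensitivity + specificity - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]
```

Because the AUC depends only on the rank ordering of scores across groups, a short form can reach a high AUC even when many non-clinical respondents cluster at its minimum score, which is consistent with the pattern reported for the PSWQ-3.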
The two briefer versions (i.e., PSWQ-A and PSWQ-3) appear to have relatively comparable psychometric properties to the original 16-item PSWQ in a GAD sample. Researchers and clinicians who need to measure pathological worry under time constraints may therefore consider either shorter version. These briefer forms are particularly beneficial in clinical assessments where a broad range of symptoms is being evaluated, as they help mitigate client questionnaire fatigue, making them effective and efficient options for inclusion in assessment batteries. Another instance where time constraints may favour the use of a briefer version is in tracking weekly treatment progress for adults with GAD. The ultra-brief PSWQ-3 is particularly advantageous in this context due to its quick administration time and its resistance to floor effects within a GAD sample. A further consideration for researchers and clinicians using the 16-item PSWQ is that the negatively worded items not only appear to increase cognitive load during completion, but can also make real-time scoring on pen and paper forms difficult. Without a reverse-score template, clinicians are not able to quickly inspect whether the general pattern of worry is decreasing (or increasing) over the course of treatment. One way to overcome this difficulty would be to automate scoring for the 16-item PSWQ through a secure survey platform (e.g., REDCap). However, for clinicians who prefer pen and paper forms, the briefer versions of the PSWQ provide a readily available solution that is psychometrically comparable. Thus, not only do the briefer versions avoid the methodological problems related to the reverse-scored items on the 16-item PSWQ, but they also increase the speed of completion for participants or consumers in clinical practice.
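As an illustration of what automated scoring of the 16-item form involves, the sketch below reverse-scores the negatively worded items before summing. The reverse-keyed positions (items 1, 3, 8, 10, and 11) and the 1 to 5 response scale follow the commonly cited PSWQ scoring key but are not stated in this section, so both are assumptions that should be checked against the published form. The function name is hypothetical.

```python
# Assumed reverse-keyed item positions (1-based); verify against the published key.
REVERSE_ITEMS = frozenset({1, 3, 8, 10, 11})


def score_pswq(responses, reverse_items=REVERSE_ITEMS, scale_min=1, scale_max=5):
    """Total score for the 16-item PSWQ from raw responses in item order."""
    if len(responses) != 16:
        raise ValueError("expected 16 item responses")
    total = 0
    for position, value in enumerate(responses, start=1):
        if not scale_min <= value <= scale_max:
            raise ValueError(f"item {position} is outside the response scale")
        if position in reverse_items:
            # Reverse scoring maps 1 -> 5, 2 -> 4, ..., 5 -> 1.
            value = scale_min + scale_max - value
        total += value
    return total
```

A respondent who circles 5 on every item would not receive the maximum total under this key, because the five reverse-keyed items each contribute 1; this is the kind of pattern that is hard to read at a glance on a paper form. The short forms contain no reverse-keyed items, so their totals are plain sums.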
It is important to highlight potential limitations of the study. First, the PSWQ-A and PSWQ-3 were not administered separately from the 16-item PSWQ, similar to the methodology used in Wuthrich et al. (
2014). Though this avoids potential problems with repeated measure effects, it prevents direct comparison of the measures’ performance outside of the context of the full version. This limitation may also mean that there is an overestimation of the correlation between the versions, as the items on the 8-item and 3-item forms are counted on both sides of the correlation (Smith et al.,
2000). Item position has also been shown to impact questionnaire results (Podsakoff et al.,
2012), which may have influenced the current findings. In addition, treatment sensitivity was not examined in the current study. Previous research has found that the PSWQ (Dear et al.,
2011) and the PSWQ-3 (Berle et al.,
2011) were sensitive to evidence-based treatment, with significant reductions following CBT; however, the treatment sensitivity of the PSWQ-A is yet to be assessed in adults with GAD. Another limitation of the current study was the inability to report on ethnicity and race for both clinical and non-clinical groups. The current sample had a large proportion of missing data for ethnicity in the GAD sample, with only 18.6% of the clinical group reporting their ethnicity. Future research should endeavour to capture ethnicity and race in samples, so that cultural differences can be explored and understood in the reporting of the short and longer forms of the PSWQ. In addition, only a relatively small sub-sample of participants with GAD was available to assess test-retest reliability. This is a methodological limitation, as the quality criteria set out by Terwee et al. (
2007) suggest that a sample size of at least 50 participants is required to meet adequate quality criteria for assessing psychometric properties of a questionnaire. Thus, the current study provides preliminary evidence that all three versions of the PSWQ demonstrate adequate test-retest reliability in adults with GAD, and future research should attempt to replicate this in a larger sample.
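The overlap limitation noted earlier in this section, where short-form items are counted on both sides of the correlation with the full form, can be quantified with a simplified worked example. The sketch assumes standardised items that all share the same inter-item correlation `rho`, which is an idealisation and not a property of the current data. It compares the correlation of a k-item short form with the n-item total (part-whole) against its correlation with only the remaining n − k items (part-remainder). The function names are hypothetical.

```python
from math import sqrt


def part_whole_r(k, n, rho):
    """Correlation between a k-item sum and the n-item total that contains it."""
    var_part = k + k * (k - 1) * rho
    var_whole = n + n * (n - 1) * rho
    # Covariance with the whole = variance of the part + covariance with the rest.
    cov = var_part + k * (n - k) * rho
    return cov / sqrt(var_part * var_whole)


def part_remainder_r(k, n, rho):
    """Correlation between a k-item sum and the sum of the other n - k items."""
    m = n - k
    var_part = k + k * (k - 1) * rho
    var_rest = m + m * (m - 1) * rho
    cov = k * m * rho
    return cov / sqrt(var_part * var_rest)
```

Under these assumptions, even completely unrelated items (`rho = 0`) yield a part-whole correlation of sqrt(k / n), roughly 0.71 for 8 of 16 items and 0.43 for 3 of 16 items, whereas the part-remainder correlation is zero. This illustrates why administering the short forms separately from the full form would give a more conservative estimate of their agreement.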
Overall, the shorter versions (PSWQ-A and PSWQ-3) performed with psychometric equivalence to the full 16-item PSWQ in a sample of individuals with GAD. Therefore, clinicians and researchers may prefer to utilise these self-report instruments, since they are both quicker to complete and score, and avoid scoring and psychometric issues caused by reverse-scored items, thereby potentially easing questionnaire burnout in the research space as well as facilitating real-time discussions in clinical practice.