Participants
This is a secondary analysis of data from a probability-based sample of the general US population. Participants were recruited from KnowledgePanel in September and October 2022 [17]. KnowledgePanel is a high-quality, probability-based panel whose members are recruited through address-based sampling using the most recent delivery sequence file of the US Postal Service. A random sample of 7,224 of the approximately 55,000 KnowledgePanel members was offered the opportunity to participate in the survey [18, 19]. KnowledgePanel conducted several quality control measures, and the research team included 2 fake conditions within a list of chronic health conditions to identify and exclude careless or insincere respondents [20]. Of the 7,224 invited members, 4,149 agreed to participate, but 19 were excluded because they endorsed one of the two fake conditions (“Syndomitis” and “Checkalism”) [20]. Of the remaining 4,130 baseline participants, those experiencing back pain (n = 1,533) were selected for a follow-up survey. A total of 277 did not complete the 6-month follow-up survey, leaving 1,256 participants in the 6-month follow-up analysis sample (Supplemental Figure S1).
The study protocol was reviewed and approved by the research team’s institutional review board (RAND Human Subjects Research Committee FWA00003425; IRB00000051). The data set analyzed for this study is publicly available from the ICPSR data repository under number openicpsr-198049.
Measures
Participant information: At baseline, participants were asked demographic questions and whether they had any chronic conditions including hypertension, high cholesterol, coronary heart disease, angina, heart attack, stroke, asthma, cancer, diabetes, chronic obstructive pulmonary disease, arthritis or rheumatoid arthritis, anxiety disorder, depression, chronic allergies, back pain, chronic back pain, sciatica, neck pain, trouble seeing, dermatitis, stomach trouble, trouble hearing, trouble sleeping, and 2 fake conditions (“Syndomitis” and “Checkalism”). At 6-month follow-up, participants were asked again if they had hypertension, anxiety, and depression.
PROMIS items: As part of the larger study, participants were asked 4 to 8 items from 8 of the PROMIS domain item banks for a total of 50 PROMIS items. Participants answered all items from a domain (e.g., 8 items from the PROMIS Physical Function item bank) before answering the next domain. The selected items included all items in the PROMIS-29 + 2 and PROMIS-16 described below as well as 14 additional items. The PROMIS-29 + 2 and the PROMIS-16 share 11 items.
PROMIS-29 + 2: The PROMIS-29 + 2 Profile evaluates 8 health domains: physical function, ability to participate in social roles and activities, anxiety, depression, sleep disturbance, pain interference, and fatigue, with 4 items per domain, and cognitive function – abilities, with 2 items. It also includes a single pain intensity item [5]. The domains were scored using IRT-based T-scores from standard PROMIS documentation. T-scores are designed such that 50 is the population mean with a standard deviation of 10. Higher values indicate more of the concept being measured (i.e., higher scores indicate better HRQoL in functioning domains and worse HRQoL in symptom domains). Physical health and mental health summary scores were also calculated from the PROMIS-29 + 2 [7]. Participants were asked to complete the PROMIS-29 + 2 at baseline and 6-month follow-up.
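As a minimal illustration of the T-score metric described above (not the IRT scoring procedure itself, which follows the standard PROMIS documentation), a score expressed on a standardized scale with mean 0 and SD 1 in the reference population maps to a T-score by a linear transformation. The function name `to_t_score` and the input `z` are hypothetical and used only for illustration.

```python
def to_t_score(z):
    """Map a standardized score (reference mean 0, SD 1) to the T-score metric.

    Illustrative only: this shows the metric (mean 50, SD 10), not how
    PROMIS item responses are converted to scores.
    """
    return 50 + 10 * z
```

Under this metric, a respondent one SD above the reference mean would have a T-score of 60, which is better HRQoL in a functioning domain and worse HRQoL in a symptom domain.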
PROMIS-16: The PROMIS-16 Profile evaluates 8 health domains: physical function, ability to participate in social roles and activities, anxiety, depression, sleep disturbance, pain interference, cognitive function – abilities, and fatigue, with 2 items per domain. We generated IRT-based T-scores for each domain following PROMIS conventions as described in Edelen et al. [6]. Participants were asked the PROMIS-16 items at baseline and 6-month follow-up.
PROPr: The PROPr score is calculated from 7 PROMIS domain scores: cognitive function – abilities, depression, fatigue, pain interference, physical function, sleep disturbance, and ability to participate in social roles [14]. The PROPr scoring algorithm is linked directly to the PROMIS domain T-scores, rather than to individual items, allowing the domain scores to be collected by different administration methods (e.g., computer adaptive test, 4-item short form, 2-item short form). The PROPr scoring algorithm was developed from standard gamble valuations from a US sample of 943 adults. Possible PROPr scores range from −0.022 to 1.0, with dead anchored at 0 and full health anchored at 1.0 [11]. PROPr scores were calculated using each profile, hereafter referred to as the “PROPr16” and “PROPr29+2.”
Pain-specific measures: To validate PROPr scores derived from the PROMIS-16, we included 4 pain measures: (1) the Oswestry Disability Index (ODI) [21, 22], (2) the Roland-Morris Disability Questionnaire (RMDQ) [23], (3) the Pain Intensity, Interference with Enjoyment of Life, Interference with General Activity Scale (PEG) [24], and (4) the Graded Chronic Pain Scale (GCPS) [25]. The ODI measures pain interference and functional disability using 10 items that assess pain intensity, personal care, lifting, walking, sitting, standing, sleeping, sex life, social life, and traveling. Each item is rated on a 0 to 5 scale, yielding a total sum score ranging from 0 to 50. We transformed the sum score to a percentage scale from 0 to 100, categorizing disability into minimal (0–20%), moderate (21–40%), severe (41–60%), disabling (61–80%), and bedridden or functional impairment (81–100%). The RMDQ, with a range from 0 to 24, evaluates whether back pain has an impact on 24 daily activities, with higher scores indicating greater impact. The PEG uses a single item to assess pain intensity and 2 items to assess interference with enjoyment of life and general activities. Each item is rated on a 0 to 10 scale, and the total score, ranging from 0 to 10, is the average of these 3 item scores. Lastly, the GCPS has 3 pain intensity items and 4 disability items. Following previous studies, we scored the GCPS and classified severity as (1) no pain, (2) low disability – low intensity, (3) low disability – high intensity, (4) high disability – moderately limiting, and (5) high disability – severely limiting [26]. Participants were asked to complete all 4 pain measures at baseline and 6-month follow-up.
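The ODI and PEG scoring rules described above can be sketched as follows. This is a minimal illustration assuming complete item responses; the function names are hypothetical, and the handling of missing items is not specified in the text and is not addressed here.

```python
def odi_percent(item_scores):
    """Convert 10 ODI item ratings (each 0-5) to a 0-100 percentage score."""
    if len(item_scores) != 10 or any(not 0 <= s <= 5 for s in item_scores):
        raise ValueError("expected 10 item scores, each between 0 and 5")
    # Sum score ranges from 0 to 50; rescale to 0-100.
    return sum(item_scores) * 100 / 50


def odi_category(percent):
    """Map an ODI percentage to the disability bands listed in the text."""
    if not 0 <= percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent <= 20:
        return "minimal"
    if percent <= 40:
        return "moderate"
    if percent <= 60:
        return "severe"
    if percent <= 80:
        return "disabling"
    return "bedridden or functional impairment"


def peg_score(intensity, enjoyment, activity):
    """Average the 3 PEG items (each rated 0-10) into a 0-10 total score."""
    items = (intensity, enjoyment, activity)
    if any(not 0 <= s <= 10 for s in items):
        raise ValueError("each PEG item must be between 0 and 10")
    return sum(items) / 3
```

Because the ODI sum score is an integer from 0 to 50, the percentage always falls on an even number, so the band boundaries (e.g., 20% versus 21%) never leave a score unclassified.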
Analysis
First, we examined the difference in the score distributions between PROPr16 and PROPr29+2 using the Kolmogorov-Smirnov test and score correlations using product-moment correlations [27]. Second, we calculated mean scores for the overall sample and for subsets with different health conditions and calculated the standardized mean difference (Cohen’s d) between group mean estimates from PROPr16 and PROPr29+2; a Cohen’s d less than 0.2 indicates a trivial difference [28]. Third, PROPr score correlations with the pain and disability measures (GCPS, ODI, RMDQ, and PEG) and PROMIS summary scores were calculated using product-moment correlations [27]. Fourth, the impact of the respondent’s health condition on the PROPr16 and PROPr29+2 scores was estimated in multivariable linear regression analyses controlling for age and sex. The regression coefficient for the health condition is its impact estimate. Fifth, we created a Bland-Altman plot of the difference between the PROPr16 and PROPr29+2 scores against their average to help identify any systematic differences between the two scores [29]. The 95% upper and lower limits of agreement were estimated as the mean difference (bias) ± 1.96 × the SD of the differences. Scatter bias is present when the amount of disagreement varies with the average of the two estimates. Finally, participants were categorized into 3 groups based on the change in pain severity from baseline to the 6-month follow-up, as measured by the ODI and GCPS: “decreased,” “no change,” or “increased” [22, 25]. For each group, we calculated the change in PROPr16 and PROPr29+2 scores. We then examined whether the differences between the 2 profiles in these changes were statistically significant using the Wilcoxon rank sum test to account for the non-normality of the change score distributions.
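Two of the quantities above, the standardized mean difference and the Bland-Altman limits of agreement, can be sketched in a few lines. This is an illustration only: the study analyses were performed in SAS, the function names are hypothetical, and the pooled-SD form of Cohen’s d is an assumption because the text does not state which variant was used.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(x, y):
    """Standardized mean difference between two sets of scores.

    Assumes the pooled sample SD as the denominator; the source text
    does not specify the variant.
    """
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)


def bland_altman_limits(x, y):
    """Return (bias, lower limit, upper limit) for paired scores.

    bias  = mean of the paired differences
    limits = bias -/+ 1.96 * SD of the paired differences
    """
    if len(x) != len(y):
        raise ValueError("x and y must be paired (same length)")
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


def bland_altman_points(x, y):
    """Return (average, difference) pairs, the coordinates of a Bland-Altman plot."""
    return [((a + b) / 2, a - b) for a, b in zip(x, y)]
```

Here `x` and `y` would hold each participant’s PROPr16 and PROPr29+2 scores in the same order. A Cohen’s d below 0.2 in absolute value would be read as a trivial difference, and a spread of differences that widens or narrows across the averages returned by `bland_altman_points` would suggest scatter bias.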
A 2-sided p-value less than 0.05 was considered statistically significant for all statistical analyses. Analyses were performed in SAS version 9.4.