Extraction form for project: SGS 2022 POP PROM

Design Details

1. Duplicate?
This question allows you to skip redundant questions. If the answer is yes, you may be able to skip the Risk of Bias sheet (unless the subscales etc. were assessed with poorer, or more incomplete, methodology than the main scale).
Is this a secondary extraction for a sub-scale of an already extracted scale?
Is this an extraction for a second (or later) PROM from an already extracted article (with the same population, etc.)?
2. Population (POP eligibility criteria)
3. Country
4. Specific setting or location
5. Specific feature that might affect generalizability
6. No. participants analyzed
7. Age
Low bound | High bound
Mean
SD
SE
95% CI
Median
Interquartile range (IQR)
Full range (might come from eligibility criteria)
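If only an SE or a 95% CI is reported, the SD can usually be back-calculated. A minimal sketch (Python, made-up numbers): SD = SE × √N, and the half-width of a 95% CI is approximately 1.96 × SE.
    import math

    n = 120                    # hypothetical number of participants
    se = 0.9                   # hypothetical reported standard error of the mean
    sd_from_se = se * math.sqrt(n)             # SD = SE * sqrt(N)

    ci_lo, ci_hi = 52.4, 55.9  # hypothetical reported 95% CI for the mean
    se_from_ci = (ci_hi - ci_lo) / (2 * 1.96)  # CI half-width = 1.96 * SE
    sd_from_ci = se_from_ci * math.sqrt(n)
    print(round(sd_from_se, 1), round(sd_from_ci, 1))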
8. POP-Q, %
0-100% (not proportion 0-1)
Low bound
NR
Stage 0
Stage 1
Stages 0-1
Stage 2
Stage 3
Stage 4
Stages 3-4
Other POP-Q info
9. PROM evaluated
Be specific, particularly for subscales, domains, etc.
Name of full measure
Subscale name
10. Reference(s) for original version of PROM
If the article references the original version of the PROM (or the creation of the PROM), list the references (either PMID(s) or full references).
11. Reference(s) for other validations of PROM
If the article references other articles that validate the PROM, list the references (either PMID(s) or full references).
12. PROM description
For subscales, etc., you can skip this if it is all identical to the overall (already extracted) PROM (or if not relevant). For "List of Domains", please copy and paste the names of the domains, separated by commas. Note that the worst score may be the minimum or maximum score (same for best score).
No. of items
No. of domains
List of domains
If there are domains, can results be reported at the domain level?
How is the PROM scored (e.g., Likert 1-5 per question)?
"Worst" score
"Best" score
How to access/where it is available
Method(s) of administration in the article
13. Average PROM score in study
Or subscale/domain/etc. score
Low bound | High bound
Mean
SD
SE
95% CI
Median
Interquartile range (IQR)
Full range
14. Length of time to complete
Low bound | High bound
Mean
SD
SE
95% CI
Median
Interquartile range (IQR)
Full range
Units
15. Percent of missing PROM scores (or subscale etc. scores)
0-100% (not 0-1 proportion)
16. Was CONTENT or FACE VALIDITY assessed?
You can use the f/up question text box if you have a comment about this question
17. Content/Face validity, per authors
Did the authors claim content or face validity? Copy and paste relevant text.
Y/N
Article text
Note/comment
18. CONTENT/FACE VALIDITY: Extractor comments
19. Was STRUCTURAL VALIDITY assessed?
You can use the f/up question text box if you have a comment about this question
20. Structural Validity, data
The "COSMIN threshold" column is FYI. Do not enter data here.
Value reported | COSMIN threshold | Criterion met? (e.g., beyond threshold) | Other info (e.g., what was the comparator test or which subgroups were compared) | Note/comment
A: Confirmatory factor analysis (CFA)/Unidimensionality: Comparative fit index (CFI)
B: CFA/Unidimensionality: Tucker‐Lewis index (TLI)
C: CFA/Unidimensionality: other measure comparable to CFI or TLI (name in COSMIN threshold column)
D: CFA/Unidimensionality: Root Mean Square Error of Approximation (RMSEA)
E: CFA/Unidimensionality: Standardized Root Mean Residuals (SRMR)
F: Local independence: Maximum residual correlations among the items after controlling for the dominant factor
G: Local independence: Maximum Q3
H: Monotonicity: Adequate looking graphs
I: Monotonicity: Item scalability
J: Model fit: χ2
K: Model fit: Infit mean squares
L: Model fit: Outfit mean squares
M: Model fit: Z-standardized value
Other info on structural validity
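For reference, a minimal sketch (Python, made-up values) of how reported CFA fit indices compare with the commonly cited COSMIN criteria (CFI/TLI > 0.95, RMSEA < 0.06, SRMR < 0.08); the thresholds are assumptions drawn from the COSMIN criteria for good structural validity, not values to enter in the form.
    # Made-up fit indices; thresholds follow the usual COSMIN criteria.
    fit = {"CFI": 0.962, "TLI": 0.951, "RMSEA": 0.052, "SRMR": 0.071}
    criteria = {
        "CFI":   fit["CFI"] > 0.95,
        "TLI":   fit["TLI"] > 0.95,
        "RMSEA": fit["RMSEA"] < 0.06,
        "SRMR":  fit["SRMR"] < 0.08,
    }
    print(criteria)  # all True -> each criterion met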
21. Structural validity, per Classical Test Theory (CTT)
Add comments in the follow-up question, if you'd like
22. Structural validity, per Item Response Theory (IRT)/Rasch
23. Structural validity, per authors
Did the authors claim structural validity? Copy and paste relevant text.
Y/N
Article text
Note/comment
24. STRUCTURAL VALIDITY: Extractor comments
25. Was INTERNAL CONSISTENCY assessed?
You can use the f/up question text box if you have a comment about this question
26. Internal consistency, data
The "COSMIN threshold" column is FYI. Do not enter data here.
Value reported | COSMIN threshold | Criterion met? (e.g., beyond threshold) | Other info (e.g., what was the comparator test or which subgroups were compared) | Note/comment
N: Sufficient structural validity
O: Cronbach's alpha(s) (for each unidimensional scale or subscale)
Other info on internal consistency
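If alpha is not reported but item-level data are available, it can be computed directly. A minimal sketch (Python, fabricated Likert data; the >= 0.70 cutoff is the usual COSMIN criterion for sufficient internal consistency).
    import numpy as np

    def cronbach_alpha(items):
        # items: participants x items matrix of scores
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_var = items.var(axis=0, ddof=1).sum()
        total_var = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_var / total_var)

    scores = np.random.default_rng(0).integers(1, 6, size=(100, 8))  # fake 1-5 Likert data
    a = cronbach_alpha(scores)
    print(round(a, 2), a >= 0.70)  # random data will not clear the cutoff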
27. Internal consistency
28. Internal consistency, per authors
Did the authors claim internal consistency? Copy and paste relevant text.
Y/N
Article text
Note/comment
29. INTERNAL CONSISTENCY: Extractor comments
30. Was RELIABILITY assessed?
You can use the f/up question text box if you have a comment about this question
31. Reliability, data
The "COSMIN threshold" column is FYI. Do not enter data here.
Value reported | COSMIN threshold | Criterion met? (e.g., beyond threshold) | Other info (e.g., what was the comparator test or which subgroups were compared) | Note/comment
Type
P: Intraclass correlation coefficient (ICC)
Q: Weighted kappa
Other info on reliability
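The COSMIN criterion for sufficient reliability is an ICC (or weighted kappa) >= 0.70. A minimal sketch of ICC(2,1) (two-way random effects, absolute agreement, single measurement) computed from ANOVA mean squares; the test-retest data are fabricated.
    import numpy as np

    def icc_2_1(ratings):
        # ratings: subjects x occasions (e.g., test and retest columns)
        Y = np.asarray(ratings, dtype=float)
        n, k = Y.shape
        grand = Y.mean()
        ms_subj = k * ((Y.mean(axis=1) - grand) ** 2).sum() / (n - 1)
        ms_occ = n * ((Y.mean(axis=0) - grand) ** 2).sum() / (k - 1)
        ss_err = ((Y - grand) ** 2).sum() - (n - 1) * ms_subj - (k - 1) * ms_occ
        ms_err = ss_err / ((n - 1) * (k - 1))
        return (ms_subj - ms_err) / (
            ms_subj + (k - 1) * ms_err + k * (ms_occ - ms_err) / n)

    test_retest = np.array([[42, 45], [55, 53], [61, 64], [38, 40], [70, 66]])
    icc = icc_2_1(test_retest)
    print(round(icc, 2), icc >= 0.70)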
32. Reliability
33. Reliability, per authors
Did the authors claim reliability? Copy and paste relevant text.
Y/N
Article text
Note/comment
34. RELIABILITY: Extractor comments
35. Was MEASUREMENT ERROR assessed?
You can use the f/up question text box if you have a comment about this question
36. Measurement error, data
ENTER MIC IN COSMIN COLUMN FOLLOW-UP QUESTION (select the <MIC text first)
Value reported | COSMIN threshold | Criterion met? (e.g., beyond threshold) | Other info (e.g., what was the comparator test or which subgroups were compared) | Note/comment
R: Smallest detectable change (SDC)
S: Limits of agreement (LoA)
Time interval
Other info on measurement error
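When only the test-retest SD and ICC are reported, SEM and SDC can be derived: SEM = SD × √(1 − ICC) and SDC(individual) = 1.96 × √2 × SEM. A minimal sketch with made-up values; the SDC-versus-MIC comparison is the COSMIN criterion referenced above.
    import math

    sd = 8.2                         # hypothetical pooled SD of test and retest scores
    icc = 0.85                       # hypothetical test-retest ICC
    sem = sd * math.sqrt(1 - icc)    # standard error of measurement
    sdc = 1.96 * math.sqrt(2) * sem  # smallest detectable change (individual)
    mic = 10.0                       # hypothetical minimal important change
    print(round(sdc, 1), sdc < mic)  # sufficient if SDC (or LoA) < MIC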
37. Measurement error
38. Measurement error, per authors
Did the authors claim small measurement error? Copy and paste relevant text.
Y/N
Article text
Note/comment
39. MEASUREMENT ERROR: Extractor comments
40. Was CONSTRUCT VALIDITY assessed?
You can use the f/up question text box if you have a comment about this question
41. Construct validity
What is the hypothesis? (e.g., the measure will be lower with improvement in symptoms)
Type, which comparators? (in f/up Q)
Construct validity?
Supporting data
42. Construct validity, per authors
Did the authors claim construct validity? Copy and paste relevant text.
Y/N
Article text
Note/comment
43. CONSTRUCT VALIDITY: Extractor comments
44. Was CROSS-CULTURAL VALIDITY/MEASUREMENT INVARIANCE assessed?
You can use the f/up question text box if you have a comment about this question
45. Cross-cultural validity/measurement invariance, data
The "COSMIN threshold" column is FYI. Do not enter data here.
Value reported | COSMIN threshold | Criterion met? (e.g., beyond threshold) | Other info (e.g., what was the comparator test or which subgroups were compared) | Note/comment
Groups compared/evaluated
T: No important differences found between group factors (such as age, gender, language) in multiple group factor analysis
U: McFadden's R² for differential item functioning (DIF) across group factors
Other info on measurement invariance
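The McFadden R² criterion is commonly implemented as the change in pseudo-R² between nested models with and without the group term (no important DIF if the change is < 0.02). A minimal sketch for one dichotomous item with fabricated data and statsmodels; the ordinal case is analogous, and this is an assumed implementation, not the form's required method.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 400
    total = rng.normal(0, 1, n)                  # total-score proxy for the trait
    group = rng.integers(0, 2, n)                # e.g., language version
    item = (total + rng.normal(0, 1, n) > 0).astype(int)  # fake item responses

    base = sm.Logit(item, sm.add_constant(total)).fit(disp=0)
    full = sm.Logit(item, sm.add_constant(np.column_stack([total, group]))).fit(disp=0)

    mcfadden = lambda m: 1 - m.llf / m.llnull
    delta = mcfadden(full) - mcfadden(base)
    print(round(delta, 4), delta < 0.02)         # < 0.02 -> no important DIF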
46. Cross-cultural validity/measurement invariance
Measurement invariance?
47. Cross-cultural validity/measurement invariance, per authors
Did the authors claim measurement invariance? Copy and paste relevant text.
Y/N
Article text
Note/comment
48. CROSS-CULTURAL VALIDITY/MEASUREMENT INVARIANCE: Extractor comments
49. Was CRITERION VALIDITY assessed?
You can use the f/up question text box if you have a comment about this question
50. Criterion validity, data
The "COSMIN threshold" column is FYI. Do not enter data here.
Value reported | COSMIN threshold | Criterion met? (e.g., beyond threshold) | Other info (e.g., what was the comparator test or which subgroups were compared) | Note/comment
Gold standard
V: Correlation with gold standard
W: Area under curve (AUC)
Other info on criterion validity
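The usual COSMIN criterion is a correlation with the gold standard >= 0.70, or AUC >= 0.70. A minimal sketch with fabricated data (scikit-learn for the AUC).
    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(2)
    gold = rng.integers(0, 2, 150)              # dichotomous gold standard (fake)
    prom = gold * 12 + rng.normal(50, 10, 150)  # fake PROM scores
    auc = roc_auc_score(gold, prom)
    print(round(auc, 2), auc >= 0.70)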
51. Criterion validity
Criterion validity?
52. Criterion validity, per authors
Did the authors claim criterion validity? Copy and paste relevant text.
Y/N
Article text
Note/comment
53. CRITERION VALIDITY: Extractor comments
54. Was RESPONSIVENESS assessed?
You can use the f/up question text box if you have a comment about this question
55. Responsiveness, data
The "COSMIN threshold" column is FYI. Do not enter data here.
Value reported | COSMIN threshold | Criterion met? (e.g., beyond threshold) | Other info (e.g., what was the comparator test or which subgroups were compared) | Note/comment
Other measures, or treatment, or time period
X: The result is in accordance with the hypothesis
Y: Area under curve (AUC)
Other info on responsiveness
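For construct-approach responsiveness, the commonly used default hypothesis is a correlation >= 0.50 between change scores on the PROM and change scores on a comparator measuring a similar construct. A minimal sketch with fabricated change scores; the 0.50 cutoff is an assumed generic hypothesis, not a value from this form.
    import numpy as np

    rng = np.random.default_rng(3)
    true_change = rng.normal(0, 5, 120)
    change_prom = true_change + rng.normal(0, 3, 120)  # fake change on the PROM
    change_comp = true_change + rng.normal(0, 3, 120)  # fake change on a comparator
    r = np.corrcoef(change_prom, change_comp)[0, 1]
    print(round(r, 2), r >= 0.50)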
56. Responsiveness
Responsiveness?
57. Responsiveness, per authors
Did the authors claim responsiveness? Copy and paste relevant text.
Y/N
Article text
Note/comment
58. RESPONSIVENESS: Extractor comments
59. Was FLOOR/CEILING EFFECT assessed?
You can use the f/up question text box if you have a comment about this question
60. Floor/Ceiling effect
0-100% (not 0-1 proportion)
% with floor value
% with ceiling value
NR
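A minimal sketch of the computation, assuming a 0-100 scale with 0 the floor and 100 the ceiling; more than 15% of respondents at either extreme is the commonly used flag.
    import numpy as np

    rng = np.random.default_rng(4)
    scores = rng.integers(0, 101, 200)      # fake 0-100 PROM scores
    floor_pct = 100 * np.mean(scores == 0)    # % at the worst possible score
    ceil_pct = 100 * np.mean(scores == 100)   # % at the best possible score
    print(floor_pct, ceil_pct)              # > 15% at either end is the usual flag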
61. FLOOR/CEILING EFFECT: Extractor comments
62. Was MCID assessed?
You can use the f/up question text box if you have a comment about this question
63. MCID reported
MCID value
How determined?
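One common anchor-based determination is the mean change score among participants who rate themselves "minimally improved" on a global rating anchor; distribution-based and ROC-based methods also exist. A minimal sketch with fabricated data.
    import numpy as np

    rng = np.random.default_rng(5)
    anchor = rng.integers(0, 3, 150)  # 0 = unchanged, 1 = minimally improved, 2 = much improved
    change = anchor * 6 + rng.normal(0, 4, 150)  # fake pre-post change scores
    mcid = change[anchor == 1].mean()            # anchor-based MCID estimate
    print(round(mcid, 1))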
64. MCID: Extractor comments
65. Other information about validity, reliability, measurement error, etc.?
66. Other comments/notes
67. 2nd review done?
Add your name when you've completed the second review. Add comments or questions or notes, if necessary
Name
Comments

Risk of Bias Assessment

1. Constructs assessed
You need to answer this question to get to the appropriate RoB questions
2. Structural validity/Classical Test Theory: Was exploratory or confirmatory factor analysis performed?
Adequacy
Extractor comment
3. Structural validity/Item Response Theory/Rasch: Does the chosen model fit to the research question?
Adequacy
Extractor comment
4. Structural validity: Was the sample size included in the analysis adequate?
*VERY GOOD*: FA: N ≥ 7 times the number of items, and N ≥ 100. Rasch/1PL models: N ≥ 200. 2PL parametric IRT models or Mokken scale analysis: N ≥ 1000.
*ADEQUATE*: FA: N at least 5 times the number of items and N ≥ 100, OR at least 6 times the number of items (if N < 100). Rasch/1PL models: N 100-199. 2PL parametric IRT models or Mokken scale analysis: N 500-999.
*DOUBTFUL*: FA: N at least 5 times the number of items, but N < 100. Rasch/1PL models: N 50-99. 2PL parametric IRT models or Mokken scale analysis: N 250-499.
*INADEQUATE*: FA: N < 5 times the number of items. Rasch/1PL models: N < 50. 2PL parametric IRT models or Mokken scale analysis: N < 250.
(For the FA branch, see the sketch after this question.)
Adequacy
Extractor comment
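A minimal sketch encoding the FA branch of the criteria above (hypothetical helper; the Rasch/IRT/Mokken branches follow the same pattern with their own N cutoffs).
    def fa_sample_size_adequacy(n, n_items):
        # COSMIN RoB rating for factor-analysis sample size (FA branch only)
        ratio = n / n_items
        if ratio >= 7 and n >= 100:
            return "very good"
        if (ratio >= 5 and n >= 100) or (ratio >= 6 and n < 100):
            return "adequate"
        if ratio >= 5:  # i.e., 5-6 times the number of items, but n < 100
            return "doubtful"
        return "inadequate"

    print(fa_sample_size_adequacy(240, 20))  # ratio 12, n >= 100 -> "very good"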
5. Structural validity: Were there any other important flaws in the design or statistical methods of the study?
Describe other flaws
Adequacy
Extractor comment
6. Internal consistency: Was an internal consistency statistic calculated for each unidimensional scale or subscale separately?
Adequacy
Extractor comment
7. Internal consistency/continuous scores: Was Cronbach’s alpha or omega calculated?
Adequacy
Extractor comment
8. Internal consistency/dichotomous scores: Was Cronbach’s alpha or KR‐20 calculated?
Adequacy
Extractor comment
9. Internal consistency/IRT‐based scores: Was standard error of the theta (SE (θ)) OR reliability coefficient of estimated latent trait value (index of (subject or item) separation) calculated?
Adequacy
Extractor comment
10. Cross‐cultural validity/measurement invariance: Were the samples similar for relevant characteristics except for the group variable?
Adequacy
Extractor comment
11. Cross-cultural validity/Measurement invariance: Was an appropriate approach used to analyze the data?
Adequacy
Extractor comment
12. Cross-cultural validity/Measurement invariance: Was the sample size included in the analysis adequate?
*VERY GOOD*: Regression analyses or IRT/Rasch-based analyses: N ≥ 200 per group. MGCFA: N ≥ 7 times the number of items, and N ≥ 100.
*ADEQUATE*: Regression analyses or IRT/Rasch-based analyses: N ≥ 150 per group. MGCFA: N ≥ 5 times the number of items and N ≥ 100, OR 5-7 times the number of items but N < 100.
*DOUBTFUL*: Regression analyses or IRT/Rasch-based analyses: N ≥ 100 per group. MGCFA: N ≥ 5 times the number of items, but N < 100.
*INADEQUATE*: Regression analyses or IRT/Rasch-based analyses: N < 100 per group. MGCFA: N < 5 times the number of items.
Adequacy
Extractor comment
13. Reliability: Were patients stable in the interim period on the construct to be measured?
Adequacy
Extractor comment
14. Reliability: Was the time interval appropriate?
Adequacy
Extractor comment
15. Reliability: Were the test conditions similar for the measurements?
E.g. type of administration, environment, instructions
Adequacy
Extractor comment
16. Reliability/continuous scores: Was an intraclass correlation coefficient (ICC) calculated?
*VERY GOOD*: ICC calculated, and the model or formula of the ICC is described.
*ADEQUATE*: ICC calculated but the model or formula of the ICC not described or not optimal, OR Pearson or Spearman correlation coefficient calculated with evidence provided that no systematic change has occurred.
*DOUBTFUL*: Pearson or Spearman correlation coefficient calculated WITHOUT evidence provided that no systematic change has occurred, or WITH evidence that systematic change has occurred.
*INADEQUATE*: No ICC or Pearson or Spearman correlation calculated.
Adequacy
Extractor comment
17. Reliability/Dichotomous, nominal, ordinal scores: Was kappa calculated?
Adequacy
Extractor comment
18. Reliability/ordinal scores: Was a weighted kappa calculated?
Adequacy
Extractor comment
19. Reliability/ordinal scores: Was the weighting scheme described?
E.g. linear, quadratic
Adequacy
Extractor comment
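A minimal sketch of a quadratically weighted kappa on fabricated ordinal test-retest scores (scikit-learn; >= 0.70 is the usual COSMIN reliability criterion).
    from sklearn.metrics import cohen_kappa_score

    test = [0, 1, 2, 2, 3, 1, 0, 2, 3, 1]    # fake ordinal scores at test
    retest = [0, 1, 2, 3, 3, 1, 1, 2, 3, 1]  # fake ordinal scores at retest
    kappa = cohen_kappa_score(test, retest, weights="quadratic")
    print(round(kappa, 2), kappa >= 0.70)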
20. Reliability: Were there any other important flaws in the design or statistical methods of the study?
Describe other flaws
Adequacy
Extractor comment
21. Measurement error: Were patients stable in the interim period on the construct to be measured?
Adequacy
Extractor comment
22. Measurement error: Was the time interval appropriate?
Adequacy
Extractor comment
23. Measurement error: Were the test conditions similar for the measurements?
E.g., type of administration, environment, instructions
Adequacy
Extractor comment
24. Measurement error/continuous scores: Were the Standard Error of Measurement (SEM), Smallest Detectable Change (SDC) or Limits of Agreement (LoA) calculated?
Adequacy
Extractor comment
25. Measurement error/Dichotomous, nominal, ordinal scores: Was the percentage (positive and negative) agreement calculated?
Adequacy
Extractor comment
26. Criterion validity/continuous scores: Were correlations, or the area under the receiver operating characteristic (ROC) curve, calculated?
Adequacy
Extractor comment
27. Criterion validity/dichotomous scores: Were sensitivity and specificity determined?
Adequacy
Extractor comment
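A minimal sketch computing sensitivity and specificity from a fabricated 2x2 cross-tabulation of the dichotomized PROM against the gold standard.
    import numpy as np

    gold = np.array([1, 1, 0, 1, 0, 0, 1, 0, 1, 0])  # gold standard (fake)
    test = np.array([1, 1, 0, 0, 0, 1, 1, 0, 1, 0])  # PROM dichotomized at a cutoff (fake)
    tp = np.sum((test == 1) & (gold == 1))
    fn = np.sum((test == 0) & (gold == 1))
    tn = np.sum((test == 0) & (gold == 0))
    fp = np.sum((test == 1) & (gold == 0))
    print(tp / (tp + fn), tn / (tn + fp))            # sensitivity, specificity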
28. Criterion validity: Were there any other important flaws in the design or statistical methods of the study?
Describe other flaws
Adequacy
Extractor comment
29. Construct validity/Convergent: Is it clear what the comparator instrument(s) measure(s)?
Comparison with other outcome measurement instruments
Adequacy
Extractor comment
30. Construct validity/Convergent: Were the measurement properties of the comparator instrument(s) sufficient?
Comparison with other outcome measurement instruments.
*VERY GOOD*: Sufficient measurement properties of the comparator instrument(s) in a population similar to the study population.
*ADEQUATE*: Sufficient measurement properties of the comparator instrument(s), but not sure whether these apply to the study population.
*DOUBTFUL*: Some information on measurement properties of the comparator instrument(s) in any study population.
*INADEQUATE*: No information on the measurement properties of the comparator instrument(s), OR evidence of insufficient measurement properties of the comparator instrument(s).
Adequacy
Extractor comment
31. Construct validity/Convergent: Were design and statistical methods adequate for the hypotheses to be tested?
Comparison with other outcome measurement instruments
Adequacy
Extractor comment
32. Construct validity/Discriminative or known-groups: Was an adequate description provided of important characteristics of the subgroups?
Comparison between subgroups
Adequacy
Extractor comment
33. Construct validity/Discriminative or known-groups: Were design and statistical methods adequate for the hypotheses to be tested?
Comparison between subgroups
Adequacy
Extractor comment
34. Responsiveness/Criterion approach/Continuous: Were correlations between change scores, or the area under the receiver operating characteristic (ROC) curve, calculated?
Comparison to a gold standard
Adequacy
Extractor comment
35. Responsiveness/Criterion approach/Dichotomous: Were sensitivity and specificity (changed versus not changed) determined?
Comparison to a gold standard
Adequacy
Extractor comment
36. Responsiveness/Construct approach (other PROM): Is it clear what the comparator instrument(s) measure(s)?
Hypotheses testing; comparison with other outcome measurement instruments
Adequacy
Extractor comment
37. Responsiveness/Construct approach (other PROM): Were the measurement properties of the comparator instrument(s) sufficient?
Hypotheses testing; comparison with other outcome measurement instruments.
*VERY GOOD*: Sufficient measurement properties of the comparator instrument(s) in a population similar to the study population.
*ADEQUATE*: Sufficient measurement properties of the comparator instrument(s), but not sure whether these apply to the study population.
*DOUBTFUL*: Some information on measurement properties of the comparator instrument(s) in any study population.
*INADEQUATE*: No information on the measurement properties of the comparator instrument(s), OR evidence of insufficient quality of the comparator instrument(s).
Adequacy
Extractor comment
38. Responsiveness/Construct approach (other PROM): Were design and statistical methods adequate for the hypotheses to be tested?
Hypotheses testing; comparison with other outcome measurement instruments
Adequacy
Extractor comment
39. Responsiveness/Construct approach (other PROM): Were there any other important flaws in the design or statistical methods of the study?
Describe other flaws
Adequacy
Extractor comment
40. Responsiveness/Construct (subgroups): Was an adequate description provided of important characteristics of the subgroups?
Comparison between subgroups
Adequacy
Extractor comment
41. Responsiveness/Construct (subgroups): Were design and statistical methods adequate for the hypotheses to be tested?
Comparison between subgroups
Adequacy
Extractor comment
42. Responsiveness/Construct (before-after): Was an adequate description of the intervention provided?
Adequacy
Extractor comment
43. Responsiveness/Construct (before-after): Were design and statistical methods adequate for the hypotheses to be tested?
Before and after intervention
Adequacy
Extractor comment
44. MCID: Was the method to define MCID appropriate?
NOTE: THIS QUESTION IS NOT IN COSMIN
Adequacy
Extractor comment