Journal of Rehabilitation Medicine 51-3 | Page 50

196 R. Maritz et al.

Level 1 all 4 subsamples were analysed separately (MSKt1, MSKt2, NEURt1 and NEURt2). In Level 2 the rehabilitation group and time-point subsamples were aggregated respectively (MSKt1&t2, NEURt1&t2, t1MSK&NEUR, t2MSK&NEUR). Level 3 represents the aggregation of all 4 subsamples, i.e. the entire calibration sample (FIM_all). Together, these 3 aggregation levels resulted in 9 analysis steps.

For both testlet approaches, the emphasis is on making existing assessment tools work without the need to delete items or change the scoring structure.

Differential Item Functioning strategy

DIF was analysed in situations in which local dependencies could be accommodated satisfactorily with testlets. Where a lack of group invariance was observed, the testlets for the contextual factor were split on the basis of the strongest DIF, and continued until no further DIF was present (33). The split and unsplit solutions were then compared with each other on the basis of the Rasch person estimates, anchored to each other with an unsplit item free of DIF. An effect size calculation, based on the mean of the person estimates, their standard deviations, and the correlation of the split and unsplit version (34) was applied to determine whether DIF split was necessary for the final transformation table. If the effect size was below 0.2, DIF was considered small (35) and no action was taken to adjust for DIF.

Transformation table

The second specific aim of this study was to develop a transformation table in case fit to the Rasch model could be achieved. The solution with the best fit to the Rasch model was taken as a basis for this transformation, i.e. the solution with the most satisfactory core values for the entire calibration sample. The transformation table from FIM™ raw ordinal total scores to the corresponding interval-scaled values was based on the respective estimates according to the Rasch model.
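The effect-size criterion described in the DIF strategy can be illustrated with a minimal Python sketch. The exact formula of reference (34) is not reproduced in this section, so the sketch assumes one common formulation for two correlated sets of estimates: the mean difference divided by the standard deviation of the differences, which is derived from the two standard deviations and their correlation. The function names and the example values are hypothetical; only the 0.2 cut-off is taken from the text.

```python
import math


def paired_effect_size(mean_split, sd_split, mean_unsplit, sd_unsplit, r):
    """Standardised mean difference between two correlated sets of person estimates.

    Assumed formulation (not necessarily that of reference 34):
    d = (M_split - M_unsplit) / sqrt(SD_split^2 + SD_unsplit^2 - 2 * r * SD_split * SD_unsplit)
    """
    variance_of_differences = (
        sd_split ** 2 + sd_unsplit ** 2 - 2 * r * sd_split * sd_unsplit
    )
    if variance_of_differences <= 0:
        raise ValueError("The two sets of estimates do not differ; effect size is undefined.")
    return (mean_split - mean_unsplit) / math.sqrt(variance_of_differences)


def dif_split_needed(effect_size, threshold=0.2):
    """Below the threshold, DIF is treated as small and no adjustment is made."""
    return abs(effect_size) >= threshold


if __name__ == "__main__":
    # Hypothetical person-estimate summaries (logits) for a split and an unsplit solution.
    d = paired_effect_size(
        mean_split=0.52, sd_split=1.20, mean_unsplit=0.50, sd_unsplit=1.21, r=0.98
    )
    print(f"effect size = {d:.3f}, split needed: {dif_split_needed(d)}")
```

Other repeated-measures effect sizes standardise by a pooled or average standard deviation instead of the standard deviation of the differences, so the resulting value depends on which definition is adopted.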
RESULTS

Sample characteristics

The calibration sample included 946 cases. Of these, 476 were musculoskeletal cases and 470 neurological cases. A total of 474 cases were from time-point 1 (admission), and 472 from time-point 2 (discharge) (see Fig. 1). FIM™ total scores had a mean of 81.7 (standard deviation (SD) = 27.5, median = 84). The mean age of subjects in the calibration sample was 71.6 years (SD = 14.5, range 20–102 years). The calibration sample was 43% (n = 403) male and 57% (n = 543) female; 41% (n = 392) were from the German-speaking region of Switzerland, 25% (n = 238) from the French-speaking region and 34% (n = 316) from the Italian-speaking region; 84% (n = 798) of the sample were Swiss and 16% (n = 148) had another nationality. Insurance status was: 67% (n = 633) general, 18% (n = 172) semi-private, and 15% (n = 141) private.

Baseline Rasch analysis

In the 9 baseline analysis steps across the 3 aggregation levels of the calibration sample, no fit to the Rasch model was achieved (Table I). In all analyses the p-values of the item-trait χ² were significant. Furthermore, in all analysis steps there were items that showed local dependencies among each other, DIF and threshold disordering. Information on threshold disordering and local dependency of the baseline analyses is shown in Appendix S1¹.

¹http://www.medicaljournals.se/jrm/content/?doi=10.2340/16501977-0000

Table I. Functional Independence Measure (FIM™) baseline analyses

| Sample | n/CI | Item-fit residuals, Mean (SD) | Person-fit residuals, Mean (SD) | χ² p-value | PSI | α | Paired t-test, % (Lower ci %) | DIF (items) |
|---|---|---|---|---|---|---|---|---|
| MSK_t1 | 246/4 | 0.193 (2.496) | –0.183 (1.304) | 0.000 | 0.961 | 0.967 | 9.8 (0.0) | age (M), language (A, B, D, F, L, R) |
| MSK_t2 | 230/4 | 0.098 (2.191) | –0.165 (1.359) | 0.000 | 0.966 | 0.968 | 17.4 (0.0) | language (B, D, F, L, N, P) |
| MSK_all | 476/8 | 0.193 (3.255) | –0.155 (1.280) | 0.000 | 0.963 | 0.967 | 16.2 (0.0) | gender (Q), age (L, N), language (B, C, D, F, H, L, M, N, Q, R), time-point (L, M, N, O) |
| NEUR_t1 | 228/4 | –0.046 (3.559) | –0.314 (1.745) | 0.000 | 0.964 | 0.972 | 17.1 (14.3) | language (Q) |
| NEUR_t2 | 242/4 | –0.461 (3.449) | –0.358 (1.595) | 0.000 | 0.964 | 0.973 | 15.3 (12.5) | No DIF |
| NEUR_all | 470/8 | –0.369 (4.919) | –0.349 (1.678) | 0.000 | 0.963 | 0.972 | 15.3 (13.3) | language (D, F, M, N, P, Q), time-point (L) |
| t1_all | 474/8 | 0.101 (4.274) | –0.239 (1.609) | 0.000 | 0.96 | 0.968 | 12.9 (10.9) | age (F, I, J, N, Q, R), language (B, D, F, L, N, Q), rehab-group (C, E, K, M, O, P, Q, R) |
| t2_all | 472/8 | –0.284 (3.957) | –0.293 (1.553) | 0.000 | 0.964 | 0.971 | 13.1 (11.2) | language (B, D, M, N, Q), rehab-group (C, E, K, L, O, P, Q) |
| FIM_all | 946/10 | –0.077 (5.779) | –0.265 (1.609) | 0.000 | 0.962 | 0.969 | 11.1 (9.7) | gender (L), age (N, O), language (B, D, F, H, L, M, N, Q, R), nationality (Q), insurance (O), time-point (L, M), rehab-group (C, E, K, L, M, O, P, Q, R) |
| Acceptable values | | SD < 1.4 | SD < 1.4 | > 0.01 | > 0.7 | > 0.7 | At least Lower ci < 5 | No DIF present |

MSK: musculoskeletal rehabilitation; NEUR: neurological rehabilitation; t1: admission; t2: discharge; all: combination of time-points and/or rehabilitation-groups; n: sample size; CI: class intervals; SD: standard deviation; PSI: Person separation index; α: Cronbach’s alpha; DIF: differential item functioning; ci: confidence interval.

www.medicaljournals.se/jrm
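As an illustration of how the "Acceptable values" row of Table I applies to each baseline analysis, the following Python sketch encodes the reported values and lists the criteria each analysis step fails. It is a reading aid under stated assumptions rather than part of the study's analysis: the class, field and function names are hypothetical, and only the numbers and thresholds are taken from Table I.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaselineRow:
    sample: str
    item_fit_sd: float       # SD of item-fit residuals
    person_fit_sd: float     # SD of person-fit residuals
    chi2_p: float            # p-value of the item-trait chi-square
    psi: float               # Person separation index
    alpha: float             # Cronbach's alpha
    t_test_lower_ci: float   # lower ci (%) of the paired t-test
    has_dif: bool            # whether any DIF is listed for the analysis


# Values as reported in Table I.
ROWS = [
    BaselineRow("MSK_t1", 2.496, 1.304, 0.000, 0.961, 0.967, 0.0, True),
    BaselineRow("MSK_t2", 2.191, 1.359, 0.000, 0.966, 0.968, 0.0, True),
    BaselineRow("MSK_all", 3.255, 1.280, 0.000, 0.963, 0.967, 0.0, True),
    BaselineRow("NEUR_t1", 3.559, 1.745, 0.000, 0.964, 0.972, 14.3, True),
    BaselineRow("NEUR_t2", 3.449, 1.595, 0.000, 0.964, 0.973, 12.5, False),
    BaselineRow("NEUR_all", 4.919, 1.678, 0.000, 0.963, 0.972, 13.3, True),
    BaselineRow("t1_all", 4.274, 1.609, 0.000, 0.96, 0.968, 10.9, True),
    BaselineRow("t2_all", 3.957, 1.553, 0.000, 0.964, 0.971, 11.2, True),
    BaselineRow("FIM_all", 5.779, 1.609, 0.000, 0.962, 0.969, 9.7, True),
]


def failed_criteria(row):
    """Return the names of the Table I criteria that the given analysis does not meet."""
    checks = {
        "item-fit SD < 1.4": row.item_fit_sd < 1.4,
        "person-fit SD < 1.4": row.person_fit_sd < 1.4,
        "chi-square p > 0.01": row.chi2_p > 0.01,
        "PSI > 0.7": row.psi > 0.7,
        "alpha > 0.7": row.alpha > 0.7,
        "t-test lower ci < 5": row.t_test_lower_ci < 5,
        "no DIF present": not row.has_dif,
    }
    return [name for name, passed in checks.items() if not passed]


if __name__ == "__main__":
    for row in ROWS:
        print(f"{row.sample}: fails {', '.join(failed_criteria(row))}")
```

Reliability (PSI and α) is above 0.7 in every analysis step, so the lack of fit shows up in the residual, χ², paired t-test and DIF criteria.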