Journal of Rehabilitation Medicine 51-9 | Page 30

654 E. D. Hernández et al.

Values within –0.1 and 0.1 were considered negligibly small with reference to clinical relevance, while values outside this range were considered clinically relevant disagreements (33). The RV indicates disagreement caused by individual variability and varies between 0 and 1; a value < 0.1 means that the difference is negligible. Statistically significant disagreement of RP, RC and RV was indicated by a 95% confidence interval (95% CI) that did not include zero. Scatterplots and relative operating characteristic (ROC) curves were used to visually analyse the systematic disagreements. The degree of agreement was determined using the percentage of agreement (PA). Agreement ≥ 70% was considered satisfactory. For the summed scores (subscale and total scores), the minimum disagreement in points needed to reach at least 70% PA was also calculated.

RESULTS

In total, 105 patients were screened, of whom 60 (48% women, mean age 65.9 years) met the inclusion criteria and were assessed with the FMA-UE (Table I). The main reason for exclusion was severe cognitive impairment that hindered cooperation during the assessment (n = 21) (Fig. 1). Among the included patients, 93% had ischaemic stroke and 7% haemorrhagic stroke. The FMA-UE scores of the entire group ranged from 4 to 66 points. Of the 60 patients, 25% scored ≤ 48 and 25% scored ≥ 65. No floor effect was observed, since all patients received some points on the first occasion. However, 13 patients (21.7%) received a full score of 66 points on the first occasion, which indicates a ceiling effect.

Table I. Demographic and clinical characteristics (n = 60)

Characteristics
Age, years, mean (SD)                       65.9 (17.3)
Sex, male/female, %                         52/48
Ischaemic/haemorrhagic stroke, %            93/7
Right/left hemiparesis, %                   55/45
Thrombolysis, n                             8
Hospitalization, days, mean (SD)            12 (10)
Modified Rankin Scale, median (Q1–Q3)       2 (1–4)
  0 Without symptoms                        3
  1 Without significant disability          22
  2 Mild disability                         10
  3 Moderate disability                     5
  4 Moderately severe disability            16
  5 Severe disability                       4
NIHSS Scale, median (Q1–Q3)                 5 (3–10)
  Mild 0–5                                  25
  Moderate 6–14                             20
  Severe 15–24                              2
  Very severe ≥ 25                          0
  Patients without NIHSS scorings           13
Discharged from hospital, n
  Home                                      56
  Homecare                                  1
  Intermediate care                         1
  Died in hospital                          2
Fugl-Meyer Assessment of Upper Extremity
  FMA-UE, 1st occasion, median (Q1–Q3)      58 (48–65)
  FMA-UE, 2nd occasion, median (Q1–Q3)      59.5 (45–66)

FMA-UE: Fugl-Meyer Assessment Upper Extremity; SD: standard deviation; NIHSS: National Institutes of Health Stroke Scale.

Intra-rater reliability

At the item level, statistically significant systematic disagreement of relative position (RP) was noted for shoulder flexion 0–90° (A.III.) and normal reflex activity (A.V., Table II). All these disagreements were positive, which indicates that a higher category was systematically more frequently used for these items on the second occasion. A negative RC value was noted for one of the raters for elbow extension and forearm pronation within extensor synergy, which means that a more central scoring was more often used on the first occasion compared with the second within the same rater. This disagreement showed the same tendency as seen in the RP values, indicating that a higher score was more frequently used on the second test occasion compared with the first for these items. A shift towards a higher score was also seen in the total score A–D. Individual disagreements, measured as RV, were all close to zero across all raters. Scatterplots showing paired intra-rater and inter-rater assessments of the total score A–D along with ROC are presented in Fig. 2. A curved ROC indicates disagreement in position and an S-shaped curve indicates that the raters concentrate their assessments differently on the scale categories. Exact RP and RC values along with 95% CI are displayed in Tables SI–II¹.

The PA between test occasions 1 and 2 within each rater was above 79% for all tested items (Tables II and III). For the reflex activity (A.I.), full agreement was reached. Full agreement in at least one rater was also noted for the following items: hand to lumbar spine, mass flexion and extension of the hand, cylinder and spherical grasp. The PA was, as expected, lower for subscale A (48–59%), subscales B, C and D (63–89%), and the total score A–D (33–46%) than for single items, since the sum-scores include a larger number of categories. A 70% PA was reached for subscales B, C and D when a 1-point difference between test occasions was accepted. Differences of 2 and 3 points were needed to reach 70% PA in all 3 raters for subscale A and the total score A–D, respectively.

Inter-rater reliability

A statistically significant systematic disagreement in RC was noted for the forearm pronation (A.II.), which means that the rater acting as leader was systematically using a more central score compared with the rater who acted as observer (Table II). All other observed systematic disagreements were negligible or not statistically significant. Individual disagreements,

¹ http://www.medicaljournals.se/jrm/content/?doi=10.2340/16501977-2590
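The percentage of agreement (PA) criterion used on this page, with PA ≥ 70% considered satisfactory and a minimum point difference calculated for the summed scores, can be illustrated with a minimal Python sketch. The function names and the ten score pairs below are hypothetical and are not study data; this is an illustration of the arithmetic under those assumptions, not the authors' analysis code, and it does not cover the RP, RC or RV measures.

```python
from typing import Sequence


def percentage_agreement(first: Sequence[int], second: Sequence[int],
                         tolerance: int = 0) -> float:
    """Percentage of paired scores that differ by at most `tolerance` points."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    agreeing = sum(1 for a, b in zip(first, second) if abs(a - b) <= tolerance)
    return 100.0 * agreeing / len(first)


def min_tolerance_for_pa(first: Sequence[int], second: Sequence[int],
                         target: float = 70.0) -> int:
    """Smallest accepted point difference at which PA reaches `target` percent."""
    if target > 100.0:
        raise ValueError("target cannot exceed 100 percent")
    largest_diff = max(abs(a - b) for a, b in zip(first, second))
    # At tolerance == largest_diff every pair agrees, so the loop always returns.
    for tolerance in range(largest_diff + 1):
        if percentage_agreement(first, second, tolerance) >= target:
            return tolerance
    return largest_diff


# Hypothetical total scores (0-66 points) for ten patients on two occasions.
occasion_1 = [66, 58, 48, 60, 35, 64, 52, 20, 66, 57]
occasion_2 = [66, 60, 47, 60, 38, 66, 52, 24, 66, 58]

print(percentage_agreement(occasion_1, occasion_2))     # exact agreement
print(percentage_agreement(occasion_1, occasion_2, 1))  # within 1 point
print(min_tolerance_for_pa(occasion_1, occasion_2))     # points needed for >= 70% PA
```

In this made-up example, exact agreement is 40% and a 2-point difference must be accepted before PA reaches 70%. This mirrors how summed scores, which have many more categories than single items, can need a tolerance of a few points to reach the satisfactory level.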