T. Winairuk et al.
orientation, and gait stability. The BESTest has been
validated to assess postural control impairments in va-
rious populations (8–11). In individuals with subacute
stroke, the BESTest showed excellent intra-rater and
inter-rater reliability as well as significant correlation
with the BBS, Postural Assessment Scale for Stroke
(PASS), and Community Balance and Mobility scale
(CB&M), suggesting concurrent validity (12). With
high sensitivity (80.8%), specificity (87.5%) and
post-test accuracy (84%), the BESTest accurately
identified patients with stroke whose balance had
improved, using a 10% increase in score as the
indicator (13). The BESTest has an advantage over
other balance assessment tools in patients with
stroke in that it has no floor or ceiling effects,
but its long administration time (35 min) can limit its
practicality in the clinic.
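For reference, sensitivity, specificity and accuracy are conventionally defined from a 2×2 table of true positives (TP), false negatives (FN), true negatives (TN) and false positives (FP). The definitions below are the standard ones and are given here as an assumption; the exact definition of post-test accuracy used in (13) is not stated in this text.

```latex
\[
\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad
\text{Specificity} = \frac{TN}{TN + FP}, \qquad
\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}
\]
```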
Two short versions of the BESTest, which can reduce
the assessment time, are currently available. The
Brief-BESTest, which contains only 6 items, one for each
domain of the BESTest, was validated in patients with
chronic stroke, but has not been validated in patients
with subacute stroke (14, 15). The Mini-BESTest is
another short version, which omits the first and
second domains of the BESTest to evaluate the dynamic
component of postural control (16). The Mini-BESTest
showed excellent internal consistency for community-
dwelling patients with chronic stroke as well as excel-
lent reliability and concurrent validity in patients with
subacute stroke (12, 17). However, the Mini-BESTest
had a floor effect in patients with subacute stroke who
had low functional ability from day 27 through day
94 (12), limiting its use in this group of patients.
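A floor or ceiling effect is commonly quantified as the percentage of patients who obtain the lowest or the highest possible score on a scale. The minimal sketch below assumes that convention. The function name, the 0 to 28 score range, and the example scores are hypothetical and are not data from the cited studies.

```python
from typing import Sequence, Tuple


def floor_ceiling_percent(
    scores: Sequence[int], min_score: int, max_score: int
) -> Tuple[float, float]:
    """Return (floor %, ceiling %): the share of patients at the
    lowest and at the highest possible score, respectively."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_pct = 100.0 * sum(s == min_score for s in scores) / n
    ceiling_pct = 100.0 * sum(s == max_score for s in scores) / n
    return floor_pct, ceiling_pct


# Hypothetical total scores on a scale assumed to range from 0 to 28.
example_scores = [0, 0, 5, 10, 28]
floor_pct, ceiling_pct = floor_ceiling_percent(
    example_scores, min_score=0, max_score=28
)
```

Any threshold for deciding that such a percentage constitutes a meaningful floor or ceiling effect would have to be taken from the study's own analysis plan; none is assumed here.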
The S-BESTest is our newly developed short version
of the BESTest for patients with subacute stroke,
which aims to reduce the assessment time and floor effect
while retaining all domains of the BESTest. Developed
using a Rasch partial credit model to reduce the number of
items (18), the S-BESTest contains 13 items (total 39
points) and uses a scoring system similar to that of the
original BESTest (see Table I for a comparison of the original
BESTest and the 3 shortened versions). The construct validity
of the S-BESTest was confirmed by known-groups
hypothesis testing (19), but other
psychometric properties of the S-BESTest, such as
reliability, validity, floor and ceiling effects, and
responsiveness, have not been assessed. Therefore, it is
unclear which short version of the BESTest is most
appropriate, in terms of having the highest responsiveness
and the lowest floor/ceiling effects, for assessing patients
with subacute stroke. This study, therefore, aimed
to compare the reliability, validity, floor and ceiling
effect and responsiveness of 3 shortened versions
of the BESTest (the S-BESTest, the Brief-BESTest, and the
Mini-BESTest) and the original BESTest in patients
with subacute stroke. To reduce learning effects and
assessor recall bias from repeated scoring, only the
original BESTest was administered to each patient, and
the scores for the 3 shortened versions of the BESTest were
derived from the BESTest scores.
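Deriving the shortened-version scores from a single BESTest administration can be sketched as follows. This is only an illustration under stated assumptions: the item identifiers and the two item subsets are hypothetical placeholders, not the published item lists of the S-BESTest, Brief-BESTest or Mini-BESTest, and any rescoring rule specific to a shortened version is not modelled.

```python
from typing import Dict, List

# HYPOTHETICAL item subsets: placeholders only, not the published item lists.
SHORT_FORM_ITEMS: Dict[str, List[str]] = {
    "short_form_a": ["item_01", "item_04", "item_09"],
    "short_form_b": ["item_02", "item_04"],
}


def derive_short_form_scores(full_scores: Dict[str, int]) -> Dict[str, int]:
    """Sum the full-test item scores that belong to each short form."""
    return {
        form: sum(full_scores[item] for item in items)
        for form, items in SHORT_FORM_ITEMS.items()
    }


# Hypothetical item scores from one administration of the full test.
example_patient = {"item_01": 3, "item_02": 1, "item_04": 2, "item_09": 0}
derived = derive_short_form_scores(example_patient)
```

Because every shortened score is computed from the same recorded performance, the patient is tested only once and the assessor scores each item only once.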
METHODS
Participants
Twelve patients with subacute stroke were recruited from
departments of physical therapy at Lerdsin Hospital in Bangkok,
Thailand, to assess the reliability of the scales. The sample
size was calculated using a power of 0.80 and an alpha
level of 0.05. A null intraclass correlation coefficient (ICC) of
0.60 and an expected correlation coefficient of 0.93 were taken
from previous studies (20, 21). The inclusion criteria were:
diagnosis of a first unilateral hemispheric stroke, onset within
4 months, stable vital signs, and ability to follow instructions.
Participants were excluded if they had any neurological disorder
other than stroke, unstable epilepsy, lesion at the brainstem
involving sleep-wake and respiratory control centres or
cerebellum, cerebral aneurysm, visual problems that were not
corrected with glasses, or cognitive impairment as measured by
the Mini-Mental State Examination (MMSE score ≤23) (22, 23).
Another 70 patients were recruited at the same hospital for
the assessment of validity and responsiveness using similar
inclusion and exclusion criteria. Sample size calculation was
based on a power of 0.80, alpha level of 0.05, correlation coef-
ficient (r) of 0.78 and an expected correlation coefficient of 0.8
(21). Since the level of functional ability influences the
recovery process, the lower extremity motor function domain
of the Fugl-Meyer Assessment (FM-LE) was used to classify the
subjects into 2 functional level groups (35 patients in each
group) for recruitment purposes, to ensure that the sample
adequately represented both low and high levels of functional
ability. An FM-LE score of 0–14 was classified as low functional
ability and a score higher than 14 was classified as high
functional ability (24).
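The grouping rule above can be expressed as a short function. This is a minimal sketch: the function name is hypothetical, and the upper bound of 34 points for the FM-LE motor score is an assumption that is not stated in this text.

```python
# Assumed maximum FM-LE motor score; not stated in the text above.
FM_LE_MAX = 34


def classify_functional_level(fm_le_score: int) -> str:
    """Return 'low' for an FM-LE score of 0-14 and 'high' for a score above 14."""
    if not 0 <= fm_le_score <= FM_LE_MAX:
        raise ValueError(f"FM-LE score must be between 0 and {FM_LE_MAX}")
    return "low" if fm_le_score <= 14 else "high"


# Hypothetical scores on either side of the cut-off.
groups = [classify_functional_level(score) for score in (0, 14, 15, 34)]
```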
The Institutional Review Board of Lerdsin Hospital (number
0306/13/127) approved the study protocol and all patients gave
written consent prior to participation.
Data collection
Prior to the tests, all raters were first trained to score healthy
subjects, and then patients with stroke. The reliability was
assessed through videotape rating to ensure consistency of
performance and reduce the error from movement variability.
The validity of videotape scoring was first established by one
physical therapist who had 10 years of experience in stroke
rehabilitation. This rater scored each patient’s performance both
at the time of the test and 7 days later from videotape to
confirm that the results from concurrent scoring and videotape
scoring were not different. Intra-rater and inter-rater reliability
were later assessed by 5 physical therapists: 3 from Lerdsin
Hospital, with stroke rehabilitation experience of 1, 5, and
10 years, respectively, and 2 PhD physical therapy students.
The 5 raters scored each patient’s performance from videotape on
2 separate occasions within 7 days. The raters did not discuss
scoring among themselves and scored the patients’ performance
on separate scoring worksheets on each occasion. Intra-rater
reliability of total scores and domain scores was determined