N. M. Otterman et al.
scenarios was enhanced by studying real-life case examples, by job shadowing the first author in different work settings across the continuum of stroke care. The draft vignettes were reviewed by a development panel, consisting of 4 PTs who had participated in the development of the CPG Stroke. Consensus on relevance, clarity and content of the “cases” was reached in 2 e-mail rounds and 2 in-person consensus rounds.

Development of a web-based test. The final version of the case scenarios and items for the SCT was agreed upon by the authors and the development panel, and this version was programmed in a web-based test.

Development of a scoring algorithm. To develop the scoring algorithm, a reference panel consisting of 15 members was invited to complete the test. Members of the project group could nominate PTs from their network for the reference panel. Individuals were selected if they met all of the following 4 criteria: (i) registered in the Central Quality Register for Physical Therapy; (ii) high level of guideline knowledge and use of the CPG Stroke in clinical practice, based on their participation in the development of the CPG Stroke and/or teaching a course in neurorehabilitation for PTs in which the CPG Stroke was used; (iii) consensus of at least 3 members of the project group and 3 members of the development panel about their level of expertise; and (iv) providing informed consent for participation in the study. Scores for each question were computed from the answers chosen by the reference panel, as proposed in the AMEE guideline (12). Credit for each answer was transformed proportionally to obtain a maximum score of 1 credit for the modal answer from the reference panel for each item, a score of 0 credits for an answer that was not selected by any member of the reference panel and partial credit for an answer other than the modal answer. A web-based calculator developed by the University of Montreal was used to analyse the reference panel’s responses and construct the scoring algorithm (16).

Optimization of the test items. In establishing the final version of an SCT, different quality assessment strategies for item optimization have been described. A stepwise quality assessment was performed based on the AMEE guideline (12) on SCT development.

First, the variability of the reference panel responses was assessed. The variability among the members of the reference panel has been shown to be a key determinant of the discriminatory power of an SCT (12). Ideally, SCT questions produce a range of expert responses clustered around a modal answer. Questions with unanimity or with a broad distribution of responses are considered to have low quality. The quality of answers was rated using criteria based on the AMEE guideline (12), as follows: (i) high quality: a range around a modal answer (maximum 3 different response categories); (ii) good quality: a maximum of 4 different response categories with only one expert choosing the extreme answer; (iii) doubtful quality: the 2 highest scoring answers are more than one response category apart; and (iv) poor quality: broad distribution. The doubtful and poor items were considered to be performing poorly in the SCT.

Secondly, the item-total correlation coefficient of the subject responses was assessed, which provided an estimate of each item’s discriminative capacity. An item with a negative or low item-total correlation contributes minimally or not at all to the reliability of the test, although such a correlation can also reflect the heterogeneity of clinical competence or the nature of the domain tested. Therefore, it should be carefully considered whether items with an item-total correlation that is negative or below 0.05 should be discarded.

Thirdly, the content validity as perceived by the reference panel was assessed, rated on a 5-point Likert scale ranging from (fully) disagree (0) to (fully) agree (5). For each item, the percentage of the members of the reference panel who judged that this item was an adequate reflection of guideline-consistent clinical reasoning, and was relevant for daily practice for patients with stroke, was calculated. An arbitrary cut-off point of 65% agreement or full agreement on this item was used. Percentage scores below this cut-off point were considered to indicate low content validity.

The items that performed poorly on 2 of the 3 levels of quality assessed were presented to the development panel. If 75% of the panel recommended removal of this item, it was discarded.

The total score of the optimized SCT was the sum of the credits of the remaining items, expressed as a percentage of the maximum score.

Recruitment and study sample

An undirected recruitment campaign, using an e-mail sent in February 2015 to 1,704 potential participants, was performed. Post-graduate PTs (n = 728) were approached via a Dutch national education institute for allied health professionals (Nederlands Paramedisch Instituut). This was a sample of PTs with a variety of fields of interest, as recorded by the institute, such as sports, musculoskeletal, neurology, cardiology and oncology. Physical therapy students (n = 976) at 7 universities of applied sciences with a physical therapy programme were also approached. All received a reminder e-mail in March 2015.

After a positive response to the recruitment mail, a participant received a log-in code for the web-based SCT. After completing the SCT, participants were assigned to 1 of 4 groups based on specialization. Since there is no formal registry of PTs specializing in neurology in the Netherlands held by an institution or society, the authors defined a classification based on therapist characteristics. The first group consisted of PTs specializing in neurology who met the specialization criteria stated in the CPG Stroke (i.e. treatment volume of at least 5 unique stroke patients a year, completion of the postgraduate course on stroke rehabilitation, participation in professional development activities in the field of stroke, and self-report of neurology being their main specialization). The second group consisted of PTs with a self-reported focus on geriatrics or neurology who did not meet all criteria for the group specializing in neurology. The third group consisted of PTs with other specializations (e.g. mus-

[Figure 1: flowchart]
A. Development of the content of the SCT:
• setting up a test blueprint
• construction of draft vignettes based on real-life cases seen during job shadowing of PTs in stroke rehabilitation
• review of the draft vignettes by a development panel into definitive vignettes
B. Development of a web-based test
C. Development of a scoring algorithm
D. Optimization of the test items
Fig. 1. Development of the script concordance test (SCT) in 4 phases. Flowchart showing the phases in the development of the SCT. PTs: physical therapists.
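The aggregate scoring described under "Development of a scoring algorithm", together with the percentage total score of the optimized SCT, can be illustrated with a short sketch. This is a minimal illustration in Python and not the University of Montreal calculator (16). The function names `scoring_key` and `total_score_percent`, the 5-category answer scale (-2 to +2) and all panel answers are hypothetical and chosen only for the example.

```python
from collections import Counter


def scoring_key(panel_answers, categories):
    """Return the credit for each response category of one item.

    The modal panel answer earns 1 credit, an answer chosen by no panel
    member earns 0 credits, and any other answer earns partial credit:
    (panel members choosing it) / (panel members choosing the modal answer).
    """
    if not panel_answers:
        raise ValueError("at least one panel answer is required")
    counts = Counter(panel_answers)
    modal_count = max(counts.values())
    # Counter returns 0 for categories that no panel member selected.
    return {category: counts[category] / modal_count for category in categories}


def total_score_percent(responses, keys):
    """Sum the credits over all items as a percentage of the maximum score.

    Every item is worth at most 1 credit, so the maximum equals len(keys).
    """
    earned = sum(key[answer] for answer, key in zip(responses, keys))
    return 100.0 * earned / len(keys)


# Hypothetical answer scale and hypothetical answers of a 15-member panel.
CATEGORIES = [-2, -1, 0, 1, 2]
panel_item_1 = [1] * 8 + [0] * 4 + [2] * 3
panel_item_2 = [-1] * 10 + [-2] * 5

key_1 = scoring_key(panel_item_1, CATEGORIES)
key_2 = scoring_key(panel_item_2, CATEGORIES)

# A hypothetical participant answers 0 on item 1 and -1 on item 2.
participant_score = total_score_percent([0, -1], [key_1, key_2])
print(key_1)              # {-2: 0.0, -1: 0.0, 0: 0.5, 1: 1.0, 2: 0.375}
print(participant_score)  # 75.0
```

In this example, 8 of the 15 panel members chose +1 on item 1, so +1 earns 1 credit, while 0 (4 members) earns 4/8 = 0.5 and +2 (3 members) earns 3/8 = 0.375. The participant earns 0.5 + 1 = 1.5 of 2 possible credits, i.e. 75%.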
www.medicaljournals.se/jrm