Examination of the LHBT in the clinical setting

Table II. Possible combinations of index test/reference standard/target condition for meta-analyses

Index test | Reference standard | LHBT pathology identified
HRUS | Surgery (open or arthroscopy); another diagnostic imaging modality; any reference standard | One of: tendinopathy; dislocation; rupture – partial; rupture – total; effusion (bicipital recess)
OSTs (for each) | Surgery (open or arthroscopy); HRUS; MR imaging/arthrography; any reference standard | One pathology which each OST is designed to detect (see Appendix I): SLAP lesion; tendinopathy; proximal LHBT pathology other than SLAP (dislocation, rupture, tendinopathy)

HRUS: high-resolution ultrasound; OSTs: orthopaedic special tests; MR: magnetic resonance; SLAP: superior labrum anterior and posterior; LHBT: long head of the biceps tendon.
participants they recruit and the test that they evaluate (19). In
that respect, data were combined where studies measured the
accuracy of the same index test for the diagnosis of the same
LHBT pathology: (i) according to the same reference standard;
and (ii) according to all reference standards. Meta-analysis
tools were used when a minimum of 4 primary studies were
identified (Table II) (20). Where a limited number of studies
prevented the use of meta-analysis tools, only sensitivity (Sn)
and specificity (Sp) estimates are presented from each study,
together with forest plots.
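
The pooling rule described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the record fields (`index_test`, `pathology`, `reference`), the function name `poolable_groups` and the example records are hypothetical. Only the threshold of 4 primary studies and the two grouping schemes are taken from the text.

```python
from collections import defaultdict

MIN_STUDIES = 4  # minimum number of primary studies stated in the text


def poolable_groups(studies, by_reference=True):
    """Group studies and flag which groups reach the pooling threshold.

    by_reference=True  -> scheme (i): same index test, pathology and reference standard
    by_reference=False -> scheme (ii): same index test and pathology, all reference standards
    """
    groups = defaultdict(list)
    for study in studies:
        key = (study["index_test"], study["pathology"])
        if by_reference:
            key += (study["reference"],)
        groups[key].append(study["id"])
    # True: meta-analysis tools apply; False: report per-study Sn and Sp only
    return {key: len(ids) >= MIN_STUDIES for key, ids in groups.items()}


# Hypothetical records, for illustration only (not data from the review)
studies = [
    {"id": 1, "index_test": "HRUS", "pathology": "total rupture", "reference": "surgery"},
    {"id": 2, "index_test": "HRUS", "pathology": "total rupture", "reference": "surgery"},
    {"id": 3, "index_test": "HRUS", "pathology": "total rupture", "reference": "surgery"},
    {"id": 4, "index_test": "HRUS", "pathology": "total rupture", "reference": "MR arthrography"},
]

same_reference = poolable_groups(studies, by_reference=True)
any_reference = poolable_groups(studies, by_reference=False)
```

In this made-up example, no single reference standard reaches 4 studies, so scheme (i) yields no pooled estimate, whereas scheme (ii) combines all 4 studies.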
Meta-analyses were conducted using the approach developed
by Rutter & Gatsonis with version 3.3.3 of the R statistical software
(http://www.r-project.org/) (21). The HSROC package was used
to calculate overall pooled estimates of the included diagnostic
studies taking into account the between-study and within-study
variability. This routine, based on Bayesian statistics, estimates
the overall sensitivity (Sn) and specificity (Sp) for a group of
studies and produces a receiver operating characteristic (ROC)
curve with a credible interval and a 95% prediction region. The
classical confidence interval (CI) presumes that differences
in Sn and Sp between studies are caused only by a statistical
instability related to sampling or measurement errors. Under this
assumption, all estimates would cluster around a single value of Sn
and a single value of Sp. In reality, for the same technique, Sn and
Sp may vary over time, with different populations, with different
operators, or with any other relevant condition that changes the
nature of the test. Across different conditions, Sn and Sp could
fluctuate within a range of values that reflects a change in reality
rather than statistical instability. The credible intervals delimit
how Sn and Sp could fluctuate for reasons other than sampling or
measurement errors. In this context, the CI adds to the credible
interval the uncertainty caused by sampling and measurement
errors. The credible intervals are therefore narrower than the CI. The
prediction region is defined by pairing the CI with the credible
interval. Heterogeneity was explored graphically using forest
plots. Positive (LR+) and negative (LR–) likelihood ratios were
calculated from the overall Sn and Sp. However, confidence and
credible intervals could not be calculated for likelihood ratios.
Studies with a zero cell in the 2 × 2 table lead to instability in the
statistical model. In such cases, a continuity correction, consisting
of a small positive number (0.5, as suggested in the literature),
was added to the observed frequency (20).
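
As an illustration of the two calculations just described, the sketch below derives Sn and Sp from a single 2 × 2 table with a continuity correction, then computes the likelihood ratios with the standard formulas LR+ = Sn / (1 – Sp) and LR– = (1 – Sn) / Sp. It is not the authors' R code. The function names and the example counts are hypothetical, and adding 0.5 to all 4 cells whenever any cell is zero is an assumption, since the text does not state which cells were corrected. In the review, the likelihood ratios were calculated from the overall pooled Sn and Sp, not from a single study as here.

```python
def sn_sp(tp, fp, fn, tn, correction=0.5):
    """Sensitivity and specificity from the cells of a 2 x 2 table.

    Assumption: if any cell is zero, `correction` is added to all four cells.
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (cell + correction for cell in (tp, fp, fn, tn))
    sn = tp / (tp + fn)  # true positives among those with the condition
    sp = tn / (tn + fp)  # true negatives among those without the condition
    return sn, sp


def likelihood_ratios(sn, sp):
    """Positive and negative likelihood ratios from Sn and Sp."""
    lr_pos = sn / (1 - sp)
    lr_neg = (1 - sn) / sp
    return lr_pos, lr_neg


# Hypothetical table with no false positives (illustration only)
sn, sp = sn_sp(tp=18, fp=0, fn=2, tn=30)
lr_pos, lr_neg = likelihood_ratios(sn, sp)
```

Without the correction, this table would give Sp = 1 and the LR+ would require a division by zero, which is the kind of instability the correction avoids.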
For SLAP lesions, because the degenerative fraying of the
SLAP I lesion is often considered an asymptomatic normal variant,
studies of type II–IV lesions and studies of type I–IV lesions were
analysed separately (22). The type II–IV group comprised studies either
designed to assess the diagnosis of SLAP II–IV lesions or in which only
SLAP II–IV lesions were ascertained by the reference standard.
RESULTS
Search results
Searches resulted in 777 citations (duplicates removed).
Twenty-eight articles were accepted for the
review after full-text screening. Fourteen articles were
obtained by scrutiny of the reference lists of reviews
and primary studies. Of the 42 eligible studies, 30 were
included in the analysis of the review (8 for HRUS, 22
for OSTs; Fig. 1, Table III).
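
The counts reported in this paragraph and in Fig. 1 can be reconciled with a few lines of arithmetic. The sketch below is only a consistency check. The variable names are hypothetical and the numbers are those given in the text and the flow diagram.

```python
# Counts reported in the text and in Fig. 1
screened = 777
excluded_on_title_abstract = 676
excluded_at_full_text = 73
from_reference_lists = 14
excluded_before_analysis = 12
included_by_index_test = {"HRUS": 8, "OSTs": 22}

full_text_retrieved = screened - excluded_on_title_abstract   # 101
accepted = full_text_retrieved - excluded_at_full_text        # 28
eligible = accepted + from_reference_lists                    # 42
included = eligible - excluded_before_analysis                # 30

# The included total matches the split by index test
counts_agree = included == sum(included_by_index_test.values())
```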
Methodological quality of included studies
For the risk of bias assessment, inter-rater agreement
was excellent (Gwet’s AC1 of 0.85). The overall
Records screened on basis of title/abstracts (duplicates removed): n=777
Records excluded: n=676
Full-text of potentially relevant studies retrieved: n=101
Excluded studies and reasons (total 73):
– Studies did not address LHBT (25)
– Reviews (34)
– Highly selected population (1)
– Studies with lacking or incomplete data (4)
– Not English or French (2)
– No index test (1)
– Index test is a cluster (1)
– Reference standard inadequate (1)
– Target condition inadequate (4)
Full-text eligible articles: n=28
Records identified from reference lists of articles: n=14
Potentially appropriate studies to be included in analysis of review: n=42
Excluded studies and reasons (total 12):
– Studies had highly selected population (2)
– Studies had discrepancy in 2 × 2 tables or between text and tables (2)
– Studies had 100% prevalence (1)
– Study lacking data to draw 2 × 2 tables (6)
– Study was retrospective (1)
Studies included in analysis of review: n=30
– By source: MEDLINE (15), CINAHL (4), EMBASE (4), References (7)
– By index test: HRUS (8), OSTs (22)
Fig. 1. Flow diagram of the bibliographic search. HRUS: high-resolution ultrasound;
OSTs: orthopaedic special tests.
J Rehabil Med 51, 2019