2022 Annual Meeting and Alumni Reunion
Category: Clinical Research
Candidate: Mohammad Eslami
Poster #: C4
Evaluation of Deep Learning Visual Field Prediction Models for Clinical Relevance
Mohammad Eslami, Miao Zhang, Julia Kim, Dolly Chang, Yangjiani Li, Saber Kazeminasab, Mojtaba Fazli, Vishal Sharma, Michael Boland, Nazlee Zebardast, Mengyu Wang, Tobias Elze
Purpose: Deep learning methods have recently been used to predict future visual fields (VFs) from baseline or longitudinal VFs. In clinical practice, progression of glaucomatous VF loss is a comparatively rare event. It is therefore of particular clinical relevance whether these prediction models can accurately identify patients with disease progression, to aid clinicians in avoiding vision loss. Here, we evaluate two previously described models for potential biases in over- or underestimating VF changes over time.
Methods: We consider two recent studies: Wen et al. (MWen), which predicts VF sensitivity, and Park et al. (MPark), which predicts total deviations. All reliable (false negatives/positives ≤ 30%, fixation losses ≤ 30%) Humphrey 24-2 VFs from the Mass. Eye and Ear glaucoma services from 1999 to 2020 were included. We re-implemented both methods and made them available to other investigators. As in the original studies, pointwise mean absolute error (PMAE) was used to measure model prediction accuracy. A 5-fold cross-validation scheme was used, and the models were additionally compared against a no-change model, in which the baseline VF (for MWen) or the last-observed VF (for MPark) was used as the predicted VF. The evaluation dataset included 54,373 samples from 7,472 people for MWen and 24,430 samples from 1,809 people for MPark, reflecting each method's input requirements.
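As an illustration of the accuracy metric and the no-change comparison described in Methods, the following is a minimal sketch, not the authors' implementation. The function names (`pmae`, `no_change_prediction`) and the four-point toy fields are hypothetical and chosen only for this example. It assumes that PMAE is the mean of the absolute pointwise differences between a predicted and an observed VF.

```python
from statistics import mean


def pmae(predicted, actual):
    """Pointwise mean absolute error between two VFs of equal length."""
    if len(predicted) != len(actual):
        raise ValueError("visual fields must have the same number of test points")
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def no_change_prediction(reference_vf):
    """No-change model: the predicted future VF is a copy of the reference VF
    (the baseline VF for MWen, the last-observed VF for MPark)."""
    return list(reference_vf)


# Made-up four-point fields, for illustration only (a 24-2 VF has many more points).
baseline = [30.0, 28.0, 25.0, 27.0]
follow_up = [29.0, 26.0, 20.0, 27.0]
model_pred = [30.0, 27.0, 24.0, 27.0]  # hypothetical model output

model_pmae = pmae(model_pred, follow_up)
no_change_pmae = pmae(no_change_prediction(baseline), follow_up)
print(model_pmae, no_change_pmae)
```

Comparing the two numbers shows how much a model improves on simply carrying the reference VF forward.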
Results: The PMAE results (95% CI; MWen: 2.21-2.24, MPark: 2.56-2.61) are close to those reported in the original papers. Scatterplots of predicted versus true values, using the mean of sensitivities for MWen and the mean of deviations for MPark, show that both approaches produce satisfactory outcomes close to the identity line y = x (predicted = truth). For further investigation, we examined scatterplots of prediction error versus actual change. In this analysis, both methods exhibited large errors when projecting worsening cases and were not close to the hypothetical unbiased model.
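The error-versus-change analysis described in Results can be sketched as follows. This is an illustrative reading rather than the authors' code. It assumes that error is the predicted mean minus the true mean, that actual change is the true mean minus the reference mean, and that an unbiased model has errors that do not depend on the actual change (slope 0). The helper names and the toy "half-change" predictor are hypothetical.

```python
from statistics import mean


def change_and_error(reference_vf, predicted_vf, actual_vf):
    """Return (actual change, prediction error) on the VF means."""
    actual_change = mean(actual_vf) - mean(reference_vf)  # negative = worsening
    error = mean(predicted_vf) - mean(actual_vf)  # positive = predicted too good
    return actual_change, error


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


# Made-up (reference, actual) pairs: stable, mild worsening, stronger worsening.
cases = [
    ([28.0, 26.0], [28.0, 26.0]),
    ([28.0, 26.0], [26.0, 24.0]),
    ([28.0, 26.0], [24.0, 22.0]),
]

changes, errors = [], []
for reference, actual in cases:
    # Toy predictor that captures only half of the true change.
    predicted = [r + 0.5 * (a - r) for r, a in zip(reference, actual)]
    change, error = change_and_error(reference, predicted, actual)
    changes.append(change)
    errors.append(error)

slope = ols_slope(changes, errors)
print(changes, errors, slope)
```

Under these assumptions, a slope near 0 indicates no dependence of error on change, while a slope near -1 matches the no-change model. Positive errors on worsening eyes mean the predicted field is better than the observed one, that is, the worsening is underpredicted.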
Conclusions: Our evaluation of the two VF prediction models confirms the low PMAEs reported in the original studies. However, both models underpredicted the worsening of VF loss over time. As detecting the progression of VF loss is a major motivation for obtaining clinical VFs, we suggest that future model evaluations explicitly consider this aspect, as well as the characteristics of the data.