Browsing by Author "Smith, Delaney"
- Automated real-time detection of lung sliding using artificial intelligence: a prospective diagnostic accuracy study (2024)
  Clausdorff Fiedler, Hans Jurgen; Prager, Ross; Smith, Delaney; Wu, Derek; Dave, Chintan; Tschirhart, Jared; Wu, Ben; VanBerlo, Blake; Malthaner, Richard; Arntfield, Robert
  Background: Rapid evaluation for pneumothorax is a common clinical priority. Although lung ultrasound (LUS) often is used to assess for pneumothorax, its diagnostic accuracy varies based on patient and provider factors. To enhance the performance of LUS for pulmonary pathologic features, artificial intelligence (AI)-assisted imaging has been adopted; however, the diagnostic accuracy of AI-assisted LUS (AI-LUS) deployed in real time to diagnose pneumothorax remains unknown. Research Question: In patients with suspected pneumothorax, what is the real-time diagnostic accuracy of AI-LUS to recognize the absence of lung sliding? Study Design and Methods: We performed a prospective AI-assisted diagnostic accuracy study of AI-LUS to recognize the absence of lung sliding in a convenience sample of patients with suspected pneumothorax. After calibrating the model parameters and imaging settings for bedside deployment, we prospectively evaluated its diagnostic accuracy for lung sliding compared with a reference standard of expert consensus. Results: Two hundred forty-one lung sliding evaluations were derived from 62 patients. AI-LUS showed a sensitivity of 0.921 (95% CI, 0.792-0.973), specificity of 0.802 (95% CI, 0.735-0.856), area under the receiver operating characteristic curve of 0.885 (95% CI, 0.828-0.956), and accuracy of 0.824 (95% CI, 0.766-0.870) for the diagnosis of absent lung sliding. Interpretation: In this study, real-time AI-LUS showed high sensitivity and moderate specificity to identify the absence of lung sliding. Further research to improve model performance and optimize the integration of AI-LUS into existing diagnostic pathways is warranted.
- Improving the Generalizability and Performance of an Ultrasound Deep Learning Model Using Limited Multicenter Data for Lung Sliding Artifact Identification (2024)
  Wu, Derek; Smith, Delaney; VanBerlo, Blake; Roshankar, Amir; Lee, Hoseok; Li, Brian; Ali, Faraz; Rahman, Marwan; Basmaji, John; Tschirhart, Jared; Ford, Alex; VanBerlo, Bennett; Durvasula, Ashritha; Vannelli, Claire; Dave, Chintan; Deglint, Jason; Ho, Jordan; Chaudhary, Rushil; Clausdorff Fiedler, Hans Jurgen; Prager, Ross; Millington, Scott; Shah, Samveg; Buchanan, Brian; Arntfield, Robert
  Deep learning (DL) models for medical image classification frequently struggle to generalize to data from outside institutions. Additional clinical data are also rarely collected to comprehensively assess and understand model performance amongst subgroups. Following the development of a single-center model to identify the lung sliding artifact on lung ultrasound (LUS), we pursued a validation strategy using external LUS data. As annotated LUS data are relatively scarce compared to other medical imaging data, we adopted a novel technique to optimize the use of limited external data to improve model generalizability. Externally acquired LUS data from three tertiary care centers, totaling 641 clips from 238 patients, were used to assess the baseline generalizability of our lung sliding model. We then employed our novel Threshold-Aware Accumulative Fine-Tuning (TAAFT) method to fine-tune the baseline model and determine the minimum amount of data required to achieve predefined performance goals. A subgroup analysis was also performed and Grad-CAM++ explanations were examined. The final model was fine-tuned on one-third of the external dataset to achieve 0.917 sensitivity, 0.817 specificity, and 0.920 area under the receiver operator characteristic curve (AUC) on the external validation dataset, exceeding our predefined performance goals. Subgroup analyses identified LUS characteristics that most greatly challenged the model’s performance. Grad-CAM++ saliency maps highlighted clinically relevant regions on M-mode images. We report a multicenter study that exploits limited available external data to improve the generalizability and performance of our lung sliding model while identifying poorly performing subgroups to inform future iterative improvements. This approach may contribute to efficiencies for DL researchers working with smaller quantities of external validation data.
- Interrater Agreement of Physicians Identifying Lung Sliding Artifact on B-Mode And M-Mode Point of Care Ultrasound (POCUS) (2025)
  Prager, Ross; Clausdorff Fiedler, Hans Jurgen; Smith, Delaney; Wu, Derek; Arntfield, Robert
  Background: Chest point of care ultrasound (POCUS) is a first-line diagnostic test to identify lung sliding, an important artifact to diagnose or rule out pneumothorax. Despite enthusiastic adoption of this modality, the interrater reliability for physicians to identify lung sliding is unknown. Additionally, the relative diagnostic performance of physicians interpreting B-mode and M-mode ultrasound is unclear. We sought to determine the interrater reliability of physicians to detect lung sliding on B-mode and M-mode POCUS. Methods: We performed a cross-sectional interrater agreement study surveying acute care physicians on their interpretation of 20 B-mode and M-mode POCUS clips. Two experienced clinicians determined the reference standard diagnosis. Respondents reported their interpretation of each POCUS B-mode clip or M-mode image. The primary outcome was the interrater agreement, determined by an intra-class correlation coefficient (ICC). Results: From September to November 2023, there were 20 survey respondents. Fourteen (70%) respondents were resident physicians. Respondents were confident or very confident in their skill performing chest POCUS in 14 (70%) cases, with 19 (90%) performing chest POCUS every week or more frequently. The ICC on B-mode was 0.44 and for M-mode was 0.43, indicating moderate agreement. There were no significant differences in interrater reliability between subgroups of confidence or experience. Conclusion: There is only moderate interrater reliability between clinicians to diagnose lung sliding. Clinicians have superior accuracy on B-mode compared to M-mode clips.