ASOHNS ASM 2025

Development of a convolutional neural network to classify laryngomalacia from endoscopic images in infants

Verbal Presentation

4:02 pm

28 March 2025

Meeting Room C3.4

CONCURRENT SESSION 2F: FREE PAPERS

Presentation Description

Institution: Children's Hospital Westmead - NSW, Australia

Background: Laryngomalacia is the most common cause of stridor in infants. Existing classification systems are based on assessment of the laryngeal subsites responsible for supraglottic collapse, but they suffer from variable inter-rater reliability and correlate poorly with clinical severity. Artificial intelligence has outperformed human assessors in many diagnostic settings. In this pilot study, we developed a convolutional neural network (CNN) to classify laryngomalacia.

Methods: One hundred de-identified recordings of laryngoscopies in infants were retrieved and mirrored to create 100 duplicates. Two paediatric otolaryngologists graded overall severity, epiglottis shape, position and score, arytenoid score, aryepiglottic folds, supra-arytenoid mucosa and tongue base. Intra- and inter-rater agreement were assessed using Cohen’s kappa (κ). Successive frames were extracted as 100×100-pixel greyscale images. A CNN with four 2D convolutional layers and rectified linear unit (ReLU) activation was built in TensorFlow 2.0. 80% of recordings were used for training and 20% for validation.

Results: Intra-rater agreement on overall severity was substantial (κ=0.78). Intra-rater agreement was at least substantial at all anatomical sites (κ: 0.72-0.90) except the aryepiglottic folds (κ=0.05). Inter-rater agreement on severity was fair (κ=0.35). Training data for tongue base (κ<0) and epiglottis position (κ=0.20) were excluded due to poor agreement. Agreement at the remaining sites was fair to substantial (κ: 0.33-0.68). After training, the model correctly assessed laryngomalacia severity in 83.2% of test cases. Model accuracy for anatomical subsites ranged from 67.6% to 84.9%.

Conclusion: This pilot study demonstrated a CNN that can accurately assess laryngomalacia severity and grade some anatomical subsites, with limited training. A larger dataset and higher-fidelity images might improve performance. Future iterations could combine endoscopic and clinical data to predict clinical severity and likelihood of surgery without reference to classification systems designed for human reviewers.
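
The agreement statistic used above can be illustrated with a minimal sketch of Cohen's kappa for two sets of categorical grades. This is not the authors' code. The function names and the example grades are hypothetical, and the descriptive labels follow the Landis and Koch bands, which is an assumption: the abstract uses matching terms ("fair", "substantial") but does not name the scale it applied.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical grades."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must grade the same non-empty set of cases")
    n = len(ratings_a)
    # Observed agreement: proportion of cases given the same grade by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: sum over grades of the product of each rater's marginal rate.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[g] * counts_b[g] for g in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical grade throughout; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa):
    """Descriptive label using the Landis and Koch bands (assumed scale)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    # Hypothetical severity grades from two raters for six recordings.
    rater_1 = ["mild", "mild", "moderate", "severe", "moderate", "mild"]
    rater_2 = ["mild", "moderate", "moderate", "severe", "mild", "mild"]
    k = cohens_kappa(rater_1, rater_2)
    print(f"kappa = {k:.2f} ({agreement_label(k)})")
```

Because kappa subtracts the agreement expected by chance, it can fall below zero when two raters agree less often than chance would predict, as reported for tongue base. The same function covers intra-rater agreement if the two sequences are one rater's grades on separate occasions.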

Authors

Dr Benjamin Worrall, Dr Matthew Ellis, A/Prof Alan Cheng