Diagnostic Accuracy and Agreement Between AI and Clinicians in Orthodontic 3D Model Analysis

dc.contributor.authorBor, Sabahattin
dc.contributor.authorOguz, Firat
dc.contributor.authorKhanmohammadi, Ayla
dc.date.accessioned2026-04-04T13:31:13Z
dc.date.available2026-04-04T13:31:13Z
dc.date.issued2025
dc.departmentİnönü Üniversitesi
dc.description.abstractBackground: Artificial intelligence (AI) is increasingly integrated into orthodontic workflows, including digital model analysis modules embedded in orthodontic software. While these systems offer efficiency and automation, the accuracy and clinical reliability of AI-generated measurements and diagnostic assessments remain unclear. To use AI systems safely and effectively in clinical orthodontics, their results should therefore be checked against those of experienced orthodontists. Methods: Digital models of 48 patients were analyzed by an orthodontist group and two AI platforms: Titan (full analysis) and SoftSmile (Bolton analysis only). Three orthodontists independently measured all variables using 3Shape OrthoAnalyzer, and group means were used for comparison. A subset of models was reanalyzed after two weeks to assess consistency. Data distribution was evaluated, and appropriate statistical tests were applied. Reliability was assessed using intraclass correlation coefficients (ICC) and Cohen's kappa. Results: Almost perfect agreement was observed between the orthodontists and Titan AI in molar classification (kappa = 0.955 right, kappa = 0.900 left; p < 0.001), with perfect agreement reported across all groups, including between the orthodontists themselves, for Angle classification (kappa = 1.00). In anterior and overall Bolton analyses, no meaningful agreement was found between the orthodontists and the AI platforms. However, in a subset of patients where all three methods identified the tooth size discrepancy in the same arch (either maxilla or mandible), no significant differences were found in anterior (p = 0.226) or overall Bolton values (p = 0.795). Overjet, overbite, and space analysis values showed significant differences between the orthodontist and Titan groups (p < 0.001). ICC analysis indicated good to excellent intra- and inter-rater reliability within the orthodontist group (ICC >= 0.77), while both AI systems demonstrated excellent internal consistency, with ICC values exceeding 0.95. Conclusions: AI-based platforms showed high agreement with orthodontists only in Angle classification. Their performance in Bolton analysis was limited, and significant differences were observed in the linear measurements (overjet, overbite, and space analysis), indicating the need for further refinement before clinical use.
dc.identifier.doi10.3390/app15147786
dc.identifier.issn2076-3417
dc.identifier.issue14
dc.identifier.orcid0000-0001-6040-3790
dc.identifier.orcid0000-0001-5463-0057
dc.identifier.scopus2-s2.0-105011873216
dc.identifier.scopusqualityQ1
dc.identifier.urihttps://doi.org/10.3390/app15147786
dc.identifier.urihttps://hdl.handle.net/11616/108659
dc.identifier.volume15
dc.identifier.wosWOS:001550981200001
dc.identifier.wosqualityQ2
dc.indekslendigikaynakWeb of Science
dc.indekslendigikaynakScopus
dc.language.isoen
dc.publisherMDPI
dc.relation.ispartofApplied Sciences-Basel
dc.relation.publicationcategoryArticle - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rightsinfo:eu-repo/semantics/openAccess
dc.snmzKA_WOS_20250329
dc.subjectdigital orthodontics
dc.subjectmodel analysis
dc.subjectAI-based diagnosis
dc.subjecttitan dental design
dc.subjectSoftSmile
dc.titleDiagnostic Accuracy and Agreement Between AI and Clinicians in Orthodontic 3D Model Analysis
dc.typeArticle
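
The abstract reports agreement between the orthodontists and the AI platforms as Cohen's kappa. As a minimal sketch of how that statistic is computed for two raters, assuming categorical labels such as Angle classes: the function name `cohens_kappa` and the toy labels below are hypothetical illustrations, not the study's code or data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases (hypothetical helper)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)

    # Observed agreement: fraction of cases where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters' marginal rates.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    # Degenerate case: both raters used one identical label for every case.
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy labels for illustration only; these are NOT values from the study.
clinician = ["I", "I", "II", "II", "III", "I"]
software = ["I", "I", "II", "III", "III", "I"]
kappa = cohens_kappa(clinician, software)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which is why it is commonly preferred over simple percent agreement for categorical ratings such as molar classification.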
