Comparative Evaluation of Artificial Intelligence Models for Traumatic Dental Injuries Based on Clinical Guideline Adherence

Okan Turgut; H Melike Bayram*; Emre Bayram

Acta Scientific Dental Sciences

Research Article Volume 9 Issue 10

Associate Professor, Tokat Gaziosmanpasa University, Faculty of Dentistry, Department of Endodontics, Tokat, Turkiye

*Corresponding Author: Huda Melike Bayram, Associate Professor, Tokat Gaziosmanpasa University, Faculty of Dentistry, Department of Endodontics, Tokat, Turkiye.

Received: August 25, 2025; Published: September 10, 2025

Abstract

Aim: This study aimed to evaluate the performance of three large language models (LLMs), Grok, ChatGPT, and DeepSeek, in managing traumatic dental injuries (TDIs) based on their alignment with the International Association of Dental Traumatology (IADT) 2020 clinical guidelines.

Materials and Methods: Twenty open-ended prompts were constructed to reflect real-life TDI scenarios, aligned with the 2020 IADT guidelines. Each model was queried once per prompt with no re-prompting or interaction refinement. Responses were evaluated by a trained rater using a five-criteria rubric: scientific accuracy, reliability of information, comprehensibility, level of detail, and clinical applicability. Scoring was performed using a 3-point ordinal scale. One-way ANOVA and post-hoc comparisons were applied for statistical analysis.
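The scoring and analysis steps described above can be illustrated with a minimal sketch. It is not the authors' analysis code. It assumes the 3-point scale is coded 1 to 3 and that a response's total score is the plain sum of its five criterion ratings; the abstract states neither. The names `CRITERIA`, `total_score`, and `one_way_anova_f` are hypothetical, and the sketch computes only the F statistic, not a p-value or post-hoc comparisons.

```python
from statistics import mean

# The five rubric criteria named in the abstract (identifier spellings are ours).
CRITERIA = (
    "scientific_accuracy",
    "reliability",
    "comprehensibility",
    "detail",
    "clinical_applicability",
)


def total_score(ratings):
    """Sum the five criterion ratings for one model response.

    Assumption: each rating is coded 1, 2, or 3 and the total is a simple sum.
    """
    for name in CRITERIA:
        if ratings[name] not in (1, 2, 3):
            raise ValueError(f"{name} must be rated 1, 2, or 3")
    return sum(ratings[name] for name in CRITERIA)


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score groups.

    Each group holds the scores of one model across all prompts.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of models
    n = len(all_values)    # total number of scored responses

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With 20 prompts per model, each group passed to `one_way_anova_f` would hold 20 scores, giving 2 and 57 degrees of freedom for three models. A library routine such as `scipy.stats.f_oneway` would additionally return the p-value.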

Results: Grok outperformed both ChatGPT and DeepSeek in scientific accuracy, detail level, and information reliability (p < 0.001). ChatGPT and DeepSeek scored higher in comprehensibility (p = 0.007). For clinical applicability, only the Grok–DeepSeek comparison was statistically significant (p = 0.016). Total scores differed significantly across all model pairs (p < 0.001).

Conclusion: Large language models exhibit distinct strengths across clinical performance metrics. Grok appears more suitable for guideline-based clinical decision support in TDI management, whereas ChatGPT and DeepSeek may be better suited for educational and communicative purposes. Purpose-driven model selection and continuous performance monitoring are recommended for safe and effective clinical integration.

Keywords: Artificial Intelligence; Large Language Models; Traumatic Dental Injuries; Clinical Decision Support; Guideline Adherence; IADT Guidelines

Citation: H Melike Bayram, et al. "Comparative Evaluation of Artificial Intelligence Models for Traumatic Dental Injuries Based on Clinical Guideline Adherence". Acta Scientific Dental Sciences 9.10 (2025): 09-14.

Copyright: © 2025 H Melike Bayram, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
