TY - JOUR
T1 - Can large language models assist with pediatric dosing accuracy?
AU - Levin, Chedva
AU - Orkaby, Brurya
AU - Kerner, Erika
AU - Saban, Mor
N1 - Publisher Copyright: © The Author(s) 2025.
PY - 2025
Y1 - 2025
AB - Background and Objective: Medication errors in pediatric care remain a significant healthcare challenge despite technological advancements, necessitating innovative approaches. This study aims to evaluate Large Language Models’ (LLMs) potential in reducing pediatric medication dosage calculation errors compared to experienced nurses. Methods: This cross-sectional study (June–August 2024) involved 101 nurses from pediatric and neonatal departments and three LLMs (ChatGPT-4o, Claude-3.0, Llama 3 8B). Participants completed a nine-question survey on pediatric medication calculations. Primary outcomes were accuracy and response time. Secondary measures included the effects of seniority and group membership on accuracy. Results: Significant differences (P < 0.001) were observed between nurses and LLMs. Nurses averaged 93.14 ± 9.39% accuracy. Claude-3.0 and ChatGPT-4o achieved 100% accuracy, while Llama 3 8B was 66% accurate. LLMs were faster (15.7–75.12 s) than nurses (1621.2 ± 8379.3 s). The Generalized Linear Model analysis revealed task performance was significantly influenced by duration (Wald χ² = 27,881.261, p < 0.001) and interaction between relative seniority and group membership (Wald χ² = 3,938.250, p < 0.001), with participants achieving a mean total grade of 91.03 (SD = 13.87). Conclusions: Claude-3.0 and ChatGPT-4o demonstrated perfect accuracy and rapid calculation capabilities, showing promise in reducing pediatric medication dosage errors. Further research is needed to explore their integration into practice. Impact: Key Message: Large Language Models (LLMs) like ChatGPT-4o and Claude-3.0 demonstrate perfect accuracy and significantly faster response times in pediatric medication dosage calculations, showing potential to reduce errors and save time. Addition to Existing Literature: This study provides novel insights by quantitatively comparing LLM performance with experienced nurses, contributing to the understanding of AI’s role in improving medication safety. Impact: The findings emphasize the value of LLMs as supplemental tools in healthcare, particularly in high-stakes pediatric care, where they can reduce calculation errors and improve clinical efficiency.
UR - http://www.scopus.com/inward/record.url?scp=105000012104&partnerID=8YFLogxK
U2 - 10.1038/s41390-025-03980-8
DO - 10.1038/s41390-025-03980-8
M3 - Article
C2 - 40057653
AN - SCOPUS:105000012104
SN - 0031-3998
JO - Pediatric Research
JF - Pediatric Research
M1 - 814100
ER -