TY - JOUR
T1 - Comparison of Initial Artificial Intelligence (AI) and Final Physician Recommendations in AI-Assisted Virtual Urgent Care Visits
AU - Zeltzer, Dan
AU - Kugler, Zehavi
AU - Hayat, Lior
AU - Brufman, Tamar
AU - Ber, Ran Ilan
AU - Leibovich, Keren
AU - Beer, Tom
AU - Frank, Ilan
AU - Shaul, Ran
AU - Goldzweig, Caroline
AU - Pevnick, Joshua
N1 - Publisher Copyright:
© 2025 American College of Physicians.
PY - 2025/4
Y1 - 2025/4
N2 - Background: Whether artificial intelligence (AI) assistance is associated with quality of care is uncertain. Objective: To compare initial AI recommendations with final recommendations of physicians who had access to the AI recommendations and may or may not have viewed them. Design: Retrospective cohort study. Setting: Cedars-Sinai Connect, an AI-assisted virtual urgent care clinic with intake questions via structured chat. When confidence is sufficient, AI presents diagnosis and management recommendations (prescriptions, laboratory tests, and referrals). Patients: 461 physician-managed visits with AI recommendations of sufficient confidence and complete medical records for adults with respiratory, urinary, vaginal, eye, or dental symptoms from 12 June to 14 July 2024. Measurements: Concordance between initial AI recommendations and final physician recommendations for diagnosis and management. Physician adjudicators scored all nonconcordant recommendations and a sample of concordant recommendations as optimal, reasonable, inadequate, or potentially harmful. Results: Initial AI and final physician recommendations were concordant for 262 visits (56.8%). Among the 461 weighted visits, AI recommendations were more frequently rated as optimal (77.1% [95% CI, 72.7% to 80.9%]) than treating physician decisions (67.1% [CI, 62.9% to 71.1%]). Quality scores were equal in 67.9% (CI, 64.8% to 70.9%) of cases, better for AI in 20.8% (CI, 17.8% to 24.0%), and better for treating physicians in 11.3% (CI, 9.0% to 14.2%). Limitations: Single-center retrospective study. Adjudicators were not blinded to the source of recommendations. It is unknown whether physicians viewed AI recommendations. Conclusion: When AI and physician recommendations differed, AI recommendations were more often rated as better quality. Findings suggest that AI performed better at identifying critical red flags and supporting guideline-adherent care, whereas physicians were better at adapting recommendations to changing information during consultations. Thus, AI may have a role in assisting physician decision making in virtual urgent care.
UR - http://www.scopus.com/inward/record.url?scp=105003285591&partnerID=8YFLogxK
U2 - 10.7326/ANNALS-24-03283
DO - 10.7326/ANNALS-24-03283
M3 - Article
C2 - 40183679
AN - SCOPUS:105003285591
SN - 0003-4819
VL - 178
SP - 498
EP - 506
JO - Annals of Internal Medicine
JF - Annals of Internal Medicine
IS - 4
ER -