Quality and Accountability of Large Language Models (LLMs) in Healthcare in Low- and Middle-Income Countries (LMIC): A Simulated Patient Study using ChatGPT
Using simulated patients to mimic nine established non-communicable and infectious diseases over 27 trials, we assess ChatGPT's effectiveness and reliability in diagnosing and treating common diseases in low- and middle-income countries. We find ChatGPT's performance varied within a single disease, despite a high level of accuracy in both correct diagnosis (74.1%) and medication prescription (84.5%). Additionally, ChatGPT recommended a concerning level of unnecessary or harmful medications (85.2%) even with correct diagnoses. Finally, ChatGPT performed better in managing non-communicable diseases compared to infectious ones. These results highlight the need for cautious AI integration in healthcare systems to ensure quality and safety.
Year of publication: | 2024 |
---|---|
Authors: | Si, Yafei ; Yang, Yuyi ; Wang, Xi ; An, Ruopeng ; Zu, Jiaqi ; Chen, Xi ; Fan, Xiaojing ; Gong, Sen |
Publisher: | Essen : Global Labor Organization (GLO) |
Subject: | ChatGPT | Large Language Models | Generative AI | Simulated Patient | Healthcare | Quality | Safety | Low- and Middle-Income Countries |
freely available
Series: | GLO Discussion Paper ; 1472 |
---|---|
Type of publication: | Book / Working Paper |
Type of publication (narrower categories): | Working Paper |
Language: | English |
Other identifiers: | 1902965744 [GVK] ; RePEc:zbw:glodps:1472 [RePEc] |
Classification: | C0 - Mathematical and Quantitative Methods. General ; I10 - Health. General ; I11 - Analysis of Health Care Markets ; C90 - Design of Experiments. General |
Persistent link: https://www.econbiz.de/10014634500