Two posters presented at ASHP 2023 Midyear examined the accuracy of answers from the artificial intelligence chatbot.
In November 2022, the artificial intelligence research organization OpenAI released its chatbot ChatGPT. The system is based on a large language model that produces responses when given a natural language input. The novel technology has been the subject of much discussion since its debut due to its potential impact on a wide range of industries, including the healthcare sector.
Two posters presented at the American Society of Health-System Pharmacists 2023 Midyear Clinical Meeting and Exhibition assessed how accurate ChatGPT is by examining its responses to questions about drug information.
In the first study, a group of drug information specialists from Long Island University assessed the ability of ChatGPT to provide accurate and complete responses to questions about drug information. The team collected 45 questions from January 2022 through April 2023 and searched the professional literature for answers in May 2023. These literature-based answers were verified by a second investigator and served as the reference standard against which ChatGPT’s responses were judged.1
An independent checklist was then created to compare the investigators’ answers with those from ChatGPT. Each question was entered into the chatbot, followed by a second prompt requesting references. The main study outcome was ChatGPT’s ability to provide a satisfactory response.
Investigators found that ChatGPT answered only 10 of the 39 questions ultimately evaluated satisfactorily. The 3 most common question topics were therapeutics, compounding/formulation, and dosage. Unsatisfactory responses from ChatGPT were marked by inaccuracy, incompleteness, or both. Of the 29 questions that were not answered satisfactorily, 11 received no direct answer at all; the chatbot instead provided only general background information.
Additionally, ChatGPT provided references for only 8 of the 28 questions that received a direct response. However, investigators discovered that none of the references the chatbot listed actually existed.
“In this study, ChatGPT was unable to provide an accurate and complete response to most questions presented to a drug information service,” the authors concluded. “Healthcare professionals and consumers should be cautious of using ChatGPT to obtain medication-related information.”
In the second study, investigators from Iwate Medical University in Japan conducted a cross-sectional, observational study comparing side effect information from ChatGPT with that in Lexicomp. A total of 30 FDA-approved drugs were selected for the study and were entered into ChatGPT from April to June 2023.2
Each drug was entered into the chatbot using the same question: “What are the most common side effects of the selected drug?” Each response was then compared with side effects listed in Lexicomp at a frequency of 1% or greater. The ChatGPT answers were categorized as “accurate,” “partly accurate,” or “inaccurate.”
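To make the comparison concrete, the following is a minimal Python sketch of how a grading rule like this could be scored against a reference list. The overlap thresholds, function name, and drug data are illustrative assumptions; the poster does not report the investigators’ exact scoring criteria.

```python
# Hypothetical scoring sketch: compare side effects named by a chatbot
# against a Lexicomp-style reference list (effects reported at >=1% frequency).
# The overlap rules below are illustrative assumptions, not the study's
# published criteria.

def categorize(chatbot_effects: set[str], reference_effects: set[str]) -> str:
    """Grade an answer as 'accurate', 'partly accurate', or 'inaccurate'."""
    matched = chatbot_effects & reference_effects
    if matched == reference_effects:   # every reference effect was named
        return "accurate"
    if matched:                        # some, but not all, were named
        return "partly accurate"
    return "inaccurate"                # no overlap with the reference list

# Example with made-up data for a hypothetical drug:
reference = {"nausea", "headache", "dizziness"}   # Lexicomp >=1% list
chatbot_answer = {"nausea", "dry mouth"}          # effects the chatbot named
print(categorize(chatbot_answer, reference))      # -> "partly accurate"
```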
Investigators found that 26 of the 30 responses from ChatGPT were inaccurate, 2 were partly accurate, and 2 were accurate. Some of the common drug side effects listed in Lexicomp were not included in the chatbot’s responses. However, ChatGPT did advise in its responses that users consult a healthcare provider for further guidance.
Additionally, the responses from ChatGPT were found to be written at a basic reading level that was easy to understand.
Study limitations included that the side effect responses were evaluated by only 1 pharmacist. The authors noted that a larger and more controlled study could produce a more generalizable conclusion.
“While ChatGPT produced some common side effects that matched Lexicomp’s information, the majority of responses were inaccurate,” the authors concluded. “ChatGPT needs to be trained on [a] more accurate and controlled dataset to produce more accuracy, and to become a more viable drug information and medical education tool for patients and healthcare professionals in the future.”