Assessment of the Capability of ChatGPT-3.5 in Medical Physiology Examination in an Indian Medical School

Document Type: Original Article

Authors

1 Department of Physiology, All India Institute of Medical Sciences, Deoghar, Jharkhand, India

2 Department of Physiology, Kalyan Singh Government Medical College Bulandshahr, Uttar Pradesh, India

3 Department of Physiology, Raiganj Government Medical College and Hospital, West Bengal, India

Abstract

Background: There has been increasing interest in exploring the capabilities of artificial intelligence (AI) in various fields, including education. Medical education is an area where AI can potentially have a significant impact, especially in helping students obtain answers to their customized questions. In this study, we aimed to investigate the capability of ChatGPT, a conversational AI model, in generating answers to medical physiology examination questions in an Indian medical school.
Methods: This cross-sectional study was conducted in March 2023 in an Indian medical school in Deoghar, Jharkhand, India. The first mid-semester physiology examination was taken as the reference examination. It comprised two long-essay questions and five short-essay questions (total marks 40), as well as 20 multiple-choice questions (MCQs) (total marks 10). We generated responses from ChatGPT (March 13 version) for both the essay and MCQ questions. The essay-type answer sheet was evaluated by five faculty members, and the average was taken as the final score. The scores of 125 students (all first-year medical students) in the examination were obtained from the departmental registry. The median score of the 125 students was compared with the score of ChatGPT using the Mann-Whitney U test.
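As an illustration of the comparison described above, the sketch below shows how such a Mann-Whitney U test could be run in Python. The score values are hypothetical placeholders, not the study data; the study compared the 125 student scores with the five faculty-assigned scores for ChatGPT's essay answers.

    # Hypothetical sketch of the score comparison described in Methods (not the study data).
    # Student essay scores (out of 40) and five faculty-assigned scores for ChatGPT's answers
    # are compared with a two-sided Mann-Whitney U test.
    from scipy.stats import mannwhitneyu

    student_scores = [18, 20, 20.5, 23, 23.5, 19, 22, 21, 17, 24]  # placeholder values
    chatgpt_scores = [21.5, 21.5, 21.5, 22, 22]                    # one score per evaluating faculty member

    stat, p_value = mannwhitneyu(student_scores, chatgpt_scores, alternative="two-sided")
    print(f"U = {stat}, P = {p_value:.3f}")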
Results: The median score of the 125 students in essay-type questions was 20.5 (Q1-Q3: 18-23.5), which corresponds to a median percentage of 51.25% (Q1-Q3: 45-58.75) (P=0.147). The answers generated by ChatGPT scored 21.5 (Q1-Q3: 21.5-22), which corresponds to 53.75% (Q1-Q3: 53.75-55) (P=0.125). Hence, ChatGPT scored similarly to the students (P=0.4) on essay-type questions. For the MCQs, ChatGPT answered 19 of 20 questions correctly (score=9.5), which was higher than the median score of the students (6; Q1-Q3: 5-6.5) (P<0.0001).
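The percentages and MCQ score reported above follow directly from the marking scheme stated in the Methods (40 marks for essays, 20 MCQs worth 10 marks, i.e., 0.5 mark per item); a minimal arithmetic check:

    # Minimal check of the reported figures, using the marking scheme from Methods.
    essay_total, mcq_total, mcq_items = 40, 10, 20
    print(20.5 / essay_total * 100)      # 51.25 -> students' median essay percentage
    print(21.5 / essay_total * 100)      # 53.75 -> ChatGPT's essay percentage
    print(19 * (mcq_total / mcq_items))  # 9.5   -> ChatGPT's MCQ score (19 correct at 0.5 mark each)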
Conclusion: ChatGPT has the potential to generate answers to medical physiology examination questions. It is more capable of solving MCQs than essay-type questions. Although ChatGPT provided answers of sufficient quality to pass the examination, its capability to generate high-quality answers for educational purposes is yet to be achieved. Hence, its use in medical education for teaching and learning purposes remains to be explored.

Keywords

