Department of Parasitology and Institute of Medical Education, Hallym University College of Medicine, Chuncheon, Korea
© 2025 The Korean Society of Coloproctology
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Conflict of interest
No potential conflict of interest relevant to this article was reported.
Funding
This study was supported by the Hallym University Research Fund 2023 (No. HRF-202310-001).
Study | Country | Chatbot type | Disease entity | Question | Rater | Accuracy | Interpretation
---|---|---|---|---|---|---|---
Gravina et al. [10] (2024) | Italy | ChatGPT (GPT-3.5, OpenAI) | IBD | 10 Items (a group of IBD-expert physicians retrieved a list of the 10 questions most frequently asked by patients with IBD) | Authors | No quantitative data | Not enough information for patients
Beaulieu-Jones et al. [11] (2024) | USA, Taiwan | GPT-4 (OpenAI) | Surgical knowledge | 167 SCORE and 112 Data-B questions from the USA in multiple-choice and open-ended formats | Correct answers determined | Multiple choice: SCORE, 71.3%; Data-B, 67.9%. Open-ended: SCORE, 47.9%; Data-B, 66.1% | It is unclear whether LLMs such as ChatGPT can safely assist clinicians in providing care
Cankurtaran et al. [12] (2023) | Türkiye | ChatGPT (OpenAI) | IBD | 20 Questions by a committee of 4 gastroenterologists | 2 Independent gastroenterology experts | Crohn disease, 4.70±1.26 (scale, 3–7); ulcerative colitis, 4.40±1.21 (scale, 3–7) | ChatGPT still has some limitations and deficiencies
Kerbage et al. [13] (2024) | USA | GPT-4 (OpenAI) | Irritable bowel syndrome, IBD, colonoscopy, and colorectal cancer screening | 30 Frequently asked questions by patients | 3 Expert gastroenterologists | Acceptable rate of 84% accuracy | The authors urge caution in relying on ChatGPT for clinical decision-making or as a reference source
Mukherjee et al. [14] (2023) | USA | ChatGPT (OpenAI) | Colon cancer | 12 Items of the AGA’s recommendations for follow-up after colonoscopy and polypectomy | 4 Adjudicators | Only 1 out of 12 questions was answered 100% appropriately for patients | Future renditions will be able to address nuanced queries with increased precision, serving as a readily available resource for GI education
Choo et al. [15] (2024) | Korea | ChatGPT (OpenAI) | Colon cancer | Treatment recommendations made by ChatGPT for 30 stage IV, recurrent, synchronous colorectal cancer patients | Authors | The concordance rate between ChatGPT and the MDT was 86.7% | The ability of ChatGPT to understand complex stage IV, recurrent, and synchronous colorectal cancer cases itself is an impressive feat
Janopaul-Naylor et al. [16] (2024) | USA | ChatGPT (OpenAI), Bing (Microsoft Corp) | Colon cancer | 117 Questions in a subset of the ACS's recommended "Questions to Ask About Your Cancer" | Expert panel using the validated DISCERN criteria | ChatGPT vs. Bing for colorectal cancer (range, 1–5), 3.8 vs. 3.0 (P<0.001) | The findings suggest a critical need, particularly around cancer prognostication, for continual refinement to limit misleading counseling, confusion, and emotional distress to patients and families
Levartovsky et al. [17] (2023) | Israel | ChatGPT (OpenAI) | Ulcerative colitis | 20 Cases with disease severity graded using the Truelove and Witts criteria and the necessity of hospitalization for patients with ulcerative colitis | Gastroenterologist | 80% Accuracy | ChatGPT could serve as a clinical decision support tool in assessing acute ulcerative colitis, functioning as an adjunct to clinical judgment
Barash et al. [18] (2023) | Israel | GPT-4 (OpenAI) | Small bowel obstruction, acute cholecystitis, acute appendicitis, diverticulitis | 40 Cases of clinical notes from the ED input as prompts, with a request for an imaging recommendation | 2 Independent radiologists | Small bowel obstruction (acute), 50%; small bowel obstruction (indolent), 100%; acute cholecystitis, 100%; acute appendicitis, 100%; diverticulitis, 100% | LLMs may improve radiology referral quality
Emile et al. [19] (2023) | USA | ChatGPT (OpenAI) | Colon cancer | 38 Questions based on authors’ clinical experience and patient information handouts from the ASCRS | 1–3 Experts | 87% Deemed appropriate | ChatGPT may become a popular educational and informative tool
This study | Korea | GPT-4 (OpenAI), Gemini (Google), Bing (Microsoft Corp), Wrtn (Wrtn Technologies) | Colon cancer | 10 Questions regarding cancer information provided by Asan Medical Center | 2 Expert colorectal surgeons | Average score (maximum, 10): GPT-4, 5.5; Gemini, 5.5; Bing, 5; Wrtn, 6 | Greater accuracy of generative AI platforms is needed before they can be used by patients or their families

AI, artificial intelligence; IBD, inflammatory bowel disease; SCORE, Surgical Council on Resident Education; LLM, large language model; AGA, American Gastroenterological Association; GI, gastrointestinal; MDT, multidisciplinary team; ACS, American Cancer Society; ED, emergency department; ASCRS, American Society of Colon and Rectal Surgeons.
Question | Sum | Rater A: GPT-4 | Rater A: Gemini | Rater A: Bing | Rater A: Wrtn | Rater B: GPT-4 | Rater B: Gemini | Rater B: Bing | Rater B: Wrtn
---|---|---|---|---|---|---|---|---|---
1. During a routine health screening, if bleeding is detected in a fecal occult blood test, what is the probability that it indicates colon cancer? | 8 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
2. At what age should Korean men begin receiving regular colonoscopy screenings? | 4 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1
3. Approximately how many years does it take for precancerous polyps to develop into colon cancer? | 6 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1
4. Please share 3 dietary recommendations for preventing colon cancer. | 8 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
5. Why does colon cancer most frequently occur in the sigmoid colon and rectum? | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
6. For a 55-year-old Korean man diagnosed with stage III colon cancer, what is the 5-year survival rate following surgery? | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
7. If a 55-year-old Korean man is diagnosed with stage III colon cancer, should his 21-year-old daughter receive annual colonoscopy screenings? | 6 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1
8. For a 55-year-old Korean man with stage III colon cancer, is robotic surgery superior to laparoscopic or endoscopic resection in terms of prognosis? | 8 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
9. After surgery for stage III colon cancer in a 55-year-old Korean man, is targeted therapy recommended? | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
10. In 2022, what were the respective rankings of colon cancer mortality rates among Korean men and women compared to other cancer types? | 2 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0
Total | 44 | 6 | 6 | 5 | 6 | 5 | 5 | 5 | 6
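The per-platform averages reported for this study (maximum, 10) follow directly from the two raters' totals in the table above. A minimal sketch of that arithmetic, using the rater totals from the "Total" row:

```python
# Rater totals per platform from the question table: (rater A, rater B),
# each out of a maximum of 10 points (1 point per adequate response).
rater_totals = {
    "GPT-4": (6, 5),
    "Gemini": (6, 5),
    "Bing": (5, 5),
    "Wrtn": (6, 6),
}

# Average of the two raters' totals, as reported in the study-comparison table.
averages = {name: (a + b) / 2 for name, (a, b) in rater_totals.items()}
print(averages)  # {'GPT-4': 5.5, 'Gemini': 5.5, 'Bing': 5.0, 'Wrtn': 6.0}
```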
Responses were scored as 1 if adequate and 0 if insufficient or inadequate for the public. GPT-4, OpenAI; Gemini, Google; Bing, Microsoft Corp; Wrtn, Wrtn Technologies.