Submission Status
Withdrawn
Abstract
Abstract Title
The In-depth Comparative Analysis of Four Large Language AI Models for Risk Assessment and Information Retrieval from Multi-Modality Prostate Cancer Work-up Reports
Presentation Type
Moderated Poster Abstract
Manuscript Type
Clinical Research
Abstract Category *
AI in Urology
Author's Information
Number of Authors (including submitting/presenting author) *
4
Country
Taiwan
Co-author 1
Yen-Chun Lin u102001412@gmail.com National Taiwan University Hospital, Yunlin Branch Department of Urology Yunlin Taiwan *
Co-author 2
Lun-Hsiang Yuan lunhsiang.yuan@gmail.com National Taiwan University Hospital, Yunlin Branch Department of Urology Yunlin Taiwan
Co-author 3
Chung-You Tsai pgtsai@gmail.com Far Eastern Memorial Hospital Division of Urology, Department of Surgery New Taipei Taiwan
Co-author 4
Shi-Wei Huang will6438.huang@gmail.com National Taiwan University Hospital, Yunlin Branch Department of Urology Yunlin Taiwan
Abstract Content
Introduction
This study compares four general-purpose large language models (LLMs) on prostate cancer (PC) information retrieval (IR) and risk assessment (RA), highlighting performance differences across multifaceted clinical tasks.
Materials and Methods
We compared the performance of four LLMs (ChatGPT-4-turbo, Claude-3-opus, Gemini-pro-1.0, ChatGPT-3.5-turbo) on three RA tasks (LATITUDE, CHAARTED, TwNHI) and seven IR tasks. The study used simulated text reports from computed tomography, magnetic resonance imaging, bone scans, and biopsy pathology of stage IV PC patients. The tasks covered TNM staging and the detection and quantification of bone and visceral metastases, offering a broad evaluation of the models' ability to process diverse clinical data. We used zero-shot chain-of-thought prompting via API to query the LLMs with the multi-modal reports. Model performance was assessed against a consensus standard set by three adjudicators, using both repeated single queries and ensemble voting, and evaluated on six outcome metrics.
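The ensemble voting step described above can be illustrated with a minimal sketch. It assumes a simple majority vote over the answers from repeated queries; the function name `ensemble_vote` and the tie-breaking rule (fall back to the first answer) are hypothetical choices for illustration and are not taken from the study.

```python
from collections import Counter


def ensemble_vote(answers):
    """Return the majority answer among repeated query results.

    Assumption: ties (possible only when there are more than two
    distinct answers) are broken by falling back to the first answer.
    """
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(answers)
    top_answer, top_count = counts.most_common(1)[0]
    # Collect every answer that shares the highest vote count.
    tied = [a for a, c in counts.items() if c == top_count]
    return top_answer if len(tied) == 1 else answers[0]


# Hypothetical outputs from three repeated single-round queries.
votes = ["high risk", "low risk", "high risk"]
final_label = ensemble_vote(votes)
```

With three votes on a binary label, a tie cannot occur, so the fallback only matters for multi-class tasks such as staging.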
Results
In a simulated analysis of 350 stage IV PC patient reports, 115 (32.8%) were classified as LATITUDE high risk, 128 (36.5%) as CHAARTED high volume, and 94 (27.0%) as TwNHI high risk. Ensemble voting, based on three repeated single-round queries, consistently enhanced accuracy and was more likely to achieve non-inferior results than a single query. The four models differed only slightly on the IR tasks, achieving high accuracy (87.4%-94.2%) and consistent TNM staging results (ICC > 0.8). However, notable variations emerged in RA, with performance ranked from highest to lowest: ChatGPT-4-turbo, Claude-3-opus, Gemini-pro-1.0, and ChatGPT-3.5-turbo. ChatGPT-4-turbo outperformed the others in accuracy (90.1%, 90.7%, 91.6%) and consistency (ICC 0.86, 0.93, 0.76) across the three RA tasks. Its high sensitivity and negative predictive value (NPV) also make it a potential tool for ruling out high-risk patients.
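For readers less familiar with the metrics named above, the following sketch shows how accuracy, sensitivity, and NPV can be computed for a binary risk label against a consensus standard. The abstract does not list all six outcome metrics, so only the three it names are computed here; the function name `binary_metrics` and the example labels are hypothetical and are not study data.

```python
def binary_metrics(truth, pred):
    """Compute accuracy, sensitivity, and NPV for binary labels.

    truth and pred are equal-length sequences of booleans, where True
    means "high risk". Undefined ratios are returned as NaN.
    """
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    pairs = list(zip(truth, pred))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical labels for five reports (illustration only).
consensus = [True, True, False, False, True]
model_out = [True, False, False, False, True]
metrics = binary_metrics(consensus, model_out)
```

A high NPV means that a "not high risk" output is rarely wrong, which is the property that matters when a model is used to rule out high-risk patients.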
Conclusions
This study shows ChatGPT-4-turbo is the most effective LLM tested, excelling in RA tasks for Stage IV PC with high accuracy and consistency, highlighting its potential as a clinical decision support tool.
Keywords
Prostate cancer, large language model, risk assessment, information retrieval, clinical decision support, ChatGPT
Figure 1
https://storage.unitedwebnetwork.com/files/1237/77af9645be407bb8225abf07a7c75750.png
Character Count
2230