- UAA Congress 2025

Final Presentation Format

Eposter Presentation

Abstract

Abstract Title

The In-depth Comparative Analysis of Four Large Language AI Models for Risk Assessment and Information Retrieval from Multi-Modality Prostate Cancer Work-up Reports

Presentation Type

Moderated Poster Abstract

Manuscript Type

Clinical Research

Abstract Category

AI in Urology

Author's Information

Co-author 1

Yen-Chun Lin; u102001412@gmail.com; National Taiwan University Hospital, Yunlin Branch; Department of Urology; Yunlin, Taiwan *

Co-author 2

Lun-Hsiang Yuan; lunhsiang.yuan@gmail.com; National Taiwan University Hospital, Yunlin Branch; Department of Urology; Yunlin, Taiwan

Co-author 3

Chung-You Tsai; pgtsai@gmail.com; Far Eastern Memorial Hospital; Division of Urology, Department of Surgery; New Taipei, Taiwan

Co-author 4

Shi-Wei Huang; will6438.huang@gmail.com; National Taiwan University Hospital, Yunlin Branch; Department of Urology; Yunlin, Taiwan

Abstract Content

Introduction

This study compares four general-purpose large language models (LLMs) on prostate cancer (PC) information retrieval (IR) and risk assessment (RA), highlighting performance differences across multifaceted clinical tasks.

Materials and Methods

We compared the performance of four LLMs (ChatGPT-4-turbo, Claude-3-opus, Gemini-pro-1.0, ChatGPT-3.5-turbo) on three RA tasks (LATITUDE, CHAARTED, TwNHI) and seven IR tasks. The study used simulated text reports from computed tomography, magnetic resonance imaging, bone scans, and biopsy pathology for stage IV PC patients. The tasks covered TNM staging and the detection and quantification of bone and visceral metastases, offering a broad evaluation of the models' ability to process diverse clinical data. We used zero-shot chain-of-thought prompting via API to query the LLMs with multi-modal reports. Model outputs from repeated single queries and from ensemble voting were assessed against a consensus standard set by three adjudicators and evaluated on six outcome metrics.
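As an illustration only, ensemble voting over repeated queries can be sketched as a simple majority vote. The function name `majority_vote`, the label strings, and the tie-handling rule below are assumptions made for this sketch; the abstract does not specify how ties or malformed outputs were resolved.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer among repeated queries.

    Returns None when the top two answers are tied (hypothetical rule;
    the study's actual tie handling is not described in the abstract).
    """
    if not answers:
        raise ValueError("at least one answer is required")
    ranked = Counter(answers).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no clear majority
    return ranked[0][0]


# Three repeated single-round queries for one simulated patient (toy labels).
runs = ["high", "low", "high"]
print(majority_vote(runs))
```

With three binary answers a strict majority always exists, so the tie branch only matters for tasks with more than two possible labels, such as TNM staging.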

Results

In a simulated analysis of 350 Stage IV PC patient reports, 115 (32.8%) were classified as LATITUDE high risk, 128 (36.5%) as CHAARTED high volume, and 94 (27.0%) as TwNHI high risk. Ensemble voting, based on three repeated single-round queries, consistently improved accuracy and was more likely than a single query to achieve non-inferior results. The four models differed only slightly on IR tasks, achieving high accuracy (87.4%-94.2%) and consistent TNM staging results (intraclass correlation coefficient [ICC] > 0.8). However, notable variations emerged in RA, with performance ranked from highest to lowest: ChatGPT-4-turbo, Claude-3-opus, Gemini-pro-1.0, and ChatGPT-3.5-turbo. ChatGPT-4-turbo outperformed the others in accuracy (90.1%, 90.7%, 91.6%) and consistency (ICC 0.86, 0.93, 0.76) across the three RA tasks. Its high sensitivity and negative predictive value (NPV) also make it a potential tool for ruling out high-risk patients.
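For readers less familiar with the metrics, the following minimal sketch shows how accuracy, sensitivity, and NPV follow from a comparison of model labels against a reference standard. The function name `binary_metrics`, the label strings, and the toy data are hypothetical and are not taken from the study.

```python
def binary_metrics(truth, pred, positive="high"):
    """Compute accuracy, sensitivity, and NPV for a binary risk label."""
    pairs = list(zip(truth, pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Undefined ratios are reported as NaN rather than raising.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # share of true high-risk cases found
        "npv": ratio(tn, tn + fn),          # share of "not high" calls that are correct
    }


# Toy reference labels and model outputs (illustrative only).
truth = ["high", "high", "low", "low", "low"]
pred = ["high", "low", "low", "low", "high"]
print(binary_metrics(truth, pred))
```

Sensitivity and NPV both fall as false negatives rise, which is why high values of the two support a rule-out role: few truly high-risk patients are labelled as not high risk.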

Conclusions

This study shows that ChatGPT-4-turbo was the most effective of the four LLMs tested, excelling in RA tasks for Stage IV PC with high accuracy and consistency, which highlights its potential as a clinical decision support tool.

Keywords

Prostate cancer, large language model, risk assessment, information retrieval, clinical decision support, ChatGPT

Figure 1

https://storage.unitedwebnetwork.com/files/1237/77af9645be407bb8225abf07a7c75750.png
