Development and Evaluation of a Generative AI Chatbot for Database Searching in Systematic Review

By: Wai San Wilson Tam · Neo Tung · Shi Xuan Lee · Gregor Štiglic · Tom Huynh · Arthur Tang

ABSTRACT

Introduction

Systematic reviews (SRs) require comprehensive, reproducible searches, yet developing search strategies is resource-intensive and demands specialized expertise. Generative AI (GAI) could streamline this process, but empirical evaluations of GAI-assisted SR searching remain scarce. This study has two objectives: to demonstrate a step-by-step process for developing a custom ChatGPT-based chatbot that supports SR search strategy development, and to evaluate the chatbot's performance.

Design

A cross-sectional evaluation study.

Methods

We used ChatGPT-4.0 to create a chatbot designed to mimic a medical librarian and generate PICO-informed searches. Its knowledge base was augmented with two methodological references. After pilot testing, we refined its instructions. For evaluation, we randomly sampled 50 Cochrane SRs published in 2024. Standardized P–I–O prompts produced database-ready queries for PubMed and Embase. The primary outcome was the per-review success rate, that is, the percentage of each review's included studies that the chatbot-generated search identified, summarized by median and interquartile range (IQR). A sensitivity analysis was conducted.
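The primary outcome can be illustrated with a minimal sketch. It assumes the per-review success rate is the number of included studies retrieved divided by the number of included studies in that review, and it shows one plausible sensitivity variant that uses only indexed studies as the denominator. The counts, the field names (`retrieved`, `included`, `indexed`), and the quartile method are hypothetical choices for illustration, not the study's data or its exact statistical procedure.

```python
from statistics import median, quantiles

# Hypothetical per-review counts (NOT data from the study).
# retrieved: included studies found by the generated search
# included:  all studies included in the review
# indexed:   included studies that are indexed in the searched databases
reviews = [
    {"retrieved": 5, "included": 10, "indexed": 10},
    {"retrieved": 8, "included": 10, "indexed": 8},
    {"retrieved": 9, "included": 12, "indexed": 12},
    {"retrieved": 3, "included": 4, "indexed": 4},
    {"retrieved": 10, "included": 10, "indexed": 10},
]


def success_rates(rows, denominator):
    """Per-review success rate in percent, skipping reviews with a zero denominator."""
    return [
        100.0 * row["retrieved"] / row[denominator]
        for row in rows
        if row[denominator] > 0
    ]


def median_iqr(rates):
    """Return (median, Q1, Q3); the 'inclusive' quartile method is an assumption."""
    q1, _, q3 = quantiles(rates, n=4, method="inclusive")
    return median(rates), q1, q3


# Main analysis: all included studies in the denominator.
main_summary = median_iqr(success_rates(reviews, "included"))

# Sensitivity variant: only indexed studies in the denominator.
sensitivity_summary = median_iqr(success_rates(reviews, "indexed"))

print("main:", main_summary)
print("sensitivity:", sensitivity_summary)
```

Summarizing per review, rather than pooling all studies into one fraction, keeps a single large review from dominating the result. That is presumably why a median and IQR are reported instead of one overall proportion.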

Results

Pilot testing achieved a retrieval rate of 41/49 (83.7%). In the main sample (1169 studies; median 13.5 studies per SR), the chatbot identified a median of 67.4% of included studies (IQR: 43.1%–88.4%). When limited to indexed studies (n = 1114), retrieval rose to 72.0% (IQR: 46.0%–92.5%). Lower performance was observed when outcomes were absent from the abstracts or interventions had many lexical variants.

Conclusions

A GAI-based chatbot can rapidly generate SR searches that identify a median of roughly 67%–72% of included studies, making it a useful starting point but not a replacement for expert-led approaches. Integration of librarian expertise, structured prompts, and controlled vocabularies may improve performance. Further benchmarking and transparent reporting are needed to guide adoption.
