BMJ Open
Quality and efficiency of integrating customised large language model-generated summaries versus physician-written summaries: a validation study
September 4th 2025 at 15:50

By: Schoonbeek, R. C.; Workum, J. D.; Schuit, S. C. E.; Hoekman, A. H.; Mehri, T.; Doornberg, J. N.; van der Laan, T. P.; Bootsma-Robroeks, C. M. H. H. T.; on behalf of the Applied Artificial Intelligence in Healthcare Consortium; Aalderink; van den Berg; Be

Objectives

To compare the quality and time efficiency of physician-written summaries with customised large language model (LLM)-generated medical summaries integrated into the electronic health record (EHR) in a non-English clinical environment.

Design

Cross-sectional non-inferiority validation study.

Setting

Tertiary academic hospital.

Participants

52 physicians from 8 specialties at a large Dutch academic hospital participated, either in writing summaries (n=42) or evaluating them (n=10).

Interventions

Physician writers wrote summaries of 50 patient records. LLM-generated summaries were created for the same records using an EHR-integrated LLM. An independent, blinded panel of physician evaluators compared physician-written summaries to LLM-generated summaries.

Primary and secondary outcome measures

Primary outcome measures were completeness, correctness and conciseness (on a 5-point Likert scale). Secondary outcomes were preference and trust, and time to generate either the physician-written or LLM-generated summary.

Results

The completeness and correctness of LLM-generated summaries did not differ significantly from physician-written summaries. However, LLM summaries were less concise (3.0 vs 3.5, p=0.001). Overall evaluation scores were similar (3.4 vs 3.3, p=0.373), with 57% of evaluators preferring LLM-generated summaries. Trust in both summary types was comparable, and interobserver variability showed excellent reliability (intraclass correlation coefficient 0.975). Physicians took an average of 7 min per summary, while LLMs completed the same task in just 15.7 s.
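To put the reported timings in proportion, the arithmetic can be sketched as below. Only the two inputs (7 min and 15.7 s) come from the abstract; the function name `time_saving` and the derived figures are illustrative and are not reported by the study.

```python
def time_saving(physician_minutes: float, llm_seconds: float) -> tuple[float, float]:
    """Return (speed-up factor, seconds saved per summary).

    Hypothetical helper for illustration; it is not part of the study.
    """
    physician_seconds = physician_minutes * 60
    speedup = physician_seconds / llm_seconds
    saved_seconds = physician_seconds - llm_seconds
    return speedup, saved_seconds


# Inputs taken from the Results: 7 min per physician summary, 15.7 s per LLM summary.
speedup, saved_seconds = time_saving(7, 15.7)
print(f"speed-up: about {speedup:.1f}x, saved per summary: {saved_seconds:.1f} s")
```

On these averages the LLM is roughly 27 times faster, saving a little under 7 minutes per summary. The calculation covers generation time only and does not account for any time a physician might spend reviewing an LLM-generated summary.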

Conclusions

LLM-generated summaries are comparable to physician-written summaries in completeness and correctness, although slightly less concise. With a clear time-saving benefit, LLMs could help reduce clinicians’ administrative burden without compromising summary quality.


Procalcitonin to guide antibiotic use during the first wave of COVID-19 in English and Welsh hospitals: integration and triangulation of findings from quantitative and qualitative sources

By: Henley, J.; Brookes-Howell, L.; Howard, P.; Powell, N.; Albur, M.; Bond, S. E.; Euden, J.; Dark, P.; Grozeva, D.; Hellyer, T. P.; Hopkins, S.; Llewelyn, M.; Maboshe, W.; McCullagh, I. J.; Ogden, M.; Pallmann, P.; Parsons, H. K.; Partridge, D. G.; Shaw, D

Aim

To integrate the quantitative and qualitative data collected as part of the PEACH (Procalcitonin: Evaluation of Antibiotic use in COVID-19 Hospitalised patients) study, which evaluated whether procalcitonin (PCT) testing should be used to guide antibiotic prescribing and safely reduce antibiotic use among patients admitted to acute UK National Health Service (NHS) hospitals.

Design

Triangulation to integrate quantitative and qualitative data.

Setting and participants

Four data sources in 148 NHS hospitals in England and Wales including data from 6089 patients.

Method

A triangulation protocol was used to integrate three quantitative data sources (survey, organisation-level data and patient-level data: data sources 1, 2 and 3) and one qualitative data source (clinician interviews: data source 4) collected as part of the PEACH study. Analysis of data sources initially took place independently, and then, key findings for each data source were added to a matrix. A series of interactive discussion meetings took place with quantitative, qualitative and clinical researchers, together with patient and public involvement (PPI) representatives, to group the key findings and produce seven statements relating to the study objectives. Each statement and the key findings related to that statement were considered alongside an assessment of whether there was agreement, partial agreement, dissonance or silence across all four data sources (convergence coding). The matrix was then interpreted to produce a narrative for each statement.

Objective

To explore whether PCT testing safely reduced antibiotic use during the first wave of the COVID-19 pandemic.

Results

Seven statements were produced relating to the PEACH study objective. There was agreement across all four data sources for our first key statement, ‘During the first wave of the pandemic (01/02/2020-30/06/2020), PCT testing reduced antibiotic prescribing’. The second statement was related to this key statement, ‘During the first wave of the pandemic (01/02/2020-30/06/2020), PCT testing safely reduced antibiotic prescribing’. For this statement, partial agreement was found between data sources 3 (quantitative patient-level data) and 4 (qualitative clinician interviews). There were no data regarding safety from data sources 1 or 2 (quantitative survey and organisational-level data) to contribute to this statement. For statements three and four, ‘PCT was not used as a central factor influencing antibiotic prescribing’ and ‘PCT testing reduced antibiotic prescribing in the emergency department (ED)/acute medical unit (AMU)’, there was agreement between data source 2 (organisational-level data) and data source 4 (interviews with clinicians). The remaining two data sources (survey and patient-level data) contributed no data on these statements. For statement five, ‘PCT testing reduced antibiotic prescribing in the intensive care unit (ICU)’, there was disagreement between data sources 2 and 3 (organisational-level data and patient-level data) and data source 4 (clinician interviews). Data source 1 (survey) did not provide data on this statement. We therefore assigned dissonance to this statement. For statement six, ‘There were many barriers to implementing PCT testing during the first wave of COVID-19’, there was partial agreement between data source 1 (survey) and data source 4 (clinician interviews) and no data provided by the two remaining data sources (organisational-level data and patient-level data). For statement seven, ‘Local PCT guidelines/protocols were perceived to be valuable’, only data source 4 (clinician interviews) provided data. The clinicians expressed that guidelines were valuable, but as there were no data from the other three data sources, we assigned silence to this statement.
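The convergence coding described above can be sketched as a small decision rule. This is an illustration, not the PEACH protocol: the stance labels, the function name `convergence_code` and the exact thresholds are assumptions inferred from how the four codes are applied in the Results.

```python
from typing import Mapping, Optional

# Hypothetical stance labels; the abstract does not define a formal vocabulary.
SUPPORTS, PARTLY, CONTRADICTS = "supports", "partly", "contradicts"


def convergence_code(stances: Mapping[int, Optional[str]]) -> str:
    """Assign a convergence code to one statement.

    `stances` maps a data-source number to its stance on the statement,
    or to None when the source contributed no data.
    """
    reported = [s for s in stances.values() if s is not None]
    if len(reported) < 2:
        # Nothing to compare against, as with statement seven.
        return "silence"
    if CONTRADICTS in reported and len(set(reported)) > 1:
        # At least one source contradicts what another source found.
        return "dissonance"
    if len(set(reported)) == 1 and reported[0] != PARTLY:
        return "agreement"
    return "partial agreement"


# Statement one: all four sources agree.
print(convergence_code({1: SUPPORTS, 2: SUPPORTS, 3: SUPPORTS, 4: SUPPORTS}))
# Statement five: sources 2 and 3 disagree with source 4. The abstract does not
# say which side held which view, so the direction encoded here is arbitrary.
print(convergence_code({1: None, 2: SUPPORTS, 3: SUPPORTS, 4: CONTRADICTS}))
# Statement seven: only source 4 provided data.
print(convergence_code({1: None, 2: None, 3: None, 4: SUPPORTS}))
```

In the study itself the codes were assigned through discussion meetings among researchers and PPI representatives, so a rule like this can only approximate the outcome of that judgement.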

Conclusion

There was agreement between all four data sources on our key finding ‘during the first wave of the pandemic (01/02/2020-30/06/2020), PCT testing reduced antibiotic prescribing’. Data, methodological and investigator triangulation, and a transparent triangulation protocol give validity to this finding.

Trial registration number

ISRCTN66682918.

EARLYBIRD: catching the earliest changes of the bone and intervertebral discs in children at increased risk for scoliosis development with MRI - study protocol of a prospective observational cohort study

By: Lafranca, P. P. G.; Stempels, H. W.; de Reuver, S.; Houben, M. L.; Kok, J.; Kruyt, M. C.; Castelein, R. M.; Seevinck, P. R.; van der Velden, T.; Shcherbakova, Y. M.; Ito, K.; Schlösser, T. P. C.

Introduction

Adolescent idiopathic scoliosis (AIS) is an acquired deformity that develops in 2–4% of otherwise healthy children during adolescent growth, substantially reducing their quality of life and creating a life-long burden of disease. Despite many years of dedicated research, the cause and mechanism of AIS are still unknown and no effective curative treatments are available for children suffering from this spinal and chest deformity. To date, all etiological studies focused on children with an already established scoliosis. EARLYBIRD aims to uncover the earliest pathoanatomical changes in AIS, by studying longitudinal spinal growth in children at increased risk for scoliosis development with MRI, starting before adolescence.

Methods and analysis

This prospective observational cohort study will follow two groups: 60 adolescent girls (8–10 years old) who have an older sibling or parent diagnosed with AIS (cohort 1) and 60 adolescents with 22q11.2 deletion syndrome, a genetic microdeletion associated with 50% scoliosis prevalence (cohort 2). Data collection will be completely radiation-free and occur at baseline and yearly during adolescence up to 15 years of age in girls and up to 16 in boys. A comprehensive physical examination, a dedicated spine and chest MRI as well as a standing three-dimensional (3-D) spinal ultrasound will be obtained at each time point. The main parameter will be the longitudinal changes in segmental axial rotation during growth in subjects who do and do not develop AIS. Secondary endpoints are longitudinal changes in 3-D morphology of the bone and intervertebral discs (IVDs) during normal spinal development and during scoliosis development, determining biomarkers for bone growth, implementing radiation-free imaging methods for spinal monitoring in adolescent patients at risk for scoliosis development and using these for spinal skeletal maturity and patient-specific spinal biomechanical analyses.

Ethics and dissemination

This protocol has been approved by the Medical Ethics Committee NedMed and is registered on clinicaltrials.gov (NCT05924347). Written informed consent will be obtained from all parents/legal representatives. Key findings will be disseminated via peer-reviewed journals and presentation at conferences. This study is funded by the European Research Council.

June 27th 2025 at 02:27

BMJ Open
Comparison of non-invasive and fluorescein tear film break-up time in a 65-year-old Norwegian population: a cross-sectional study
April 11th 2025 at 03:59

By: Tashbayev, B.; Badian, R. A.; Chen, X.; Vitelli, V.; Lagali, N.; Dartt, D.; Hove, L. H.; Jensen, J. L.; Utheim, T. P.

Objectives

Measurement of tear film stability is central in dry eye disease (DED) diagnosis. In this study, we aimed to compare the performance of two methods of tear film stability measurement: non-invasive tear break-up time (NIBUT) and fluorescein tear film break-up time (FTBUT).

Design

Cross-sectional study.

Setting and participants

The study involved 132 65-year-old inhabitants of the Oslo region who were not seeking ophthalmic care.

Interventions

The participants underwent a battery of DED tests, including NIBUT measured on the Oculus Keratograph 5M and a traditional method using fluorescein drops (FTBUT). The Oculus Keratograph 5M measures two types of NIBUT: the appearance time of the first dry spot (NIBUT_First) and the average break-up time (NIBUT_Avg).

Results

Of the participants, 74 (56%) were female and 58 (44%) were male. Subjects presented with varying degrees of DED signs and symptoms. Mean values of NIBUT_First and FTBUT from all the participants were significantly different (6.2±4.9 s vs 8.6±6.2 s, p<…). NIBUT_First was also compared with NIBUT_Avg (6.2±4.9 s vs 8.3±5.5 s, p<…), and FTBUT with NIBUT_Avg (8.6±6.2 s vs 8.3±5.5 s, p=0.655). Receiver operating characteristic curve analysis was performed to compare NIBUT and FTBUT with regard to other clinical tests (Ocular Surface Disease Index, ocular surface staining, blink interval, eye redness, corneal sensitivity, lid debris, Schirmer I test, tear osmolarity, meibum quality, meibum expressibility, lid hyperaemia, tear meniscus height, irregular lid margin, conjunctival hyperaemia, margin telangiectasia, lipid layer and meibomian gland drop-out). While FTBUT demonstrated results with an area under the curve >0.6, neither NIBUT_First nor NIBUT_Avg showed significant results.
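For readers unfamiliar with the area under the ROC curve (AUC), a minimal sketch of how it can be computed follows. The function `auc` and all break-up times below are made up for illustration; the abstract reports neither individual measurements nor the software used.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case, counting ties as half.
    Assumes both lists are non-empty.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Invented break-up times in seconds, NOT study data.
abnormal_group = [3.0, 5.0, 9.0]   # eyes flagged as abnormal by some reference test
normal_group = [4.0, 8.0, 12.0]    # eyes flagged as normal by the same test

# A shorter break-up time indicates a less stable tear film, so the times are
# negated to make a higher score mean "more likely abnormal".
example_auc = auc([-t for t in abnormal_group], [-t for t in normal_group])
print(f"AUC = {example_auc:.2f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is why values only slightly above 0.6 indicate weak discriminative ability.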

Conclusion

NIBUT_First was shorter than FTBUT. Low correlation between NIBUT and FTBUT indicates that these diagnostic tests are not interchangeable. Other DED tests had correlation, though low, while NIBUT did not demonstrate correlation.

