To develop and validate a polycystic ovary syndrome (PCOS) case definition using administrative health data sources.
A validation study.
An outpatient gynaecology clinic at a secondary care centre in Calgary, Alberta, Canada.
We reviewed 3951 electronic health records of women aged 18–45 years who presented to a gynaecology clinic in Calgary, Canada, between January 2014 and December 2019. Participants were excluded if they were biologically male, were pregnant at the time of the consultation, did not meet the date criteria or had a missing consultation note. We identified 180 patients with PCOS using the Rotterdam criteria. The chart data were linked to the Practitioner Claims database and the Discharge Abstract Database by personal health number.
Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of 68 case definitions for PCOS were estimated. Case definition performance was graded.
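As a minimal sketch of how these four measures are derived, the Python snippet below computes them from a 2×2 table that compares a case definition against the chart-review reference standard. The class name, function name and counts are illustrative assumptions, not data from this study, and the function assumes that no denominator is zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table: case definition result vs chart-review reference standard."""
    tp: int  # flagged by the case definition, PCOS on chart review
    fp: int  # flagged by the case definition, no PCOS on chart review
    fn: int  # not flagged, PCOS on chart review
    tn: int  # not flagged, no PCOS on chart review


def validity_metrics(c: ConfusionCounts) -> dict:
    """Return the four validity measures as proportions (0 to 1)."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Hypothetical counts for illustration only (not the study's data).
example = ConfusionCounts(tp=40, fp=10, fn=60, tn=890)
metrics = validity_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Sensitivity and specificity depend only on the case definition, whereas PPV and NPV also depend on the prevalence of PCOS in the validation cohort.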
Of the 68 case definitions tested, none had high validity. The best performing case definitions were: (1) ≥3 instances of International Classification of Diseases, Ninth Revision (ICD-9) code 256.4 (polycystic ovaries) with exclusion codes (sensitivity 23.89%, specificity 99.59%, PPV 74.14%, NPV 96.35%) and (2) ICD-9 codes 626.X (irregular menstruation) and 704.1 (hirsutism) together with ≥3 instances of code 256.4, with exclusion codes (sensitivity 2.78%, specificity 99.97%, PPV 83.33%, NPV 95.40%).
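A case definition of the first kind above (≥3 instances of code 256.4 with exclusion codes) could be applied to linked claims data roughly as sketched below. The record layout, the function name and the exclusion codes are hypothetical placeholders, because the abstract does not list the exclusion codes. The sketch also assumes that each claim row counts as one instance and that any exclusion code removes the patient.

```python
from collections import Counter

TARGET_CODE = "256.4"   # ICD-9 code for polycystic ovaries
MIN_INSTANCES = 3       # threshold used in the case definition
# Hypothetical placeholders: the actual exclusion codes are not given here.
EXCLUSION_CODES = {"EXCL-A", "EXCL-B"}


def patients_meeting_definition(claims):
    """claims: iterable of (patient_id, icd9_code) pairs, one per claim row."""
    target_counts = Counter()
    excluded = set()
    for patient_id, code in claims:
        if code == TARGET_CODE:
            target_counts[patient_id] += 1
        elif code in EXCLUSION_CODES:
            excluded.add(patient_id)
    # Keep patients who reach the threshold and carry no exclusion code.
    return {
        pid for pid, n in target_counts.items()
        if n >= MIN_INSTANCES and pid not in excluded
    }


# Illustrative claims only (not study data).
claims = (
    [("p1", "256.4")] * 3                          # meets the threshold
    + [("p2", "256.4")] * 2                        # too few instances
    + [("p3", "256.4")] * 3 + [("p3", "EXCL-A")]   # removed by exclusion code
    + [("p4", "626.0")]                            # no target code
)
flagged = patients_meeting_definition(claims)
print(sorted(flagged))
```

Raising the instance threshold or adding required codes tends to improve PPV at the cost of sensitivity, which matches the pattern reported for the two definitions.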
We identified several case definitions for PCOS of moderate validity with high PPV (>70%) for case ascertainment in PCOS research in jurisdictions with similar administrative health data. These case definitions are limited by low sensitivity, which should be considered when interpreting research findings.