Findings demonstrate the utility of machine learning in aiding clinical risk stratification within a complex patient cohort.
Unfortunately, a substantial proportion of pregnancies in patients with systemic lupus erythematosus (SLE) are complicated by an adverse pregnancy outcome (APO), such as preterm delivery, intrauterine growth restriction or foetal mortality. Given the high prevalence of APOs in this group, there has been considerable interest in predicting which patients are at greatest risk, to permit enhanced observation and intervention. The EUREKA algorithm was developed to predict obstetric risk in patients with different subsets of antiphospholipid antibodies and generated significant discourse.1 More recently, machine learning (ML) methodology has been applied by Fazzari et al to a large observational cohort (PROMISSE) to identify additional predictors of APO.2
The PROMISSE cohort enrolled 385 pregnant women with mild to moderate SLE, both with and without antiphospholipid antibody positivity. Data on pregnancy outcomes were collected between 2003 and 2013 from 9 North American sites. Exclusion criteria included a daily prednisone dose >20 mg, a urinary protein-to-creatinine ratio (PCR) >1000, serum creatinine >1.2 mg/dL, type 1 or 2 diabetes mellitus, or systemic hypertension.
Previous work in this cohort has linked increased levels of the complement activation products Bb and sC5b-9 to higher rates of APOs.3 Fazzari et al have now applied several ML approaches to the PROMISSE cohort and compared them with logistic regression modelling to identify predictors of APO in SLE patients.2
The approaches trialled were least absolute shrinkage and selection operator (LASSO), random forest, neural network, support vector machines with a radial basis function kernel (SVM-RBF), gradient boosting, and SuperLearner. These were compared using the area under the receiver operating characteristic curve (AUROC). Forty-one predictors assessed during routine care of patients with SLE were used to build the models.
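For readers less familiar with the metric, the AUROC can be read as the probability that a model assigns a higher predicted risk to a randomly chosen patient with an APO than to a randomly chosen patient without one. The sketch below illustrates that definition only; it is not the code used by Fazzari et al. The function name `auroc`, the two "models" and all values are hypothetical toy data rather than PROMISSE results.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a randomly chosen positive case scores above a
    randomly chosen negative case, counting ties as one half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    # Compare every positive case with every negative case.
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (not PROMISSE values): 1 = APO, 0 = no APO.
outcomes = [1, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical predicted risks from two competing models.
model_a = [0.9, 0.2, 0.7, 0.4, 0.1, 0.35, 0.3, 0.5]
model_b = [0.6, 0.5, 0.8, 0.7, 0.2, 0.3, 0.4, 0.1]

auc_a = auroc(outcomes, model_a)
auc_b = auroc(outcomes, model_b)
print(f"model A AUROC = {auc_a:.3f}, model B AUROC = {auc_b:.3f}")
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. A numerical difference between two models, as in this toy comparison, does not by itself indicate a statistically significant difference; that requires a formal test or confidence intervals.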
Fazzari et al identified several risk factors for APO, including high disease activity, lupus anticoagulant positivity, thrombocytopenia, and antihypertensive use. When comparing AUROC, SuperLearner had the numerically highest area under the curve (AUC) (0.78). However, this was not significantly different from that of LASSO, SVM-RBF, or random forest (AUC 0.77 in all cases).
Weaknesses of the PROMISSE cohort are its exclusion of SLE patients with high disease activity and of those with a blood pressure >140/90 mmHg. Additionally, the proportion of patients with APOs within the cohort was low (18.4%), likely related in part to the stringent exclusion criteria. A recent retrospective Portuguese study, which did not exclude patients with high disease activity or lupus nephritis, identified a far higher rate of APO (41.4%) in its SLE cohort.4 Indeed, Ntali et al recently demonstrated lower rates of APOs in SLE patients with low disease activity in a prospective observational study.5 Application of these models in a cohort with a higher disease burden would therefore be desirable.
This work demonstrates the utility of ML in aiding clinical risk stratification within a complex patient cohort. The use of standard clinical variables and the comparison of several ML techniques are substantial strengths of this work. However, further validation in external cohorts is desirable. The application of ML methodology to risk stratification within SLE may provide greater clarity in a heterogeneous patient cohort. Additionally, in the future, similar methodological approaches could be trialled across the autoimmune connective tissue disease spectrum to provide better prognostic information to patients at diagnosis, irrespective of their diagnostic label.