Metabolomics Analysis-Based Machine Learning for Endometrial Cancer Diagnosis: Integration of Biomarker Discovery and Explainable Artificial Intelligence
1Department of Biostatistics and Medical Informatics, Inonu University Faculty of Medicine, Malatya, Türkiye
2Rectorate Unit, Adıyaman University, Adıyaman, Türkiye
J Clin Pract Res 2025; 47(5): 503-511 DOI: 10.14744/cpr.2025.58891

Abstract

Objective: Endometrial cancer (EC) is the most frequent gynecological malignancy in women worldwide. This study aims to develop a predictive model that integrates machine learning (ML) with explainable artificial intelligence (XAI), using metabolomics panel data to discover significant biomarkers of EC.
Materials and Methods: Metabolomics and XAI were applied to identify diagnostic biomarkers for EC. A total of 191 EC cases and 204 controls were analyzed using mass spectrometry. The ML models included Random Forest, BaggedCART, LightGBM, Adaptive Boosting, and Extreme Gradient Boosting; SHapley Additive exPlanations (SHAP) was used as the XAI technique to interpret model predictions.
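To illustrate the idea behind SHAP, the following is a minimal sketch of exact Shapley value attribution computed by brute-force subset enumeration. It is not the authors' pipeline: the function names, the toy model, and the all-zero baseline are hypothetical, chosen only for illustration. In practice, SHAP implementations rely on faster model-specific algorithms (for example, for tree ensembles) rather than enumerating every subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating feature subsets.

    Features outside a subset are replaced by their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(subset):
        # Model output when only the features in `subset` take their real values.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for subsets of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                s = set(s)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical scoring function with one interaction term (illustration only).
    return 2.0 * z[0] + 1.0 * z[1] + 0.5 * z[0] * z[2]


phi = shapley_values(toy_model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
```

The attributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the property that makes per-feature contributions comparable across features.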
Results: Statistically significant differences (adjusted p<0.05) were found in 25 metabolites. The effect sizes (ES) of m/z=219.125 (ES=1.516), m/z=672.6961 (ES=0.913), and m/z=203.1564 (ES=0.839) were notably large, suggesting strong discriminatory ability. These metabolites are involved in lipid dysregulation, steroid hormone pathways, and oxidative stress, reflecting cancer-specific metabolic reprogramming. The ML models, particularly LightGBM, demonstrated high accuracy and good calibration. After the models were trained on the final feature set, SHapley Additive exPlanations (SHAP) analysis identified m/z=219.125, m/z=672.6961, and m/z=127.0769 as the top contributing features, consistent with their biological relevance to EC pathogenesis.
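The abstract does not state which effect size measure or which multiple-testing correction was used. As a hedged sketch only, the code below assumes Cohen's d with a pooled standard deviation and the Benjamini-Hochberg adjustment, two common choices for this kind of case-control comparison. The function names and the example numbers are hypothetical and are not taken from the study data.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(cases, controls):
    """Standardized mean difference using the pooled sample variance (assumed ES measure)."""
    n1, n2 = len(cases), len(controls)
    pooled_var = ((n1 - 1) * variance(cases) + (n2 - 1) * variance(controls)) / (n1 + n2 - 2)
    return (mean(cases) - mean(controls)) / sqrt(pooled_var)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order (assumed correction)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up intensities for one metabolite and made-up raw p-values for four metabolites.
d = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [p < 0.05 for p in adj]
```

Under these assumptions, a metabolite would be reported as significant when its adjusted p-value falls below 0.05, and its effect size would be read alongside that result rather than in place of it.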
Conclusion: This study suggests non-invasive biomarkers for the early detection and screening of EC, highlighting the heterogeneity of metabolic adaptation in EC and the need for multi-omics approaches to understand disease mechanisms. Limitations include cohort diversity and reliance on tandem mass spectrometry. Nonetheless, these findings represent a step forward in precision oncology.