GraphSAGE-based approach for age-specific multi-omics biomarker identification in bladder cancer
Abstract
Bladder cancer is a highly prevalent malignancy with substantial morbidity and
mortality, emphasizing the urgent need for early detection and personalized treatment
strategies. Although recent advances in cancer genomics have enhanced our
understanding of tumor biology, the role of age-related genomic variations in bladder cancer
progression remains largely unexplored. In this study, we present a novel framework
that combines multi-omics data integration with Graph Neural Networks (GNNs)
to identify age-specific biomarkers associated with bladder cancer prognosis. We
integrate copy number alterations (CNA), DNA methylation, and mRNA expression
profiles into graph-based representations, where nodes denote genomic features and
edges encode molecular interactions. Unlike conventional statistical or machine
learning approaches, our method incorporates age both as a stratification factor and as a
graph-level feature, enabling the model to learn distinct molecular signatures across
different patient age groups. Using survival outcomes, we determined 64 years as the
optimal threshold for age stratification: mortality differed significantly between
patients aged ≤64 years (30.46%) and those aged >64 years (51.74%),
highlighting the prognostic value of age in bladder cancer. To enhance model
interpretability and performance, we implemented a robust feature selection pipeline
involving variance thresholding, ANOVA F-scores, L1 regularization, and Recursive
Feature Elimination with Cross-Validation (RFECV). Among several models tested,
GraphSAGE consistently achieved the highest accuracy, F1-score, and AUC,
demonstrating the effectiveness of graph-based learning in capturing complex biological
relationships. Furthermore, SHAP (SHapley Additive exPlanations) analysis revealed
key age-associated biomarkers such as SNRPN, LINC01091, and DHX36, which are
strongly implicated in patient survival and may inform future therapeutic
targeting. This study introduces a comprehensive, age-aware graph learning framework for
biomarker discovery in bladder cancer, offering a powerful tool for advancing
personalized diagnosis, prognosis, and treatment planning. Beyond bladder cancer, this
methodology has the potential to be generalized to other cancer types where age
significantly influences disease trajectory, thereby contributing to the broader field of
precision oncology. By bridging age-specific genomic variation with multi-modal data
and explainable machine learning, our approach opens new avenues for developing
clinically actionable insights and enhancing patient-specific management strategies in
oncology.
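
The graph formulation summarized above (nodes as genomic features, edges as molecular interactions, age as a graph-level feature) can be illustrated with a minimal NumPy sketch of one GraphSAGE layer using mean aggregation. This is not the authors' implementation. The function names `sage_mean_layer` and `graph_embedding`, the toy four-gene graph, the random weights, the embedding width of 8, and the mean-pooling readout are all assumptions made for illustration; only the three omics channels and the 64-year threshold come from the abstract.

```python
import numpy as np


def sage_mean_layer(h, neighbors, w_self, w_neigh):
    """One GraphSAGE layer: mean-aggregate neighbors, transform, ReLU, L2-normalize."""
    agg = np.zeros_like(h)
    for v, nbrs in enumerate(neighbors):
        if nbrs:  # nodes without neighbors keep an all-zero neighbor summary
            agg[v] = h[nbrs].mean(axis=0)
    # Equivalent to W @ concat(h_v, mean of neighbor features), split into two blocks.
    out = np.maximum(h @ w_self + agg @ w_neigh, 0.0)
    norms = np.linalg.norm(out, axis=1, keepdims=True)
    return out / np.maximum(norms, 1e-12)


def graph_embedding(h, neighbors, w_self, w_neigh, age, threshold=64):
    """Mean-pool node embeddings and append the age group as a graph-level feature."""
    z = sage_mean_layer(h, neighbors, w_self, w_neigh)
    age_flag = 1.0 if age > threshold else 0.0  # 1.0 for patients older than the threshold
    return np.concatenate([z.mean(axis=0), [age_flag]])


# Hypothetical toy patient graph: 4 gene nodes, each with 3 omics values
# (CNA, DNA methylation, mRNA expression). All numbers are made up.
features = np.array([
    [0.0, 0.2, 1.5],
    [1.0, 0.8, 0.3],
    [-1.0, 0.5, 2.1],
    [0.0, 0.1, 0.7],
])
neighbors = [[1, 2], [0], [0, 3], [2]]  # undirected edges 0-1, 0-2, 2-3

# Random weights stand in for parameters that would normally be learned.
rng = np.random.default_rng(0)
w_self = rng.normal(size=(3, 8))
w_neigh = rng.normal(size=(3, 8))

emb_young = graph_embedding(features, neighbors, w_self, w_neigh, age=58)
emb_old = graph_embedding(features, neighbors, w_self, w_neigh, age=71)
print(emb_young.shape)  # (9,)
```

In this sketch the two embeddings differ only in their last entry, the age-group flag. In a trained model the weights would be fitted to survival labels, and the resulting graph-level vector would feed a downstream classifier.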