Causal discovery and treatment effect modeling in breast cancer

dc.contributor.advisor: Alkhateeb, Abedalrhman
dc.contributor.author: Krikun, Elena
dc.contributor.committeemember: Bin Ahmed, Saad
dc.contributor.committeemember: Yaseen, Maysa
dc.date.accessioned: 2026-05-12T12:55:54Z
dc.date.created: 2026
dc.date.issued: 2026
dc.description: Thesis is embargoed until May 15 2027.
dc.description.abstract: Modeling breast cancer outcomes remains challenging because of extreme molecular heterogeneity and the inability of associative models, including those developed through traditional machine learning, to support counterfactual, intervention-based clinical reasoning. Building on recent advances in causal feature selection, multiomics variable selection, and individual treatment effect estimation, this thesis proposes a hybrid pipeline within a unified computational multiomics framework that integrates high-dimensional data with causal modeling to produce interpretable precision oncology models that extend beyond risk prediction. The proposed pipeline was developed using the TCGA-BRCA cohort as the discovery set and validated on the independent retrospective METABRIC cohort to assess transportability. To address the curse of dimensionality, the framework applies Markov Blanket-based local causal discovery across seven data modalities and reduces more than 600,000 initial features to a sparse and stable causal core. This causal representation is then used for survival modeling (C-index = 0.8085, 5-year AUC = 0.8676) and individual treatment effect (ITE) estimation for chemotherapy, hormone therapy, and targeted therapy. External validation on METABRIC achieved a C-index of 0.7200 and a 5-year AUC of 0.7639, indicating moderate but clear transportability across cohorts and assay platforms. The final causal core confirmed the integration of clinical, proteomic, and epigenetic signals, and identified a long non-coding RNA as a structurally relevant driver. The treatment-effect stage used treatment-specific arm definitions reconstructed from clinical records together with a robustness-oriented validation protocol. Chemotherapy showed the strongest and most stable beneficial treatment effect, most notably in the TNBC subgroup, where treatment-effect estimates remained consistently protective across estimators and overlap-adjusted variants.
Hormone-therapy estimates showed a consistently protective direction in receptor-positive subgroup analyses, although the magnitude of the effect was attenuated under stricter overlap control, indicating residual confounding and limited positivity in the observational setting. Targeted therapy also showed a protective direction under most evaluated techniques, but given the very small number of treated patients and partial estimator disagreement, these effect estimates should be interpreted as exploratory.
dc.identifier.uri: https://knowledgecommons.lakeheadu.ca/handle/2453/5606
dc.language.iso: en
dc.title: Causal discovery and treatment effect modeling in breast cancer
dc.type: Thesis
etd.degree.discipline: Computer Science
etd.degree.grantor: Lakehead University
etd.degree.level: Master
etd.degree.name: Master of Computer Science
