Causal discovery and treatment effect modeling in breast cancer
| dc.contributor.advisor | Alkhateeb, Abedalrhman | |
| dc.contributor.author | Krikun, Elena | |
| dc.contributor.committeemember | Bin Ahmed, Saad | |
| dc.contributor.committeemember | Yaseen, Maysa | |
| dc.date.accessioned | 2026-05-12T12:55:54Z | |
| dc.date.created | 2026 | |
| dc.date.issued | 2026 | |
| dc.description | Thesis is embargoed until May 15 2027. | |
| dc.description.abstract | Modeling breast cancer outcomes remains challenging because of extreme molecular heterogeneity and the inability of associative models, including those developed through traditional machine learning, to support counterfactual, intervention-based clinical reasoning. Building on recent advances in causal feature selection, multiomics variable selection, and individual treatment effect estimation, this thesis proposes a hybrid pipeline within a unified computational multiomics framework that integrates high-dimensional data with causal modeling to produce interpretable precision oncology models that extend beyond risk prediction. The proposed pipeline was developed using the TCGA-BRCA cohort as the discovery set and validated on the independent retrospective METABRIC cohort to assess transportability. To address the curse of dimensionality, the framework applies Markov Blanket-based local causal discovery across seven data modalities and reduces more than 600,000 initial features to a sparse and stable causal core. This causal representation is then used for survival modeling (C-index = 0.8085, 5-year AUC = 0.8676) and individual treatment effect (ITE) estimation for chemotherapy, hormone therapy, and targeted therapy. External validation on METABRIC achieved a C-index of 0.7200 and a 5-year AUC of 0.7639, indicating moderate but clear transportability across cohorts and assay platforms. The final causal core confirmed the integration of clinical, proteomic, and epigenetic signals, and identified a long non-coding RNA as a structurally relevant driver. The treatment-effect stage used treatment-specific arm definitions reconstructed from clinical records together with a robustness-oriented validation protocol. Chemotherapy showed the strongest and most stable beneficial treatment effect, most notably in the TNBC subgroup, where treatment-effect estimates remained consistently protective across estimators and overlap-adjusted variants. Hormone-therapy estimates showed a consistently protective direction in receptor-positive subgroup analyses, although the magnitude of the effect was attenuated under stricter overlap control, indicating residual confounding and limited positivity in the observational setting. Targeted therapy also showed a protective direction under most evaluated techniques, but given the very small number of treated patients and partial estimator disagreement, these effect estimates should be interpreted as exploratory. | |
| dc.identifier.uri | https://knowledgecommons.lakeheadu.ca/handle/2453/5606 | |
| dc.language.iso | en | |
| dc.title | Causal discovery and treatment effect modeling in breast cancer | |
| dc.type | Thesis | |
| etd.degree.discipline | Computer Science | |
| etd.degree.grantor | Lakehead University | |
| etd.degree.level | Master | |
| etd.degree.name | Master of Computer Science |