Uncertainty-guided Transformer learning for trustworthy medical image classification
Abstract
Reliable medical image classification is fundamental to the safe use of deep learning in clinical
decision support. State-of-the-art deep learning models, such as medical vision Transformers,
perform well in medical image segmentation, yet they often produce unreliable probability
estimates and lack built-in mechanisms for handling uncertainty or providing interpretability.
These issues become especially problematic when inputs are ambiguous or datasets are
imbalanced, both of which are common in real-world clinical settings.
This study extends the architecture of the Medical Transformer (MedFormer), a hierarchical
medical vision Transformer, with uncertainty- and prototype-guided learning to improve
trustworthiness without reducing representational capacity. The model estimates per-token
evidential uncertainty via a Dirichlet-based approach, enabling explicit measurement and
spatial localisation of uncertainty. Rather than serving only as a post-hoc diagnostic,
uncertainty actively guides feature routing and refinement during training, suppressing
unreliable updates in uncertain regions. Additionally, prototype-based learning is incorporated
to maintain a structured, class-specific geometry in the embedding space and to support
similarity-based, interpretable decisions grounded in visual patterns.
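The prototype-based decision rule can likewise be sketched in a few lines. The names and shapes here are assumptions for illustration only: each class holds a learned prototype vector, and a sample's logit for class k is its negative squared distance to prototype k, so every prediction is grounded in similarity to a class-representative pattern:

```python
import numpy as np

def prototype_logits(embedding, prototypes):
    """Similarity-based logits from class prototypes.

    prototypes: (num_classes, dim) array of learned class centres.
    The logit for class k is the negative squared Euclidean distance
    to prototype k, so the nearest prototype yields the top class.
    """
    dists = ((prototypes - embedding) ** 2).sum(axis=1)
    return -dists  # higher logit = closer prototype

prototypes = np.array([[1.0, 0.0],   # class 0 prototype
                       [0.0, 1.0]])  # class 1 prototype
z = np.array([0.9, 0.1])             # embedding near class 0
pred = int(np.argmax(prototype_logits(z, prototypes)))
```

Because decisions reduce to distances in embedding space, the nearest prototype can be shown to a clinician as the visual pattern that drove the prediction.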
The proposed model was evaluated on multiple medical imaging modalities, including mammography,
breast ultrasound, brain tumour MRI, and breast histopathology, providing a thorough
evaluation across diverse dataset contexts. Experiments show that, while classification accuracy
improvements vary across datasets, the method reliably improves calibration, reduces
overconfidence, and enhances selective prediction compared with the baseline MedFormer.
These results indicate that integrating uncertainty estimation and prototype-based regularisation
into Transformer-based representation learning can substantially improve the reliability and
explainability of medical image classifiers, supporting the development of trustworthy AI
systems for clinical use.
Accuracy on the selected benchmark datasets shows larger improvements in modalities with
clearer visual cues and more modest changes in mammography, reflecting its inherent ambiguity.
Overall, uncertainty-guided routing and prototype-based learning improve trustworthiness
without sacrificing discriminative performance.
Description
Thesis embargoed until March 27, 2027.
