Enhancing semantic segmentation: architectural innovations and strategies for label-efficient learning

Suresh, Tharrengini

Date

2025

Abstract

Semantic segmentation is a fundamental component of modern computer vision applications. Although supervised learning models achieve state-of-the-art performance in this domain, they rely heavily on large volumes of labeled data, which are expensive and time-consuming to produce. This research therefore aims to develop enhanced supervised semantic segmentation models that balance accuracy and data efficiency for visual perception tasks in autonomous driving environments. The thesis is organized into two phases. The first phase investigates a dual-network architecture in which an auxiliary boundary detection network is incorporated into the primary segmentation framework to mitigate pixelation artifacts at object boundaries in multiclass segmentation of complex scenes. The experimental findings demonstrate the importance of designing unified segmentation models whose architectural enhancements extract richer feature representations for improved performance. The second phase builds on these insights and focuses on developing an efficient deep learning model with attention mechanisms and multi-scale feature refinement. The proposed method introduces a novel depth-wise, point-wise feature pyramid module that extracts information-rich spatio-semantic context from early and deep feature representations, improving model efficacy. Extensive experiments on widely used benchmark datasets validate the effectiveness of the proposed models, which achieve competitive performance while offering improved computational efficiency relative to baseline approaches. The findings highlight that strategically balancing resource utilization with architectural innovation can yield strong performance while minimizing annotation demands and environmental impact. This research sets a valuable precedent for building competitive, resource-aware vision systems suited to constrained application settings.
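
As a rough illustration of why a depth-wise, point-wise design can improve computational efficiency, the sketch below compares weight counts for a standard convolution and for the generic depth-wise plus point-wise factorization. It is not the thesis's module, whose details are not given in this record; the function names and the layer sizes (256 channels, 3 x 3 kernel) are assumptions chosen only for illustration.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k):
    """Weights in a depth-wise k x k convolution followed by a 1 x 1 point-wise one."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the thesis.
c_in, c_out, k = 256, 256, 3
std = standard_conv_params(c_in, c_out, k)
sep = separable_conv_params(c_in, c_out, k)
print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
```

Under these assumed sizes the factorized layer needs several times fewer weights than the standard one, which is the general motivation for such designs; the savings reported in the thesis itself depend on its actual architecture.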

URI

https://knowledgecommons.lakeheadu.ca/handle/2453/5503

Collections

Electronic Theses and Dissertations from 2009 [1738]