| dc.description.abstract | Agricultural image semantic segmentation plays a vital role in precision agriculture, enabling
accurate analysis of visual data to enhance crop management and optimize resource use. However,
achieving high segmentation accuracy while maintaining computational efficiency remains
challenging, particularly for real-time systems and edge devices. This thesis presents a two-phase
research effort toward an efficient and scalable segmentation framework for high-resolution
agricultural imagery. In the first phase, an effective model was developed using a novel Dual
Atrous Separable Convolution (DAS-Conv) module integrated into a DeepLabV3 backbone. The
DAS-Conv module optimizes the balance between dilation rates and padding size to enhance
contextual representation without extra computational cost, while a skip connection between
encoder and decoder stages improves fine-grained feature recovery. Despite its lightweight
design, the model achieved strong results on the Agriculture-Vision benchmark, demonstrating
over 66% higher efficiency than transformer-based state-of-the-art models. In the second
phase, the framework was extended to DAS-SK, which integrates Selective Kernel (SK) attention
into the DAS-Conv module to strengthen multi-scale feature learning and adaptability. The
enhanced Atrous Spatial Pyramid Pooling (ASPP) module captures both fine local structures and
global context, while a dual-backbone design (MobileNetV3-Large and EfficientNet-B3) further
improves representation and scalability. Across three benchmark datasets—LandCover.ai, VDD,
and PhenoBench—DAS-SK consistently demonstrates superior efficiency–accuracy trade-offs and
notable improvements over its predecessor, DAS. On LandCover.ai, DAS-SK achieves 86.25%
mIoU, surpassing DAS by 3.17 percentage points, with 10.68M parameters and 11.25 GFLOPs. Although
it uses slightly more parameters than DAS, the model remains far lighter than hybrid systems
such as Ensemble UNet and transformer models like SegFormer MiT-B2. DAS-SK
also achieves higher overall efficiency compared with DAS, demonstrating that the added SK
attention and dual-backbone design translate directly into improved segmentation quality. A
similar trend is observed on the VDD dataset. DAS-SK attains 79.45% mIoU, improving on
DAS by 2.25 percentage points, while operating with 10.68M parameters and 43.52 GFLOPs. Although the
smaller DAS backbone enables slightly higher FPS, DAS-SK delivers the best accuracy–efficiency
balance overall, with the highest efficiency score (9.12%), ahead of transformer
models whose parameter counts range from 27M to 234M. On the PhenoBench dataset,
DAS-SK again provides the highest performance, reaching 85.55% mIoU compared with 82.53% for
DAS (a gain of 3.02 percentage points). The computational cost remains moderate at 45.00 GFLOPs,
versus 25.23 GFLOPs for DAS, but the efficiency gain is substantial—10.09% for DAS-SK
versus 7.43% for DAS—highlighting the benefits of improved feature selection and multi-scale
fusion. Despite a modest computational increase (typically about 1.8× the GFLOPs of DAS),
DAS-SK consistently delivers mIoU gains of roughly 2–3 percentage points and markedly stronger
multi-scale feature discrimination than its predecessor, DAS. Combined with parameter counts that remain significantly
below those of modern transformer and hybrid models, DAS-SK offers a practical, lightweight,
and high-performing solution for real-time agricultural monitoring and remote sensing in
resource-limited environments, where accuracy, efficiency, and scalability are equally critical. | en_US |