Frequency-Based Deep Occlusion Awareness Instance Segmentation
Date
2026
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
MDPI
Access Rights
info:eu-repo/semantics/openAccess
Abstract
One major challenge for deep learning-based methods that detect target objects as bounding boxes is object occlusion: high degrees of occlusion significantly diminish instance segmentation accuracy. Complex-valued Fourier descriptors, however, can robustly represent object boundaries using minimal information. This study investigates how integrating Fourier descriptors, known for their strong representational capacity, with deep network models (UNet) that exhibit high generalization performance affects instance segmentation accuracy. Within the scope of the research, nine network models were designed around different strategies for utilizing frequency components. These variants fall into four strategy families: (i) UNet-style spectrum regression on fixed low-frequency windows (FUNet); (ii) magnitude-guided frequency selection and ROI construction (FUNet-Thr, FUNet-BBox); (iii) sequence models over tokenized FFT coefficients (BiLSTM Patch/Sorted); and (iv) encoder-only spectrum predictors of different depth and capacity (EncoderFFT1/2). To fairly evaluate the models' performance in segmenting objects subjected to disruptive factors (e.g., occlusion, blurring, noise), a specialized synthetic dataset was prepared, and the task was formulated as single-target (single-instance), single-class segmentation. This dataset, generated automatically from initial parameter values, contains images of objects moving at various speeds within a single frame. Among these models, the one termed FUNet, which relies on partial matching of central frequency components, achieved the highest segmentation accuracy despite the disruptive effects. Under the challenging Dataset 8 setting, the proposed FUNet achieved the highest overlap-based performance (Dice = 0.9329, IoU = 0.8842) among Attention U-Net, U-Net, and FourierNet, with statistically significant gains confirmed by paired per-image tests.
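The core idea behind the abstract, representing a closed object boundary with a small central window of complex Fourier coefficients, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the function names and the `n_keep` parameter are assumptions for illustration only.

```python
import numpy as np

def fourier_descriptors(contour, n_keep=8):
    """Complex-valued Fourier descriptors of a closed contour.

    contour: (N, 2) array of (x, y) boundary points in order.
    n_keep: number of low-frequency coefficients retained on each
            side of DC (a fixed central-frequency window).
    """
    z = contour[:, 0] + 1j * contour[:, 1]     # boundary as a complex signal
    coeffs = np.fft.fft(z)
    # Keep only the DC term plus the n_keep lowest positive and
    # negative frequencies; zero out the rest of the spectrum.
    mask = np.zeros(len(coeffs))
    mask[:n_keep + 1] = 1.0
    mask[-n_keep:] = 1.0
    return coeffs * mask

def reconstruct(coeffs):
    """Inverse FFT of the (windowed) descriptors back to (x, y) points."""
    z = np.fft.ifft(coeffs)
    return np.stack([z.real, z.imag], axis=1)

# A circle is carried entirely by one frequency, so even a tiny
# low-frequency window reconstructs it to machine precision.
t = np.linspace(0, 2 * np.pi, 256, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)
approx = reconstruct(fourier_descriptors(circle, n_keep=2))
max_err = np.max(np.abs(approx - circle))
print(max_err)  # near machine precision
```

Smooth natural boundaries behave similarly: most energy sits in the central (low-frequency) coefficients, which is what makes a fixed low-frequency window a compact and occlusion-robust boundary representation.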
Description
Keywords
frequency domain, Fourier transform, deep learning, segmentation
Source
Mathematics
WoS Q Value
Q1
Scopus Q Value
Q1
Volume
14
Issue
5