Frequency-Based Deep Occlusion Awareness Instance Segmentation

Guzel, Yasin; Aydin, Zafer; Talu, Muhammed Fatih

dc.contributor.author	Guzel, Yasin
dc.contributor.author	Aydin, Zafer
dc.contributor.author	Talu, Muhammed Fatih
dc.date.accessioned	2026-04-04T13:31:00Z
dc.date.available	2026-04-04T13:31:00Z
dc.date.issued	2026
dc.department	İnönü Üniversitesi
dc.description.abstract	Object occlusion is a major challenge for deep learning-based methods that detect target objects as bounding boxes, and high degrees of occlusion significantly reduce instance segmentation accuracy. Complex-valued Fourier descriptors, by contrast, can robustly represent object boundaries using minimal information. This study investigates how integrating Fourier descriptors, known for their strong representational capacity, with deep network models (UNet) that generalize well affects instance segmentation accuracy. Nine network models were designed based on different strategies for utilizing frequency components. These variants fall into four strategy families: (i) UNet-style spectrum regression on fixed low-frequency windows (FUNet), (ii) magnitude-guided frequency selection/ROI construction (FUNet-Thr, FUNet-BBox), (iii) sequence models over tokenized FFT coefficients (BiLSTM Patch/Sorted), and (iv) encoder-only spectrum predictors with different depth/capacity (EncoderFFT1/2). To fairly evaluate how well the models segment objects subjected to disruptive factors (e.g., occlusion, blurring, noise), a specialized synthetic dataset was prepared. The task is formulated as single-target (single-instance), single-class segmentation. The dataset, generated automatically from initial parameter values, contains images of objects moving at various speeds within a single frame. Among these models, FUNet, which relies on partial matching of central frequency components, achieved the highest segmentation accuracy despite the disruptive effects. Under the challenging Dataset 8 setting, the proposed FUNet achieved the highest overlap-based performance (Dice = 0.9329, IoU = 0.8842), ahead of Attention U-Net, U-Net, and FourierNet, with statistically significant gains confirmed by paired per-image tests.
dc.identifier.doi	10.3390/math14050792
dc.identifier.issn	2227-7390
dc.identifier.issue	5
dc.identifier.scopus	2-s2.0-105032775828
dc.identifier.scopusquality	Q1
dc.identifier.uri	https://doi.org/10.3390/math14050792
dc.identifier.uri	https://hdl.handle.net/11616/108519
dc.identifier.volume	14
dc.identifier.wos	WOS:001713739000001
dc.identifier.wosquality	Q1
dc.indekslendigikaynak	Web of Science
dc.indekslendigikaynak	Scopus
dc.language.iso	en
dc.publisher	MDPI
dc.relation.ispartof	Mathematics
dc.relation.publicationcategory	Article - International Peer-Reviewed Journal - Institutional Academic Staff
dc.rights	info:eu-repo/semantics/openAccess
dc.snmz	KA_WOS_20250329
dc.subject	frequency domain
dc.subject	Fourier transform
dc.subject	deep learning
dc.subject	segmentation
dc.title	Frequency-Based Deep Occlusion Awareness Instance Segmentation
dc.type	Article
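
The abstract refers to complex-valued Fourier descriptors as a compact boundary representation and reports Dice and IoU as overlap metrics. The following is a minimal illustrative sketch of those two ideas, not the authors' implementation: the function names (fourier_descriptors, reconstruct_low_freq, dice_iou) and the keep parameter are hypothetical, and the contour is assumed to be a closed, ordered (N, 2) array of boundary points.

```python
import numpy as np


def fourier_descriptors(contour_xy: np.ndarray) -> np.ndarray:
    """Complex FFT of a closed contour given as an (N, 2) array of (x, y) points."""
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]  # encode each point as x + iy
    return np.fft.fft(z)


def reconstruct_low_freq(descriptors: np.ndarray, keep: int) -> np.ndarray:
    """Rebuild the contour from coefficients whose |frequency index| <= keep."""
    n = descriptors.shape[0]
    k = np.arange(n)
    abs_freq = np.minimum(k, n - k)  # |frequency index| of each FFT bin
    kept = np.where(abs_freq <= keep, descriptors, 0)  # zero out higher frequencies
    z = np.fft.ifft(kept)
    return np.stack([z.real, z.imag], axis=1)


def dice_iou(pred: np.ndarray, target: np.ndarray) -> tuple[float, float]:
    """Dice and IoU for two binary masks of the same shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    # Assumption: two empty masks are treated as a perfect match.
    dice = 2.0 * inter / total if total else 1.0
    iou = inter / union if union else 1.0
    return float(dice), float(iou)


if __name__ == "__main__":
    theta = np.linspace(0.0, 2.0 * np.pi, 128, endpoint=False)
    ellipse = np.stack([3 + 2 * np.cos(theta), 4 + np.sin(theta)], axis=1)
    fd = fourier_descriptors(ellipse)
    approx = reconstruct_low_freq(fd, keep=1)
    print("max reconstruction error:", np.abs(approx - ellipse).max())
```

An ellipse is fully described by the DC term and the two first-order coefficients, so keep=1 recovers it up to floating-point error. More irregular boundaries need more coefficients, which is the trade-off the frequency-selection strategies in the abstract address.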

Collection

WoS-Indexed Publications Collection
Scopus-Indexed Publications Collection