Frequency-Based Deep Occlusion Awareness Instance Segmentation

dc.contributor.authorGuzel, Yasin
dc.contributor.authorAydin, Zafer
dc.contributor.authorTalu, Muhammed Fatih
dc.date.accessioned2026-04-04T13:31:00Z
dc.date.available2026-04-04T13:31:00Z
dc.date.issued2026
dc.departmentİnönü Üniversitesi
dc.description.abstractOne major challenge faced by deep learning-based methods that detect target objects in the form of bounding boxes is object occlusion. High degrees of occlusion significantly diminish the accuracy of instance segmentation. Nonetheless, complex-valued Fourier descriptors can robustly represent object boundaries using minimal information. This study investigated how integrating Fourier descriptors, renowned for their strong representational capacity, with deep network models (UNet) that exhibit high generalization performance affects instance segmentation accuracy. Nine network models were designed based on different strategies for utilizing frequency components. These variants fall into four strategy families: (i) UNet-style spectrum regression on fixed low-frequency windows (FUNet), (ii) magnitude-guided frequency selection/ROI construction (FUNet-Thr, FUNet-BBox), (iii) sequence models over tokenized FFT coefficients (BiLSTM Patch/Sorted), and (iv) encoder-only spectrum predictors with different depth/capacity (EncoderFFT1/2). To fairly evaluate the models' performance in segmenting objects subjected to disruptive factors (e.g., occlusion, blurring, noise), a specialized synthetic dataset was prepared. The task is formulated as single-target (single-instance), single-class segmentation. This dataset, automatically generated according to initial parameter values, contains images of objects moving at various speeds within a single frame. Among these models, FUNet, which relies on partial matching of central frequency components, achieved the highest segmentation accuracy despite the disruptive effects. Under the challenging Dataset 8 setting, the proposed FUNet achieved the highest overlap-based performance (Dice = 0.9329, IoU = 0.8842) compared with Attention U-Net, U-Net, and FourierNet, with statistically significant gains confirmed by paired per-image tests.
dc.identifier.doi10.3390/math14050792
dc.identifier.issn2227-7390
dc.identifier.issue5
dc.identifier.scopus2-s2.0-105032775828
dc.identifier.scopusqualityQ1
dc.identifier.urihttps://doi.org/10.3390/math14050792
dc.identifier.urihttps://hdl.handle.net/11616/108519
dc.identifier.volume14
dc.identifier.wosWOS:001713739000001
dc.identifier.wosqualityQ1
dc.indekslendigikaynakWeb of Science
dc.indekslendigikaynakScopus
dc.language.isoen
dc.publisherMDPI
dc.relation.ispartofMathematics
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Faculty Member
dc.rightsinfo:eu-repo/semantics/openAccess
dc.snmzKA_WOS_20250329
dc.subjectfrequency domain
dc.subjectFourier transform
dc.subjectdeep learning
dc.subjectsegmentation
dc.titleFrequency-Based Deep Occlusion Awareness Instance Segmentation
dc.typeArticle
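
The abstract's remark that complex-valued Fourier descriptors can represent object boundaries using minimal information can be illustrated with a minimal NumPy sketch. It is not the FUNet pipeline from the article: the function names (`fourier_descriptors`, `keep_low_frequencies`, `reconstruct`) and the ellipse test contour are assumptions made here purely for illustration.

```python
import numpy as np

def fourier_descriptors(contour_xy):
    """Complex Fourier descriptors of a closed contour given as an (N, 2) array."""
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]  # encode (x, y) as x + iy
    return np.fft.fft(z) / len(z)

def keep_low_frequencies(coeffs, k):
    """Zero every coefficient whose integer frequency index exceeds k in magnitude."""
    n = len(coeffs)
    freqs = np.fft.fftfreq(n, d=1.0 / n)  # integer frequency indices
    return np.where(np.abs(freqs) <= k, coeffs, 0)

def reconstruct(coeffs):
    """Invert the (scaled) descriptors back to an (N, 2) contour."""
    z = np.fft.ifft(coeffs * len(coeffs))
    return np.stack([z.real, z.imag], axis=1)

# Hypothetical example: an ellipse sampled at 256 boundary points.
t = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
ellipse = np.stack([3.0 * np.cos(t), 2.0 * np.sin(t)], axis=1)

low = keep_low_frequencies(fourier_descriptors(ellipse), k=1)
approx = reconstruct(low)
max_error = np.max(np.abs(approx - ellipse))
```

An ellipse is fully described by the two coefficients at frequencies +1 and -1, so the truncated reconstruction matches the original contour up to floating-point error; more irregular boundaries need a larger `k`.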
