A novel hybrid attention gate based on vision transformer for the detection of surface defects

dc.contributor.author: Uzen, Hueseyin
dc.contributor.author: Turkoglu, Muammer
dc.contributor.author: Ozturk, Dursun
dc.contributor.author: Hanbay, Davut
dc.date.accessioned: 2024-08-04T20:56:06Z
dc.date.available: 2024-08-04T20:56:06Z
dc.date.issued: 2024
dc.department: İnönü Üniversitesi [en_US]
dc.description.abstract: Many advanced models have been proposed for automatic surface defect inspection. Although CNN-based methods have achieved superior performance among these models, they are limited in extracting global semantic details because of the locality of the convolution operation, and global semantic details contribute strongly to successful surface defect detection. Recently, inspired by the success of the Transformer, which models global semantic details effectively through global self-attention, some researchers have begun applying Transformer-based methods to many computer-vision challenges. However, as many researchers have noted, transformers lose spatial details while extracting semantic features. To alleviate these problems, this paper proposes a transformer-based Hybrid Attention Gate (HAG) model that extracts both global semantic features and spatial features. The HAG model consists of a Transformer (Trans), a channel Squeeze-spatial Excitation (sSE) module, and a merge process. The Trans model extracts global semantic features and the sSE extracts spatial features. The merge process, which comes in several versions (concat, add, max, and mul), allows these two models to be combined effectively. Finally, four versions of the HAG-Feature Fusion Network (HAG-FFN) were developed using the proposed HAG model for the detection of surface defects. Four different datasets were used to test the performance of the proposed HAG-FFN versions. In the experimental studies, the proposed model produced mIoU scores of 83.83%, 79.34%, 76.53%, and 81.78% on the MT, MVTec-Texture, DAGM, and AITEX datasets, respectively. These results show that the proposed HAGmax-FFN model outperformed the state-of-the-art models. [en_US]
dc.description.sponsorship: İnönü Üniversitesi [en_US]
dc.description.sponsorship: No Statement Available [en_US]
dc.identifier.doi: 10.1007/s11760-024-03355-2
dc.identifier.issn: 1863-1703
dc.identifier.issn: 1863-1711
dc.identifier.scopus: 2-s2.0-85196057905 [en_US]
dc.identifier.scopusquality: Q2 [en_US]
dc.identifier.uri: https://doi.org/10.1007/s11760-024-03355-2
dc.identifier.uri: https://hdl.handle.net/11616/102058
dc.identifier.wos: WOS:001248625600004 [en_US]
dc.identifier.wosquality: N/A [en_US]
dc.indekslendigikaynak: Web of Science [en_US]
dc.indekslendigikaynak: Scopus [en_US]
dc.language.iso: en [en_US]
dc.publisher: Springer London Ltd [en_US]
dc.relation.ispartof: Signal Image and Video Processing [en_US]
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: Defects detection [en_US]
dc.subject: Vision transformers [en_US]
dc.subject: Squeeze and excitation [en_US]
dc.subject: Encoder decoder network [en_US]
dc.subject: Convolutional neural network [en_US]
dc.title: A novel hybrid attention gate based on vision transformer for the detection of surface defects [en_US]
dc.type: Article [en_US]
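
The abstract describes the HAG model as a Transformer branch, an sSE branch, and a merge step with four variants (concat, add, max, mul). The following is a minimal NumPy sketch of that merge idea only, not the authors' implementation: the record gives no architectural details, so the function names `sse_gate` and `merge`, the (C, H, W) tensor layout, and the use of a 1x1-convolution-plus-sigmoid form for sSE (the commonly used definition of channel squeeze and spatial excitation) are all assumptions. The Transformer output is represented by a placeholder array.

```python
import numpy as np


def sse_gate(x, w, b=0.0):
    """Channel squeeze, spatial excitation (assumed form).

    x: feature map of shape (C, H, W)
    w: hypothetical 1x1-convolution weights of shape (C,)
    """
    # Squeeze the channel axis into one (H, W) map, then apply a sigmoid.
    logits = np.tensordot(w, x, axes=([0], [0])) + b
    attn = 1.0 / (1.0 + np.exp(-logits))
    # Re-weight every channel by the same spatial attention map.
    return x * attn[None, :, :]


def merge(trans_feat, sse_feat, mode="max"):
    """Combine the two branches using one of the four variants named in the abstract."""
    if mode == "concat":
        return np.concatenate([trans_feat, sse_feat], axis=0)  # doubles the channels
    if mode == "add":
        return trans_feat + sse_feat
    if mode == "max":
        return np.maximum(trans_feat, sse_feat)
    if mode == "mul":
        return trans_feat * sse_feat
    raise ValueError(f"unknown merge mode: {mode}")


# Placeholder inputs: a random array stands in for the Transformer branch output.
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8, 8))
trans_feat = rng.standard_normal((4, 8, 8))
sse_feat = sse_gate(features, w=rng.standard_normal(4))

for mode in ("concat", "add", "max", "mul"):
    print(mode, merge(trans_feat, sse_feat, mode).shape)
```

Only the concat variant changes the channel count, so in a real network it would need a following layer that accepts twice as many channels; the other three keep the input shape.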
