A new multi-document summarisation approach using saplings growing-up optimisation algorithms: Simultaneously optimised coverage and diversity


Date

2024

Journal Title

Journal ISSN

Volume Title

Publisher

Sage Publications Ltd

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

Automatic text summarisation is the task of obtaining a subset of a text that accurately represents the main text. A quality summary should contain the maximum amount of information while avoiding redundant information. Redundancy is a severe deficiency that causes unnecessary repetition of information across sentences and should be avoided in summarisation. Although many optimisation-based text summarisation methods have been proposed in recent years, there is a lack of research on the simultaneous optimisation of coverage and redundancy. In this context, this study presents an approach in which maximum coverage and minimum redundancy, the two key features of a rich summary, are modelled as optimisation targets. In optimisation-based text summarisation studies, conflicting objectives are generally combined through weighting or a composite formulation and transformed into a single-objective problem. However, this transformation can directly affect the quality of the solution. In this study, the optimisation goals are met simultaneously without transformation or formulation. In addition, the multi-objective saplings growing-up algorithm (MO-SGuA) is implemented and modified for text summarisation. The presented approach, referred to as Pareto optimal, reaches an optimal solution through simultaneous optimisation. The MO-SGuA method was tested on the open-access Document Understanding Conference (DUC) data sets. The performance of the MO-SGuA approach was measured using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics and then compared with competing methods from the literature. The approach scored 26.6% on the ROUGE-2 metric and 65.96% on ROUGE-L, improvements of 11.17% and 20.54%, respectively. The experimental results showed that good-quality summaries were achieved using the proposed approach.
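
The idea of treating coverage and redundancy as two separate objectives, rather than folding them into one weighted score, can be illustrated with a small Pareto-dominance filter. The following is a minimal sketch and not the MO-SGuA algorithm itself: the bag-of-words cosine similarity, the centroid-based coverage score, the pairwise redundancy score, and all function names (`cosine`, `objectives`, `dominates`, `pareto_front`) are assumptions made for illustration, and the exhaustive enumeration of candidates stands in for the search that a metaheuristic would perform.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def objectives(summary, sentences, centroid):
    """Return (coverage, redundancy) for a tuple of sentence indices.

    Assumed definitions: coverage is the summed similarity of the chosen
    sentences to the collection centroid; redundancy is the summed
    pairwise similarity among the chosen sentences.
    """
    vecs = [sentences[i] for i in summary]
    coverage = sum(cosine(v, centroid) for v in vecs)
    redundancy = sum(cosine(a, b) for a, b in combinations(vecs, 2))
    return coverage, redundancy


def dominates(p, q):
    """True if p is no worse than q on both objectives and strictly better on one.

    Coverage (index 0) is maximised; redundancy (index 1) is minimised.
    """
    return p[0] >= q[0] and p[1] <= q[1] and (p[0] > q[0] or p[1] < q[1])


def pareto_front(scored):
    """Keep the candidates that no other candidate dominates."""
    return [c for c, s in scored.items()
            if not any(dominates(t, s) for t in scored.values())]


# Toy collection (hypothetical data, not from DUC).
texts = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "dogs chase cats in parks",
    "stock markets fell sharply today",
]
sentences = [Counter(t.split()) for t in texts]
centroid = sum(sentences, Counter())

# Enumerate every two-sentence summary and score both objectives.
scored = {c: objectives(c, sentences, centroid)
          for c in combinations(range(len(texts)), 2)}
front = pareto_front(scored)
print(front)
```

Each entry in `front` is a trade-off that cannot be improved on one objective without losing on the other, which is the sense in which the two goals are handled simultaneously instead of being collapsed into a single weighted value.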

Description

Keywords

Content coverage, document summarisation, document understanding conference, information diversity, multi-criteria optimisation, multi-document summarisation, optimisation model, recall-oriented understudy for gisting evaluation, saplings growing-up algorithm

Source

Journal of Information Science

WoS Q Value

Q3

Scopus Q Value

Q1

Volume

50

Issue

3

Citation