The power of graphs in medicine: Introducing BioGraphSum for effective text summarization

Date

2024

Journal Title

Journal ISSN

Volume Title

Publisher

Cell Press

Access Rights

info:eu-repo/semantics/openAccess

Abstract

In biomedicine, the expansive scientific literature, combined with the frequent use of abbreviations, acronyms, and symbols, presents considerable challenges for text processing and summarization. The Unified Medical Language System (UMLS) has been the go-to resource for extracting concepts and determining correlations in such studies; the BioGraphSum model introduced in this study aims to reduce this dependence on UMLS. In the proposed approach, the sentences of a text are represented as the nodes of a graph, which allows the concept of Malatya centrality to be applied. The approach pinpoints the influential nodes of the graph and, by analogy, the sentences most pertinent to a summary. To evaluate BioGraphSum, a corpus of 450 contemporary scientific research articles available in the PubMed database was curated in line with established research methodology, and the model was rigorously tested against this corpus. Preliminary results, especially for the precision-based and F-score-based ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-SU metrics, show significant improvements over other existing models considered state-of-the-art in text summarization.
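
The sentence-as-node idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the edge rule (two sentences are linked when they share at least `min_shared` words), the helper names `tokenize`, `build_graph`, `malatya_centrality`, and `summarize`, and the centrality formula itself, which is taken here to be the sum, over a node's neighbours, of the node's degree divided by the neighbour's degree. The record does not state how BioGraphSum builds its graph or selects sentences.

```python
import re
from itertools import combinations


def tokenize(sentence):
    """Lowercased word set; a stand-in for real biomedical preprocessing."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def build_graph(sentences, min_shared=2):
    """Link two sentences when they share at least `min_shared` words.

    This edge rule is an illustrative assumption, not the paper's.
    """
    tokens = [tokenize(s) for s in sentences]
    adj = {i: set() for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        if len(tokens[i] & tokens[j]) >= min_shared:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def malatya_centrality(adj):
    """Assumed definition: sum over neighbours u of deg(v) / deg(u).

    Isolated nodes get 0 because the sum over no neighbours is empty.
    """
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    return {v: sum(deg[v] / deg[u] for u in adj[v]) for v in adj}


def summarize(sentences, k=2, min_shared=2):
    """Return the k highest-scoring sentences in their original order."""
    adj = build_graph(sentences, min_shared)
    scores = malatya_centrality(adj)
    # Ties are broken by position so the result is deterministic.
    top = sorted(scores, key=lambda v: (-scores[v], v))[:k]
    return [sentences[i] for i in sorted(top)]
```

Under these assumptions, a sentence that overlaps with many otherwise weakly connected sentences scores highest, while a sentence sharing no vocabulary with the rest scores zero and is never selected ahead of a connected one.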

Description

Keywords

Document summarization, Multi-document summarization, Minimum vertex cover, ROUGE

Source

Heliyon

WoS Q Value

N/A

Scopus Q Value

Q1

Volume

10

Issue

11

Citation