MDERank: A Masked Document Embedding Rank Approach for Unsupervised Keyphrase Extraction

2021-10-13 11:29:17

Linhan Zhang, Qian Chen, Wen Wang, Chong Deng, Shiliang Zhang, Bing Li, Wei Wang, Xin Cao

arXiv_CL

Abstract

Keyphrases are phrases in a document that concisely summarize its core content, helping readers grasp what the article is about within a minute. However, existing unsupervised methods are not robust enough to handle various types of documents, owing to a mismatch in the sequence lengths being compared. In this paper, we propose a novel unsupervised keyphrase extraction method that leverages a BERT-based model to select and rank candidate keyphrases with a MASK strategy. In addition, we further enhance the model, denoted as Keyphrase Extraction BERT (KPEBERT), by designing a compatible self-supervised task and applying contrastive learning. We conduct extensive experiments to demonstrate the superiority and robustness of the proposed method as well as the effectiveness of KPEBERT.
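
The abstract does not spell out how the MASK strategy scores candidates. Read together with the title, one plausible interpretation is: mask each candidate in the document, embed the masked document, and treat a larger drop in similarity to the original document as a sign of a more important phrase. The sketch below illustrates that reading only. The bag-of-words `embed` function is a toy stand-in for a BERT document embedding, and `mask_candidate`, `rank_candidates`, the cosine scoring, and the sample text are all hypothetical names and choices introduced here, not taken from the paper.

```python
import math
import re
from collections import Counter

MASK = "[MASK]"

def embed(text):
    """Toy stand-in for a BERT document embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"\[mask\]|\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def mask_candidate(document, candidate):
    """Replace every occurrence of the candidate with one [MASK] per word."""
    replacement = " ".join([MASK] * len(candidate.split()))
    pattern = re.compile(re.escape(candidate), flags=re.IGNORECASE)
    return pattern.sub(lambda _match: replacement, document)

def rank_candidates(document, candidates):
    """Sort candidates by similarity of the masked document to the original.

    Assumption: lower similarity means the candidate carried more of the
    document's content, so it is ranked earlier.
    """
    doc_vec = embed(document)
    scored = [
        (cand, cosine(doc_vec, embed(mask_candidate(document, cand))))
        for cand in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1])

document = (
    "Keyphrase extraction finds keyphrases. "
    "Unsupervised keyphrase extraction needs no labels. "
    "The weather is nice."
)
candidates = ["keyphrase extraction", "weather"]
ranking = rank_candidates(document, candidates)
for phrase, similarity in ranking:
    print(f"{phrase}: {similarity:.3f}")
```

Under this reading, both sides of the comparison are whole documents, so the scores compare sequences of similar length. Swapping `embed` for a real sentence encoder would leave the rest of the sketch unchanged.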

URL

https://arxiv.org/abs/2110.06651

PDF

https://arxiv.org/pdf/2110.06651.pdf