Hui Guan, Wen Tang, Hamid Krim, James Keiser, Andrew J. Rindos, Radmila Sazdanovic
14:30 - 16:00 | Tue 5 Jul | Salisbury C | S6.2
Keyphrase extraction summarizes a document by selecting a set of single- or multi-word expressions, called keyphrases, that capture the primary topics it discusses. In this paper we propose DoCollapse, an unsupervised keyphrase extraction method based on topological collapse that represents a document as a network of candidate keyphrases linked by semantic relatedness. A semantic graph is built with candidate keyphrases as vertices and then reduced to its core using a topological collapse algorithm to facilitate final keyphrase selection. Iteratively collapsing dominated vertices helps remove noisy candidates and reveal important points. We conducted experiments on two standard evaluation datasets composed of scientific papers and found that DoCollapse outperforms state-of-the-art methods. The results show that simplifying a document graph by homology-preserving topological collapse benefits keyphrase extraction.
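The collapse step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define domination, so the code assumes the common graph-theoretic convention that a vertex v is dominated by a neighbour u when the closed neighbourhood of v is contained in that of u. The function name `collapse_dominated`, the adjacency-dictionary input, and the toy candidate phrases are all hypothetical.

```python
def collapse_dominated(adj):
    """Iteratively remove dominated vertices from an undirected graph.

    adj maps each vertex to the set of its neighbours (no self-loops).
    Assumption: v is dominated by a neighbour u when N[v] is a subset
    of N[u], where N[x] is x together with its neighbours.
    """
    # Work on a copy so the caller's graph is left untouched.
    adj = {v: set(ns) for v, ns in adj.items()}
    changed = True
    while changed:
        changed = False
        for v in sorted(adj):
            closed_v = adj[v] | {v}
            if any(closed_v <= (adj[u] | {u}) for u in adj[v]):
                # Remove v and every edge incident to it, then rescan.
                for u in adj[v]:
                    adj[u].discard(v)
                del adj[v]
                changed = True
                break
    return adj


# Hypothetical candidate phrases: a 4-cycle plus one pendant vertex.
graph = {
    "graph": {"collapse", "homology", "noise"},
    "collapse": {"graph", "topology"},
    "topology": {"collapse", "homology"},
    "homology": {"graph", "topology"},
    "noise": {"graph"},
}
core = collapse_dominated(graph)
```

In this toy graph only the pendant vertex "noise" is dominated (by "graph"), so it is removed while the 4-cycle survives as the core. The cycle itself is preserved, which is consistent with the homology-preserving behaviour the abstract describes.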