Please use this identifier to cite or link to this item:
https://digital.lib.ueh.edu.vn/handle/UEH/68745
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Ximing Li | - |
dc.contributor.other | Bing Wang | - |
dc.contributor.other | Yue Wang | - |
dc.contributor.other | Jihong Ouyang | - |
dc.contributor.other | Harish Garg | - |
dc.contributor.other | Dang N. H. Thanh | - |
dc.date.accessioned | 2023-05-30T02:27:28Z | - |
dc.date.available | 2023-05-30T02:27:28Z | - |
dc.date.issued | 2023 | - |
dc.identifier.issn | 1432-7643 (Print), 1433-7479 (Online) | - |
dc.identifier.uri | https://digital.lib.ueh.edu.vn/handle/UEH/68745 | - |
dc.description.abstract | Dataless text classification, a new paradigm of weakly supervised learning, refers to the task of learning with unlabeled documents and a few predefined representative words of categories, known as seed words. Recent generative dataless methods construct document-specific category priors by using seed word occurrences only; however, such category priors often contain very limited and even noisy supervised signals. To remedy this problem, in this paper, we propose a novel formulation of the category prior. First, for each document, we consider its label membership degree by not only counting seed word occurrences, but also using a novel prototype scheme, which captures pseudo-nearest neighboring categories. Second, for each label, we consider prior knowledge of its frequency in the corpus, which is also discriminative knowledge for classification. By incorporating the proposed category prior into the previous generative dataless method, we suggest a novel generative dataless method, namely the Weakly Supervised Prototype Topic Model (WSPTM). The experimental results on real-world datasets demonstrate that WSPTM outperforms the existing baseline methods. | en |
dc.format | Portable Document Format (PDF) | - |
dc.language | eng | - |
dc.publisher | Springer | - |
dc.relation.ispartof | Soft Computing | - |
dc.relation.ispartofseries | Vol. 27 | - |
dc.rights | Springer Nature Switzerland AG. | vi |
dc.subject | Dataless text classification | - |
dc.subject | Topic modeling | - |
dc.subject | Seed words | - |
dc.subject | Category prior | - |
dc.subject | Prototype scheme | - |
dc.title | Weakly supervised prototype topic model with discriminative seed words: modifying the category prior by self-exploring supervised signals | - |
dc.type | Journal Article | - |
dc.identifier.doi | https://doi.org/10.1007/s00500-022-07771-9 | - |
ueh.JournalRanking | ISI, Scopus | - |
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | - |
item.grantfulltext | none | - |
item.cerifentitytype | Publications | - |
item.fulltext | Only abstracts | - |
item.openairetype | Journal Article | - |
Appears in Collections: | INTERNATIONAL PUBLICATIONS |