Journal of Northeastern University, 2005, Vol. 26, Issue (8): 733-735.

• Original Paper

Approach based on domain knowledge to text categorization

Zhu, Jing-Bo (1); Chen, Wen-Liang (1)   

  (1) School of Information Science and Engineering, Northeastern University, Shenyang 110004, China
  • Received:2013-06-24 Revised:2013-06-24 Online:2005-08-15 Published:2013-06-24
  • Contact: Zhu, J.-B.

Abstract: A knowledge-based text categorization method is proposed that uses domain features as textual features to improve text representation and treats text categorization as an aggregation computation. A feature re-selection and re-weighting technique is proposed for text indexing. Feature aggregation functions are learned automatically from a labeled training collection with a method based on mutual information. Comparative experiments show that, overall, the domain-knowledge-based method outperforms a conventional naive Bayes classifier built on the bag-of-words model, and that domain knowledge improves the classification of similar or antithetical topics.
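
The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea, under stated assumptions: each document is already reduced to a set of feature strings, feature-category weights are taken to be pointwise mutual information estimated from document counts, and aggregation is a plain sum of weights per category. The function names `pmi_scores` and `classify` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def pmi_scores(docs, labels):
    """Estimate pointwise mutual information for each (feature, category) pair.

    docs   : list of feature sets, one per training document (assumed input format)
    labels : list of category labels, aligned with docs
    """
    n = len(docs)
    feat_count = Counter()          # documents containing each feature
    cat_count = Counter(labels)     # documents in each category
    joint_count = Counter()         # documents with the feature AND in the category
    for feats, cat in zip(docs, labels):
        for f in set(feats):
            feat_count[f] += 1
            joint_count[(f, cat)] += 1

    scores = {}
    for (f, cat), joint in joint_count.items():
        # PMI(f, c) = log( P(f, c) / (P(f) * P(c)) ), with count-based estimates
        scores[(f, cat)] = math.log((joint * n) / (feat_count[f] * cat_count[cat]))
    return scores


def classify(feats, scores, categories):
    """Aggregate by summing feature weights per category (an assumed, simple
    aggregation function) and return the highest-scoring category."""
    totals = {
        c: sum(scores.get((f, c), 0.0) for f in set(feats))
        for c in categories
    }
    return max(totals, key=totals.get)
```

In this sketch, unseen feature-category pairs contribute a weight of zero; a real system would likely smooth the estimates and apply the re-selection and re-weighting step the abstract mentions, which is not modeled here.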
