Manchu character recognition post-processing based on Bayes rules and substitution set confusion matrix

Journal of Northeastern University (Natural Science) ›› 2004, Vol. 25 ›› Issue (11): 1061-1064.

Li Jingjiao; Zhao Ji

School of Information Science and Engineering, Northeastern University, Shenyang 110004, Liaoning, China

Received: 2013-06-24 Revised: 2013-06-24 Online: 2004-11-15 Published: 2013-06-24
Contact: Zhao, J.
Supported by: Natural Science Foundation of Liaoning Province (2001113)

Manchu character recognition post-processing based on Bayes rules and substitution set confusion matrix

Li, Jing-Jiao (1); Zhao, Ji (1)

(1) Sch. of Info. Sci. and Eng., Northeastern Univ., Shenyang 110004, China; (2) Anshan Univ. of Sci. and Technol., Anshan 114002, China


Abstract: The recognition output of the Manchu single character recognition (SCR) system, which recognizes individual Manchu words, is combined with Manchu phrase information by building a statistical database of Manchu phrases and substitution sets (the candidate words for each undetermined position). Bayes' rule is then used to combine the posterior probability of each candidate word with the prior probability of the phrase, and a data structure that is sound, efficient, and easy to implement is built to detect and correct the rejected and misrecognized words in the SCR output, which raises the recognition rate of the Manchu recognition system. Experiments show that post-processing performance depends not only on the language model but also on accurate estimation of the posterior probabilities. In addition, the higher the recognition rate of the SCR system, the stronger the error-correcting ability of the post-processing.

Keywords: Manchu, post-processing, substitution set, confusion matrix, Bayes rule, feature vector, phrase database
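The combination of per-word recognizer evidence with a phrase prior can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `correct_words`, the dictionary layouts, the `floor` smoothing value, and the toy tokens `A` to `D` are all assumptions made for illustration, and the confusion matrix is assumed to store P(recognizer output | true word).

```python
from itertools import product


def correct_words(candidate_sets, confusion, phrase_prior, floor=1e-6):
    """Return the word sequence with the highest Bayes score.

    candidate_sets: list of (recognizer_output, [candidate words]) per position
                    (hypothetical layout for a substitution set).
    confusion:      dict (true_word, recognizer_output) -> P(output | true_word),
                    i.e. entries of an assumed confusion matrix.
    phrase_prior:   dict (word, word, ...) -> prior probability of the phrase.
    floor:          small probability used for unseen entries (an assumption).
    """
    best_seq, best_score = None, 0.0
    # Enumerate every combination of candidates, one per position.
    for seq in product(*(cands for _, cands in candidate_sets)):
        score = phrase_prior.get(seq, floor)  # prior of the phrase
        for (output, _), word in zip(candidate_sets, seq):
            score *= confusion.get((word, output), floor)  # recognizer evidence
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq, best_score


# Toy data: position 1 was recognized as "A"; position 2 was rejected ("?").
candidate_sets = [("A", ["A", "B"]), ("?", ["C", "D"])]
confusion = {
    ("A", "A"): 0.8, ("B", "A"): 0.2,   # "B" is sometimes misread as "A"
    ("C", "?"): 0.5, ("D", "?"): 0.5,   # both may be rejected
}
phrase_prior = {("A", "C"): 0.01, ("A", "D"): 0.05, ("B", "D"): 0.3}

best, score = correct_words(candidate_sets, confusion, phrase_prior)
print(best, score)
```

In this toy case the phrase prior outweighs the recognizer's preference for `A`, so the misrecognized first word is corrected and the rejected second word is filled in. Exhaustive enumeration is only practical for short phrases and small substitution sets; a longer sequence would need a dynamic-programming search instead.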

Li, Jing-Jiao; Zhao, Ji. Manchu character recognition post-processing based on Bayes rules and substitution set confusion matrix[J]. Journal of Northeastern University (Natural Science), 2004, 25(11): 1061-1064.
