东北大学学报(自然科学版) ›› 2004, Vol. 25 ›› Issue (3): 231-234.DOI: -

• 论著 • 上一篇    下一篇

并行后缀树的构造及查询算法

乔百友;葛健;王国仁;韩东红   

  1. 东北大学信息科学与工程学院;东北大学信息科学与工程学院;东北大学信息科学与工程学院;东北大学信息科学与工程学院 辽宁沈阳 110004
  • 收稿日期:2013-06-24 修回日期:2013-06-24 出版日期:2004-03-15 发布日期:2013-06-24
  • 通讯作者: Qiao, B.-Y.
  • 作者简介:-
  • 基金资助:
    国家自然科学基金资助项目(60273079)

Parallel construction and inquiry algorithm of suffix trees

Qiao, Bai-You (1); Ge, Jian (1); Wang, Guo-Ren (1); Han, Dong-Hong (1)   

  1. (1) Sch. of Info. Sci. and Eng., Northeastern Univ., Shenyang 110004, China
  • Received:2013-06-24 Revised:2013-06-24 Online:2004-03-15 Published:2013-06-24
  • Contact: Qiao, B.-Y.
  • About author:-
  • Supported by:
    -

摘要: 针对生物信息领域中传统后缀树构造算法在时间和空间上的限制,从结构并行的角度提出了一种新颖的、适用于生物信息学应用的并行后缀树结构和相应的构造算法·该算法首先将给定字符串分成若干连续的片段,并在各个处理机上分别构造这些片段的后缀树,形成了一种分布于多个处理机上的并行后缀树结构·该并行算法不仅大大缩短了后缀树的构造时间,而且避免了主存大小的限制·经分析,其性能优于现有的任何一种并行算法·在此基础上,提出了一种高效的基于这种并行后缀树的字符串匹配算法,解决了传统后缀树的基本查询问题·

关键词: 后缀树, 并行构造, 字符串匹配, 生物序列, 生物信息学

Abstract: A parallel suffix tree constructing algorithm is proposed to get rid of traditional space/time restriction while using suffix trees in bioinformatics. In this algorithm, the given string is divided into several continuous substrings. Then, the suffix trees for every substring are constructed in parallel, thus forming a suffix tree structure distributed separately on several processors. This algorithm can not only reduce the time needed to construct suffix trees but also avoid the restriction of main memory. The performance analysis shows that this algorithm outperforms any of existing parallel algorithms. Based on such a suffix tree structure, an efficient pattern matching algorithm is also proposed for the inquiries about traditional suffix trees.

中图分类号: