Journal of Northeastern University (Natural Science) ›› 2011, Vol. 32 ›› Issue (5): 626-629.

• Articles •

Parallel query for a data warehouse utilizing MapReduce

Shi, Jin-Gang (1); Bao, Yu-Bin (1); Leng, Fang-Ling (1); Yu, Ge (1)

  (1) School of Information Science and Engineering, Northeastern University, Shenyang 110819, China
  • Received: 2013-06-19 Revised: 2013-06-19 Published: 2013-04-04
  • Contact: Shi, J.-G.
  • Supported by:
    National Natural Science Foundation of China (60773222)

Abstract: MapReduce is a highly efficient distributed and parallel computing framework that lets users run parallel computations on large clusters with little effort. However, the MapReduce framework is poorly compatible with traditional relational databases. To address this, the paper proposes ChunkDB, a distributed relational database based on a chunk structure, and extends the MapReduce architecture so that ChunkDB and MapReduce work together effectively. The design combines the scalability, ease of operation, and high parallelism of MapReduce with the query-optimization strengths of a relational database, such as indexing. Experiments show that the MapReduce-based ChunkDB database provides fast and efficient parallel queries for data warehouse applications.

Keywords: MapReduce, data warehouse, parallel computing, distributed database, query optimization
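
The idea summarized in the abstract, pushing indexed relational queries down to data chunks and merging the partial results in a MapReduce style, can be illustrated with a small sketch. This is not ChunkDB or the authors' extended MapReduce framework. It assumes a hypothetical `sales` fact table, uses in-memory SQLite databases as stand-ins for chunks, and uses a thread pool in place of a cluster; all function names are made up for illustration.

```python
import sqlite3
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(rows, n_chunks):
    """Round-robin partition of the fact rows (an assumed chunking policy)."""
    return [rows[i::n_chunks] for i in range(n_chunks)]


def build_chunk(rows):
    """Load one chunk into its own in-memory SQLite database with an index."""
    # check_same_thread=False lets a worker thread use this connection later;
    # each connection is only ever used by one thread at a time.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.execute("CREATE INDEX idx_sales_year ON sales (year)")
    conn.commit()
    return conn


def map_chunk(conn, year):
    """Map step: let the chunk's SQL engine filter and pre-aggregate."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales WHERE year = ? GROUP BY region",
        (year,),
    )
    return cur.fetchall()


def reduce_partials(partials):
    """Reduce step: merge per-chunk partial sums into global totals."""
    totals = defaultdict(float)
    for partial in partials:
        for region, subtotal in partial:
            totals[region] += subtotal
    return dict(totals)


def parallel_query(chunks, year):
    """Run the map step on every chunk concurrently, then reduce."""
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        partials = list(pool.map(lambda conn: map_chunk(conn, year), chunks))
    return reduce_partials(partials)


if __name__ == "__main__":
    rows = [
        ("north", 2010, 10.0), ("south", 2010, 5.0), ("north", 2011, 7.0),
        ("south", 2010, 2.5), ("north", 2010, 1.5), ("east", 2011, 4.0),
    ]
    chunks = [build_chunk(part) for part in split_into_chunks(rows, 3)]
    print(parallel_query(chunks, 2010))
```

In this sketch each chunk answers the filter through its own index and returns only a small partial aggregate, so the reduce step merges a few rows per chunk rather than scanning raw data. That division of labor is the general benefit the abstract attributes to combining the two systems.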
