Journal of Northeastern University ›› 2011, Vol. 32 ›› Issue (5): 626-629.

• Original Paper

Parallel query for a data warehouse utilizing MapReduce

Shi, Jin-Gang (1); Bao, Yu-Bin (1); Leng, Fang-Ling (1); Yu, Ge (1)   

  (1) School of Information Science and Engineering, Northeastern University, Shenyang 110819, China
  • Received:2013-06-19 Revised:2013-06-19 Published:2013-04-04
  • Contact: Shi, J.-G.

Abstract: MapReduce is a highly efficient framework for distributed and parallel computing that lets users readily run parallel computations on large clusters. However, the MapReduce framework is not compatible with traditional relational databases. This paper proposes ChunkDB, a distributed relational database based on a chunk structure, and extends and redesigns the MapReduce framework to make it compatible with ChunkDB. The scalability, ease of operation, and high parallelism of MapReduce are thereby combined with the advantages of a relational database, including indexing and query optimization. Built on MapReduce, ChunkDB provides fast and efficient parallel queries for data warehouse applications.
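
The abstract does not describe ChunkDB's internals, so the following is only a minimal sketch of the general idea it names: a map step that runs per chunk and uses a chunk-local index to filter rows, followed by a reduce step that merges the partial aggregates. The `Chunk` class, the `(region, product, amount)` row layout, and all function names are hypothetical and chosen for illustration; they are not taken from the paper. Threads stand in for cluster nodes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class Chunk:
    """Hypothetical chunk: a horizontal slice of a fact table with a local index."""

    def __init__(self, rows):
        self.rows = rows  # list of (region, product, amount) tuples
        self.index = defaultdict(list)  # region -> positions of matching rows
        for pos, row in enumerate(rows):
            self.index[row[0]].append(pos)


def map_chunk(chunk, region):
    """Map step: filter via the chunk-local index, then pre-aggregate by product."""
    partial = defaultdict(float)
    for pos in chunk.index.get(region, []):
        _, product, amount = chunk.rows[pos]
        partial[product] += amount
    return dict(partial)


def reduce_partials(partials):
    """Reduce step: merge the per-chunk partial sums into final totals."""
    totals = defaultdict(float)
    for partial in partials:
        for product, amount in partial.items():
            totals[product] += amount
    return dict(totals)


def parallel_query(chunks, region, workers=4):
    """Run the map step on every chunk concurrently, then reduce the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: map_chunk(c, region), chunks))
    return reduce_partials(partials)


# Assumed sample data: two chunks of a small sales table.
chunks = [
    Chunk([("north", "tea", 10.0), ("south", "tea", 5.0)]),
    Chunk([("north", "rice", 7.5), ("north", "tea", 2.5)]),
]
result = parallel_query(chunks, "north")
print(result)
```

The query corresponds roughly to `SELECT product, SUM(amount) ... WHERE region = 'north' GROUP BY product`. Because each chunk filters through its own index before aggregating, only matching rows are touched, and the reduce step sees one small partial result per chunk rather than raw rows.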
