Journal of Northeastern University Natural Science ›› 2019, Vol. 40 ›› Issue (6): 795-800. DOI: 10.12068/j.issn.1005-3026.2019.06.007

• Information & Control •

A Distributed File System Based on HDFS

LIU Jun1, LENG Fang-ling2, LI Shi-qi2, BAO Yu-bin2   

  1. Information Construction and Network Security Office, Northeastern University, Shenyang 110819, China; 2. School of Computer Science & Engineering, Northeastern University, Shenyang 110169, China.
  • Received: 2018-04-25 Revised: 2018-04-25 Online: 2019-06-15 Published: 2019-06-14
  • Contact: LENG Fang-ling

Abstract: This paper presents IHDFS, an intelligent big data storage system built on the open-source distributed file system HDFS. The system designs and implements four modules: data de-duplication, data placement, intelligent data migration, and data encoding. Together they improve the efficiency of user access and save cluster storage space. Experimental results show that the de-duplication module saves storage space; the placement module assigns uploaded files to an appropriate storage layer, which doubles the upload speed; the intelligent migration module raises the hit rate of files on the upper storage layer, which makes data retrieval more efficient; and the encoding module saves about one third of the cluster's original storage space.
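
The abstract does not say how the de-duplication module works, so the following is only an illustrative sketch of one common approach, whole-file de-duplication by content hash. It is not the IHDFS implementation. The class `DedupStore`, its methods, and the in-memory dictionaries are hypothetical names chosen for this example, and nothing here calls HDFS.

```python
import hashlib


class DedupStore:
    """Toy in-memory store (hypothetical): identical content is kept only once."""

    def __init__(self):
        self.blobs = {}  # content digest -> file bytes
        self.paths = {}  # file path -> content digest

    def put(self, path, data):
        """Store data under path; return True if new bytes were actually written."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        # A duplicate upload only records a pointer to the existing content.
        self.paths[path] = digest
        return is_new

    def get(self, path):
        """Return the bytes stored under path."""
        return self.blobs[self.paths[path]]

    def stored_bytes(self):
        """Total bytes physically held, counting each distinct content once."""
        return sum(len(blob) for blob in self.blobs.values())
```

In this sketch, uploading the same bytes under two different paths stores them once, which is the general mechanism by which de-duplication saves space. A real system would also need reference counting for deletes and a persistent index, both of which are omitted here.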

Key words: multi-layer storage architecture, HDFS, intelligence, optimization, distributed
