iShuffle: Improving Hadoop Performance with Shuffle-on-Write
Abstract– Secret sharing schemes are commonly applied in distributed storage for Big Data. Secret sharing is a method for protecting outsourced data against leakage and for securing key management systems. The secret is distributed among a group of participants, each of whom holds a share of the secret. The secret can be reconstructed only when a sufficient number of shares are combined. Although many secret sharing schemes have been proposed, they remain inefficient in terms of share size, communication cost, and storage cost, and they lack robustness in terms of exact-share repair. In this paper, we propose, for the first time, a new secret sharing scheme based on Slepian-Wolf coding. Our scheme achieves an optimal share size by using the simple binning idea of the coding. It also enhances the exact-share repair feature, whereby the shares remain consistent even after corrupted shares are repaired. We show, through experiments, that our scheme can significantly reduce communication and storage costs while still supporting direct share repair, leveraging lightweight exclusive-OR (XOR) operations for fast computation.
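To make the notions of shares and XOR-based reconstruction concrete, the following is a minimal sketch of the textbook n-of-n XOR secret splitting, in which all n shares are needed to recover the secret. It is not the Slepian-Wolf-based scheme proposed in this paper and says nothing about its share size or repair procedure; the function names `xor_bytes`, `split_secret`, and `reconstruct` are hypothetical and chosen only for illustration.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def split_secret(secret: bytes, n: int) -> list[bytes]:
    """Split `secret` into n shares; all n are required to recover it.

    Illustrative n-of-n XOR splitting only (an assumption for this
    sketch), not the scheme described in the abstract.
    """
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are uniformly random byte strings.
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    # The last share is the secret XORed with every random share.
    last = reduce(xor_bytes, shares, secret)
    return shares + [last]


def reconstruct(shares: list[bytes]) -> bytes:
    """Recover the secret by XORing all shares together."""
    return reduce(xor_bytes, shares)


if __name__ == "__main__":
    parts = split_secret(b"top secret key", 4)
    print(reconstruct(parts) == b"top secret key")
```

In this simple construction every share is as long as the secret and any proper subset of shares is uniformly random, which illustrates why share size and storage cost are central concerns for secret sharing in distributed storage.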