Data Transfer Scheduling for Maximizing Throughput of Big-Data Computing in Cloud Systems
Abstract—Many big-data computing applications have been deployed on cloud platforms. These applications normally demand concurrent data transfers among computing nodes for parallel processing. It is important to find the transfer schedule that yields the least data retrieval time, or in other words the maximum throughput. However, existing methods cannot achieve this because they ignore link bandwidths and the diversity of data replicas and paths. In this paper, we aim to develop a max-throughput data transfer scheduling method that minimizes the data retrieval time of applications. Specifically, the problem is formulated as a mixed integer program.