Uncertain Data Clustering in Distributed Peer-to-Peer Networks
Abstract-Uncertain data clustering has been recognized as an essential task in data mining research. Many centralized clustering algorithms have been extended with new distance or similarity measures to tackle this task. With the rapid development of network applications, these centralized methods show their limitations when clustering data in a large, dynamic, distributed peer-to-peer network, owing to privacy and security concerns or to the technical constraints of distributed environments. This paper proposes a novel distributed uncertain data clustering algorithm, in which the centralized global clustering solution is approximated by performing distributed clustering. To shorten the execution time, a reduction technique is then applied to transform the proposed method into its deterministic form by replacing each uncertain data object with its expected centroid. Finally, an attribute-weight-entropy regularization technique enhances the proposed distributed clustering method to achieve better clustering results and to extract the essential features for cluster identification. Experiments on both synthetic and real-world data demonstrate the efficiency and superiority of the presented algorithm.
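The reduction step mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each uncertain object is given as a finite set of possible locations with probabilities and that squared Euclidean distance is used; the function names (`expected_centroid`, `expected_sq_dist`, `nearest_centre`) are hypothetical and are not taken from the paper. Under these assumptions the expected squared distance from an object to a cluster centre equals the squared distance from its expected centroid to that centre plus a term that does not depend on the centre, so assigning by centroid gives the same cluster as assigning by expected distance.

```python
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def expected_centroid(samples: Sequence[Point], probs: Sequence[float]) -> Point:
    """Probability-weighted mean of the possible locations of one uncertain object."""
    total = sum(probs)
    dim = len(samples[0])
    return tuple(
        sum(p * s[d] for s, p in zip(samples, probs)) / total
        for d in range(dim)
    )


def expected_sq_dist(samples: Sequence[Point], probs: Sequence[float],
                     centre: Point) -> float:
    """Expected squared distance from the uncertain object to a cluster centre."""
    total = sum(probs)
    return sum(p * sq_dist(s, centre) for s, p in zip(samples, probs)) / total


def nearest_centre(point: Point, centres: Sequence[Point]) -> int:
    """Index of the centre closest to a deterministic point."""
    return min(range(len(centres)), key=lambda k: sq_dist(point, centres[k]))


# One uncertain object (illustrative values) and two candidate cluster centres.
samples = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
probs = [0.25, 0.25, 0.5]
centres = [(1.0, 1.0), (5.0, 5.0)]

mu = expected_centroid(samples, probs)

# Assignment via the deterministic reduction (centroid only).
label_reduced = nearest_centre(mu, centres)

# Assignment via the full expected distance over all possible locations.
label_full = min(range(len(centres)),
                 key=lambda k: expected_sq_dist(samples, probs, centres[k]))
```

In this sketch both assignments agree, while the reduced form needs only one distance computation per object and centre instead of one per possible location, which is where the execution-time saving would come from.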