Privacy-Preserving Collaborative Model Learning: The Case of Word Vector Training
Abstract-Nowadays machine learning is becoming a new paradigm for mining hidden knowledge in big data. The collection and manipulation of big data not only create considerable values, but also raise serious privacy concerns. To protect the huge amount of potentially sensitive data, a straightforward approach is to encrypt data with specialized cryptographic tools. However, it is challenging to utilize or operate on encrypted data, especially to perform machine learning algorithms. In this paper, we investigate the problem of training high quality word vectors over large-scale encrypted data (from distributed data owners) with the privacy-preserving
collaborative neural network learning algorithms. We leverage and also design a suite of arithmetic primitives (e.g., multiplication, fixed-point representation and sigmoid function computation etc.) on encrypted data, served as components of our construction. We theoretically analyze the security and efficiency of our proposed construction, and conduct extensive experiments on representative real-world datasets to verify its practicality and effectiveness.
sales on Site11,021