TY - JOUR
T1 - Privacy-Preserving Locally Weighted Linear Regression over Encrypted Millions of Data
AU - Dong, Xiaoxia
AU - Chen, Jie
AU - Zhang, Kai
AU - Qian, Haifeng
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2020
Y1 - 2020
N2 - The emerging development of cloud computing makes a trend that the cloud becomes a outsourced agglomeration for storing big data that generally contains numerous information. To mine the rich value involved in big data, the machine learning methodology is widespread employed due to its ability to adapt to data changes. However, the data mining process may involve the privacy issues of the users, hence they are reluctant to share their information. This is the reason why the outsourced data need to be dealt with securely, where data encryption is considered to be the most straightforward method to keep the privacy of data, but machine learning on the data in ciphertext domain is more complicated than the plaintext, since the relationship structure between data is no longer maintained, in such a way that we focus on the machine learning over encrypted big data. In this work, we study locally weighted linear regression (LWLR), a widely used classic machine learning algorithm in real-world, such as predict and find the best-fit curve through numerous data points. To tackle the privacy concerns in utilizing the LWLR algorithm, we present a system for privacy-preserving locally weighted linear regression, where the system not only protects the privacy of users but also encrypts the best-fit curve. Therefore, we use Paillier homomorphic encryption as the building modular to encrypt data and then apply the stochastic gradient descent in encrypted domain. After given a security analysis, we study how to let Paillier encryption deal with real numbers and implement the system in Python language with a couple of experiments on real-world data sets to evaluate the effectiveness, and show that it outperforms the state-of-the-art and occurs negligible errors compared with performing locally weighted linear regression in the clear.
AB - The emerging development of cloud computing makes a trend that the cloud becomes a outsourced agglomeration for storing big data that generally contains numerous information. To mine the rich value involved in big data, the machine learning methodology is widespread employed due to its ability to adapt to data changes. However, the data mining process may involve the privacy issues of the users, hence they are reluctant to share their information. This is the reason why the outsourced data need to be dealt with securely, where data encryption is considered to be the most straightforward method to keep the privacy of data, but machine learning on the data in ciphertext domain is more complicated than the plaintext, since the relationship structure between data is no longer maintained, in such a way that we focus on the machine learning over encrypted big data. In this work, we study locally weighted linear regression (LWLR), a widely used classic machine learning algorithm in real-world, such as predict and find the best-fit curve through numerous data points. To tackle the privacy concerns in utilizing the LWLR algorithm, we present a system for privacy-preserving locally weighted linear regression, where the system not only protects the privacy of users but also encrypts the best-fit curve. Therefore, we use Paillier homomorphic encryption as the building modular to encrypt data and then apply the stochastic gradient descent in encrypted domain. After given a security analysis, we study how to let Paillier encryption deal with real numbers and implement the system in Python language with a couple of experiments on real-world data sets to evaluate the effectiveness, and show that it outperforms the state-of-the-art and occurs negligible errors compared with performing locally weighted linear regression in the clear.
KW - Locally weighted linear regression
KW - paillier homomorphic encryption
KW - privacy-preserving
KW - stochastic gradient descent
UR - https://www.scopus.com/pages/publications/85078447886
U2 - 10.1109/ACCESS.2019.2962700
DO - 10.1109/ACCESS.2019.2962700
M3 - 文章
AN - SCOPUS:85078447886
SN - 2169-3536
VL - 8
SP - 2247
EP - 2257
JO - IEEE Access
JF - IEEE Access
M1 - 8943957
ER -