

Federated Learning Security Defense: Homomorphic Encryption

Blog post URL: https://security.blog.csdn.net/article/details/124110931

1. The Paillier Partially Homomorphic Encryption Algorithm

Homomorphic encryption comes in three forms: fully homomorphic encryption, somewhat homomorphic encryption, and partially homomorphic encryption. Because of performance and other practical constraints, industry today mainly uses partially homomorphic schemes. Paillier is one such partially homomorphic algorithm: it does not support homomorphic multiplication between ciphertexts, but compared with fully homomorphic encryption (FHE) it is far more computationally efficient, which is why it is widely used in practice.

Let x denote a plaintext and [[x]] its corresponding ciphertext. The Paillier scheme then satisfies additive homomorphism: [[u + v]] = [[u]] + [[v]].
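As a quick, hands-on illustration (not part of the original post), here is a minimal sketch using the python-paillier (phe) package, assuming it is installed with pip install phe. It checks the additive property above, plus multiplication of a ciphertext by a plaintext constant, which the gradient computation below relies on:

from phe import paillier

# Generate a Paillier key pair (1024-bit modulus, the same length used by the server code below)
public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

u, v = 3.5, 1.5
enc_u = public_key.encrypt(u)   # [[u]]
enc_v = public_key.encrypt(v)   # [[v]]

# Additive homomorphism: decrypting [[u]] + [[v]] recovers u + v
print(private_key.decrypt(enc_u + enc_v))   # 5.0

# Multiplying a ciphertext by a plaintext scalar is also supported
print(private_key.decrypt(2.0 * enc_u))     # 7.0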

For the loss function that is to be encrypted with Paillier, the gradient of the loss L with respect to the parameters θ is:

$\frac{\partial L}{\partial \theta} = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{4}\theta^{\top}x_i - \frac{1}{2}y_i\right)x_i$

The corresponding encrypted gradient is:

$\left[\left[\frac{\partial L}{\partial \theta}\right]\right] = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{4}[[\theta^{\top}]]x_i + \frac{1}{2}[[-1]]\,y_i\right)x_i$

This expression involves only additions and multiplications by plaintext constants, so Paillier can be used to optimize the loss function once it has been approximated by a polynomial.
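To make the formula concrete, here is a small worked sketch (again using phe; the toy batch and parameter values are assumptions for illustration only). It evaluates the encrypted gradient using only ciphertext additions and plaintext-scalar multiplications, then decrypts the result and compares it with the plaintext gradient. client.py below performs the same computation with NumPy object arrays:

import numpy as np
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

# Toy batch: n = 2 samples, d = 2 features, labels in {-1, +1}
x = np.array([[1.0, 2.0], [0.5, -1.0]])
y = np.array([1.0, -1.0])
theta = np.array([0.1, -0.2])
enc_theta = [public_key.encrypt(float(t)) for t in theta]    # [[theta]]
neg_one = public_key.encrypt(-1.0)                           # [[-1]]

n, d = x.shape
enc_grad = [public_key.encrypt(0.0) for _ in range(d)]
for i in range(n):
    # [[theta^T x_i]]: plaintext-scalar multiplications plus ciphertext additions
    enc_wx = sum(float(x[i, j]) * enc_theta[j] for j in range(d))
    # 1/4 [[theta^T x_i]] + 1/2 [[-1]] y_i
    residual = 0.25 * enc_wx + float(0.5 * y[i]) * neg_one
    for j in range(d):
        enc_grad[j] = enc_grad[j] + float(x[i, j]) * residual
enc_grad = [g / n for g in enc_grad]                         # divide by the plaintext constant n

# Only the key holder can decrypt; the result matches the plaintext gradient
plain_grad = ((0.25 * x.dot(theta) - 0.5 * y)[:, None] * x).mean(axis=0)
print([private_key.decrypt(g) for g in enc_grad], plain_grad)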

2. Implementing the Homomorphic-Encryption Defense

2.1 Defining the Model

We first define a custom model class LR_Model so that encrypting and decrypting the model is convenient. The code is annotated with detailed comments; see the code for the specifics.

models.py

import numpy as np

# Encrypt a vector element-wise with the Paillier public key
def encrypt_vector(public_key, x):
    return [public_key.encrypt(i) for i in x]

# Encrypt a matrix row by row
def encrypt_matrix(public_key, x):
    ret = []
    for r in x:
        ret.append(encrypt_vector(public_key, r))
    return ret

# Decrypt a vector element-wise with the Paillier private key
def decrypt_vector(private_key, x):
    return [private_key.decrypt(i) for i in x]

# Decrypt a matrix row by row
def decrypt_matrix(private_key, x):
    ret = []
    for r in x:
        ret.append(decrypt_vector(private_key, r))
    return ret

class LR_Model(object):
    def __init__(self, public_key, w_size=None, w=None, encrypted=False):
        # w_size: number of weight parameters
        # w: existing weights passed in directly; only one of w and w_size is needed
        # encrypted: whether the passed-in weights are already encrypted
        self.public_key = public_key
        if w is not None:
            self.weights = w
        else:
            self.weights = np.random.uniform(-0.5, 0.5, (w_size,))
        if not encrypted:
            self.encrypt_weights = encrypt_vector(public_key, self.weights)
        else:
            self.encrypt_weights = self.weights

    # Update the encrypted weight vector in place
    def set_encrypt_weights(self, w):
        for id, e in enumerate(w):
            self.encrypt_weights[id] = e

    # Update the plaintext weight vector in place
    def set_raw_weights(self, w):
        for id, e in enumerate(w):
            self.weights[id] = e
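For reference, a short usage sketch of these helpers (not in the original post; it assumes the python-paillier (phe) package supplies the key pair):

from phe import paillier
import models

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

# A plaintext model with 5 weights; the constructor also stores the encrypted weights
model = models.LR_Model(public_key=public_key, w_size=5)
print(model.weights)

# Round trip: decrypting the encrypted weights recovers the plaintext weights
recovered = models.decrypt_vector(private_key, model.encrypt_weights)
print(recovered)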

2.2 (Client) Local Model Training

During local training, the model parameters stay in encrypted form throughout; the procedure is shown below. The code is annotated with detailed comments; see the code for the specifics.

client.py

import numpy as np
import models

class Client(object):
    def __init__(self, conf, public_key, weights, data_x, data_y):
        self.conf = conf
        self.public_key = public_key
        # The weights received from the server are already encrypted
        self.local_model = models.LR_Model(public_key=self.public_key, w=weights, encrypted=True)
        self.data_x = data_x
        self.data_y = data_y

    def local_train(self, weights):
        # Keep the global model weights sent down by the server
        original_w = weights
        # Overwrite the local model's weights with the global weights
        self.local_model.set_encrypt_weights(weights)
        neg_one = self.public_key.encrypt(-1)
        for e in range(self.conf["local_epochs"]):
            print("start epoch ", e)
            # In every epoch, randomly sample batch_size training examples
            idx = np.arange(self.data_x.shape[0])
            batch_idx = np.random.choice(idx, self.conf["batch_size"], replace=False)
            x = self.data_x[batch_idx]
            x = np.concatenate((x, np.ones((x.shape[0], 1))), axis=1)
            y = self.data_y[batch_idx].reshape((-1, 1))
            # Compute the gradient while the weights remain encrypted
            batch_encrypted_grad = x.transpose() * (0.25 * x.dot(self.local_model.encrypt_weights) + 0.5 * y.transpose() * neg_one)
            encrypted_grad = batch_encrypted_grad.sum(axis=1) / y.shape[0]
            # Gradient-descent step, still on encrypted weights
            for j in range(len(self.local_model.encrypt_weights)):
                self.local_model.encrypt_weights[j] -= self.conf["lr"] * encrypted_grad[j]
        # Upload the encrypted difference between the updated and the original weights
        weight_accumulators = []
        for j in range(len(self.local_model.encrypt_weights)):
            weight_accumulators.append(self.local_model.encrypt_weights[j] - original_w[j])
        return weight_accumulators
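A minimal single-client driver sketch (not part of the original code; the conf values and the synthetic data are assumptions) showing how one local round might be invoked. It confirms that the returned weight differences are still ciphertexts that only the private-key holder can read:

import numpy as np
from phe import paillier
import models
from client import Client

conf = {"local_epochs": 2, "batch_size": 8, "lr": 0.1}
public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

# Synthetic data: 100 samples, 5 features, labels in {-1, +1}
data_x = np.random.randn(100, 5)
data_y = np.where(np.random.rand(100) > 0.5, 1.0, -1.0)

# Global weights (5 features + 1 bias), encrypted before being handed to the client
global_weights = models.encrypt_vector(public_key, np.random.uniform(-0.5, 0.5, 6))

# Pass a copy of the list so the client's updates do not alias the original weight list
client = Client(conf, public_key, list(global_weights), data_x, data_y)
diff = client.local_train(global_weights)           # encrypted weight differences
print(models.decrypt_vector(private_key, diff))     # readable only with the private key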

2.3 (Server) Generating the Public and Private Keys

The code is annotated with detailed comments; see the code for the specifics.

server.py

import numpy as np
import models
import paillier  # a Paillier implementation, e.g. the paillier module from python-paillier (phe)

class Server(object):
    # Generate the Paillier key pair: the public key encrypts, the private key decrypts
    public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

    def __init__(self, conf, eval_dataset):
        self.conf = conf
        # feature_num + 1 weights: one per feature plus a bias term
        self.global_model = models.LR_Model(public_key=Server.public_key, w_size=self.conf["feature_num"] + 1)
        self.eval_x = eval_dataset[0]
        self.eval_y = eval_dataset[1]

    # Aggregate the encrypted weight differences into the encrypted global model
    def model_aggregate(self, weight_accumulator):
        for id, data in enumerate(self.global_model.encrypt_weights):
            update_per_layer = weight_accumulator[id] * self.conf["lambda"]
            self.global_model.encrypt_weights[id] = self.global_model.encrypt_weights[id] + update_per_layer

    # Evaluate the global model on the held-out set (the weights are decrypted first)
    def model_eval(self):
        correct = 0
        dataset_size = 0
        batch_num = int(self.eval_x.shape[0] / self.conf["batch_size"])
        self.global_model.weights = models.decrypt_vector(Server.private_key, self.global_model.encrypt_weights)
        print(self.global_model.weights)
        for batch_id in range(batch_num):
            x = self.eval_x[batch_id * self.conf["batch_size"]: (batch_id + 1) * self.conf["batch_size"]]
            x = np.concatenate((x, np.ones((x.shape[0], 1))), axis=1)
            y = self.eval_y[batch_id * self.conf["batch_size"]: (batch_id + 1) * self.conf["batch_size"]].reshape((-1, 1))
            dataset_size += x.shape[0]
            wxs = x.dot(self.global_model.weights)
            pred_y = [1.0 / (1 + np.exp(-wx)) for wx in wxs]
            pred_y = np.array([1 if pred > 0.5 else -1 for pred in pred_y]).reshape((-1, 1))
            correct += np.sum(y == pred_y)
        acc = 100.0 * (float(correct) / float(dataset_size))
        return acc

    # Re-encrypt data: decrypt with the private key, then encrypt again with the public key
    @staticmethod
    def re_encrypt(w):
        return models.encrypt_vector(Server.public_key, models.decrypt_vector(Server.private_key, w))
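Finally, a minimal end-to-end training loop, sketched here only for orientation. This is not the original author's main program: the conf keys such as global_epochs and no_clients, the lambda value, and the synthetic dataset are all assumptions.

# main.py-style driver: a sketch under the assumptions stated above
import numpy as np
from server import Server
from client import Client

conf = {
    "feature_num": 5,       # number of input features (a bias column is appended automatically)
    "batch_size": 8,
    "local_epochs": 2,
    "global_epochs": 3,     # assumed key: number of federated rounds
    "lr": 0.1,
    "lambda": 0.3,          # aggregation coefficient applied to each client's update (assumed value)
    "no_clients": 3,        # assumed key: number of participating clients
}

# Synthetic train/eval data with labels in {-1, +1}
def make_data(n):
    x = np.random.randn(n, conf["feature_num"])
    y = np.where(x.sum(axis=1) > 0, 1.0, -1.0)
    return x, y

server = Server(conf, make_data(200))
clients = [Client(conf, Server.public_key,
                  list(server.global_model.encrypt_weights),  # copy so clients do not alias the server's list
                  *make_data(200))
           for _ in range(conf["no_clients"])]

for e in range(conf["global_epochs"]):
    global_w = list(server.global_model.encrypt_weights)
    # Each client trains on encrypted weights and uploads an encrypted weight difference
    diffs = [c.local_train(list(global_w)) for c in clients]
    # The server aggregates the encrypted differences into the encrypted global model
    for diff in diffs:
        server.model_aggregate(diff)
    print("global epoch %d, acc = %.2f%%" % (e, server.model_eval()))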

