[CCS 2016] Deep Learning with DP

Notice

Recent Posts

Recent Comments

Link

« 2025/04 »
일	월	화	수	목	금	토
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30

Tags more

Archives

Today

Total

관리 메뉴

Federated Learning

[CCS 2016] Deep Learning with DP - (3) 본문

Privacy Preserving/Papers

[CCS 2016] Deep Learning with DP - (3)

pseudope 2022. 9. 26. 13:30

728x90

논문 제목: Deep Learning with Differential Privacy

출처: https://arxiv.org/abs/1607.00133

지난 포스트에서 DPSGD의 작동 방식을 확인하고, privacy cost를 계산하는 기법 중 하나인 Moments Accountant를 알아보았습니다. (이전 글 보기) 이번 포스트에서는 이 Moments Accountant를 조금 더 엄밀하게 살펴보도록 하겠습니다.

4. Moments Accountant - 이어서

우선, 이웃한 두 database $d, d' \in \mathcal{D}^n$ , mechanism $\mathcal{M}$ , 부가적인 정보 $\text{Aux}$ , $\mathcal{M}$ 의 output $\mathcal{o} \in \mathbb{R}$ 에 대해서, $\mathcal{M}$ 의 privacy loss $\mathcal{c}$ 를 다음과 같이 정의합니다. (이는 DP의 정의로부터 자연스럽습니다.)

$\mathcal{c} (\mathcal{o}; \mathcal{M}, \text{Aux}, d, d') := \log \frac {\text{Pr} [\mathcal{M} (\text{Aux}, d) = \mathcal{o}]} {\text{Pr} [\mathcal{M} (\text{Aux}, d') = \mathcal{o}]}$

여기에서, 각 step을 $\mathcal{M}_i$ 로 생각할 때, $\text{Aux}$ 는 이전 step의 output들, 즉, $\mathcal{o}_{1}, \cdots, \mathcal{o}_{i-1}$ 이 될 것입니다. (물론, $\mathcal{M}_1$ 에서는 $\text{Aux}$ 를 고려하지 않아도 좋습니다.) 그리고 이러한 $\mathcal{c}$ 는 random noise에 dependent한 random variable입니다. 우리는 이러한 확률변수 $c$ 에 대한 moment의 log값인 $\alpha_{\mathcal{M}}$ 를 정의할 수 있습니다.

$\alpha_{\mathcal{M}} (\lambda ; \text{aux}, d, d') := \log \mathbb{E}_{\mathcal{o} \sim \mathcal{M}(\text{aux}, d)} [e^{\lambda \mathcal{c} (\mathcal{o} ; \mathcal{M}, \text{aux}, d, d')}]$

그리고 가능한 모든 $d$ 와 $d'$ 에 대하여 $\alpha_{\mathcal{M}} (\lambda ; \text{aux}, d, d')$ 을 bound할 수 있는 $\alpha_{\mathcal{M}} (\lambda) := \max_{\text{aux}, d, d'} \; \alpha_{\mathcal{M}} (\lambda ; \text{aux}, d, d')$ 를 정의합니다. 이때, 다음 $\text{Theorem 2}$ 는 각 step마다 $\mathcal{c}$ 의 log moment를 bound하는 것만으로 1 epoch의 model 학습 과정 전체의 $(\epsilon, \delta)$ 를 보장할 수 있다고 이야기합니다. (이때, 굳이 moment의 log 값을 사용하는 이유는, moment 값을 바로 사용할 경우 bound가 매우 loose해지기 때문이라고 합니다.)

$\text{Theorem 2}$

앞서 정의한 $\alpha_{\mathcal{M}} (\lambda)$ 에 대하여, 다음이 성립한다.

1. [Composability]

$\mathcal{M}_i: \prod_{j=1}^{i = 1} \mathbb{R}_j \times \mathcal{D} \rightarrow \mathbb{R}_i$ 일 때, mechanisam $\mathcal{M}$ 이 $\mathcal{M}_1, \mathcal{M}_2, \cdots, \mathcal{M}_k$ 로 구성되어 있다고 가정하자. 그러면 임의의 $\lambda$ 에 대하여, $\alpha_{\mathcal{M}} (\lambda) \leq \sum_{i=1}^k \alpha_{\mathcal{M}_i} (\lambda)$ 이다.

2. [Tail Bound]

임의의 $\epsilon > 0$ 에 대하여, 만약 $\delta = \min_{\lambda} \; e^{ \left( \alpha_{\mathcal{M}}(\lambda) - \lambda \epsilon \right) }$ 으로 $\delta$ 를 지정한다면, $\mathcal{M}$ 은 $(\epsilon, \delta)$ -DP를 만족한다.

$\text{Proof}$

1. [Composability]

$\mathcal{M}_{1:i} := (\mathcal{M}_1, \mathcal{M}_2, \cdots, \mathcal{M}_i)$ , $\mathcal{o}_{1:i} := (\mathcal{o}_{1}, \mathcal{o}_{2}, \cdots, \mathcal{o}_{i})$ 로 정의하자. 그러면

$\begin{align*} \mathcal{c} (\mathcal{o_{1:k}}; \mathcal{M_{1:k-1}}, \mathcal{o}_{1:k}, d, d') &= \log \frac {\text{Pr} [\mathcal{M} (\mathcal{o}_{1:k-1}, d) = \mathcal{o}_{1:k}]} {\text{Pr} [\mathcal{M} (\mathcal{o}_{1:k-1}, d') = \mathcal{o}_{1:k}]} \\&= \log \prod_{i=1}^k \frac {\text{Pr} [\mathcal{M} (d) = \mathcal{o}_i \mid \mathcal{M}_{1:(i - 1)} (d) = \mathcal{o}_{1:(i-1)}]} {\text{Pr}[\mathcal{M} (d') = \mathcal{o}_i \mid \mathcal{M}_{1:(i - 1)} (d') = \mathcal{o}_{1:(i-1)}]} \\&= \sum_{i=1}^k \log \frac {\text{Pr} [\mathcal{M} (d) = \mathcal{o}_i \mid \mathcal{M}_{1:(i - 1)} (d) = \mathcal{o}_{1:(i-1)}]} {\text{Pr}[\mathcal{M} (d') = \mathcal{o}_i \mid \mathcal{M}_{1:(i - 1)} (d') = \mathcal{o}_{1:(i-1)}]} \\&= \sum_{i=1}^k \mathcal{c} (\mathcal{o}_i ; \mathcal{M}_i, \mathcal{o}_{1: (i-1)}, d, d') \end{align*}$

이므로, 다음을 얻을 수 있다.

$\begin{align*} & \mathbb{E}_{\mathcal{o}'_{1:k} \sim \mathcal{M}_{1:k}(d)} \left[ \exp (\lambda\mathcal{c} (\mathcal{o}'_{1:k}; \mathcal{M}_{1:k}, d, d')) \mid \mathcal{o}'_i = \mathcal{o} \text{ for all } i < k \right] \\&= \mathbb{E}_{\mathcal{o}'_{1:k} \sim \mathcal{M}_{1:k}(d)} \left[ \exp \left( \lambda \sum_{i=1}^k \mathcal{c} (\mathcal{o}_i ; \mathcal{M}_i, \mathcal{o}_{1: (i-1)}, d, d') \right) \right] \\&= \mathbb{E}_{\mathcal{o}'_{1:k} \sim \mathcal{M}_{1:k}(d)} \left[ \prod_{i=1}^k \exp \left( \lambda \mathcal{c} (\mathcal{o}_i ; \mathcal{M}_i, \mathcal{o}_{1: (i-1)}, d, d') \right) \right] \\&= \prod_{i=1}^k \mathbb{E}_{\mathcal{o}'_{1:k} \sim \mathcal{M}_{1:k}(d)} \left[ \exp \left( \lambda \mathcal{c} (\mathcal{o}_i ; \mathcal{M}_i, \mathcal{o}_{1: (i-1)}, d, d') \right) \right] \\&= \prod_{i=1}^k \exp \left( \alpha_{\mathcal{M}_i} (\lambda; \mathcal{o}_{1:(i-1)}, d, d') \right) \; (\because \alpha_{\mathcal{M}} (\lambda ; \text{aux}, d, d') = \log \mathbb{E}_{\mathcal{o} \sim \mathcal{M}(\text{aux}, d)} [e^{\lambda \mathcal{c} (\mathcal{o} ; \mathcal{M}, \text{aux}, d, d')}]) \\&= \exp \left( \sum_{i=1}^k \left( \alpha_{\mathcal{M}_i} (\lambda; \mathcal{o}_{1:(i-1)}, d, d') \right) \right) \end{align*}$

따라서, $\sum_{i=1}^k \alpha_{\mathcal{M}_i} (\lambda)$ 이 $\alpha_{\mathcal{M}} (\lambda)$ 의 upper bound임을 알 수 있다. $\square$

2. [Tail Bound]

Markov's inequality에 의하여 다음이 성립한다. $(1)$

$\begin{align*} \text{Pr}_{\mathcal{o} \sim \mathcal{M} (d)} \left[ \mathcal{c} (\mathcal{o}) \geq \epsilon \right] &= \text{Pr}_{\mathcal{o} \sim \mathcal{M} (d)} \left[ e^{\lambda \mathcal{c} (\mathcal{o})} \geq e^{\lambda \epsilon} \right] \; (\because \lambda > 0) \\&\leq \frac {\mathbb{E}_{\mathcal{o} \sim \mathcal{M} (d)} \left[ e^{\lambda \mathcal{c} (\mathcal{o})} \right]} {e^{\lambda \epsilon}} \; (\because \text{ Markov's Inequality}) \\&\leq \frac {e^{\alpha_{\mathcal{M}} (\lambda)}} {e^{\lambda \epsilon}} = e^{\alpha_{\mathcal{M}} (\lambda) - \lambda \epsilon} \end{align*}$

따라서, $B := \{ \mathcal{o} \mid \mathcal{c} (\mathcal{o}) \geq \epsilon \}$ 로 정의할 때, 임의의 $S$ 에 대해서

$\begin{align*} \text{Pr} [\mathcal{M} (d) \in S] &= \text{Pr} [\mathcal{M} (d) \in S \cap B^C] + \text{Pr} [\mathcal{M} (d) \in S \cap B] \\&\leq e^{\epsilon} \text{Pr} [\mathcal{M} (d') \in S \cap B^C] + \text{Pr} [\mathcal{M} (d) \in S \cap B] \; (\because \epsilon \text{-DP}) \\&\leq e^{\epsilon} \text{Pr} [\mathcal{M} (d') \in S] + \text{Pr} [\mathcal{M} (d) \in B] \; (\because S \cap B^C \subset S, S \cap B \subset B) \\&\leq e^{\epsilon} \text{Pr} [\mathcal{M} (d') \in S] + e^{\alpha_{\mathcal{M}} (\lambda) - \lambda \epsilon} \; (\because (1)) \end{align*}$

이고, 그러므로 $\delta := \min_{\lambda} \; e^{ \left( \alpha_{\mathcal{M}}(\lambda) - \lambda \epsilon \right) }$ 로 $\delta$ 를 지정할 경우 $\mathcal{M}$ 은 $(\epsilon, \delta)$ -DP를 만족한다. $\square$

$\text{Theorem 2.1}$ 덕분에, 우리는 model 전체의 privacy cost를 고려하지 않아도 됩니다. 그 대신에, $\alpha_{\mathcal{M}_i} (\lambda)$ 를 매 $\mathcal{M}_i$ (즉, 모든 step)마다 bound한 후, 이들을 모두 더하는 방법을 택할 것입니다. 그러면 $\text{Theorem 2.2}$ 에 의하여 $\mathcal{M}$ 은 $(\epsilon, \delta)$ -DP가 보장됩니다. 그렇다면, 이제 남은 부분은 $\alpha_{\mathcal{M}_i} (\lambda)$ 를 bound할 수 있는 방법을 찾는 것인데, 이에 관한 이야기는 다음 포스트에서 계속 하도록 하겠습니다. 이 이야기가 마무리되면, 잠시 뒤로 미루어놓았던 $\text{Theorem 1}$ 의 증명을 할 수 있고, 그 이후로 machine learning model들에 마음껏 DP를 사용할 수 있게 될 것입니다.

'Privacy Preserving > Papers' 카테고리의 다른 글

[CCS 2016] Deep Learning with DP - (6) (1)	2022.10.11
[CCS 2016] Deep Learning with DP - (5) (1)	2022.10.07
[CCS 2016] Deep Learning with DP - (4) (4)	2022.10.05
[CCS 2016] Deep Learning with DP - (2) (0)	2022.09.21
[CCS 2016] Deep Learning with DP - (1) (2)	2022.09.19

'Privacy Preserving/Papers' Related Articles

Comments

내 블로그 - 관리자 홈 전환	`Q` `Q`
새 글 쓰기	`W` `W`

글 수정 (권한 있는 경우)	`E` `E`
댓글 영역으로 이동	`C` `C`

이 페이지의 URL 복사	`S` `S`
맨 위로 이동	`T` `T`
티스토리 홈 이동	`H` `H`
단축키 안내	`Shift` + `/` `⇧` + `/`

Federated Learning

Federated Learning

[CCS 2016] Deep Learning with DP - (3) 본문

[CCS 2016] Deep Learning with DP - (3)

'Privacy Preserving > Papers' 카테고리의 다른 글

티스토리툴바

개인정보

단축키

내 블로그

블로그 게시글

모든 영역