
AIGC: Notes on Composable Diffusion Models

Author: 刹那-Ksana-, published 2023-09-14

Energy Based Models

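Consider an energy function $E_\theta(\mathbf{x}):\mathbb{R}^D \to \mathbb{R}^1$. This energy function can be represented by a neural network, although it does not have to be.

The energy defines a distribution $p_\theta(\mathbf{x})\propto e^{-E_\theta(\mathbf{x})}$. This "probability distribution" $e^{-E_\theta(\mathbf{x})}$ is unnormalized: its integral over $\mathbf{x}$ is generally not 1. We could normalize it by dividing by $Z_\theta=\int_\mathbf{x} e^{-E_\theta(\mathbf{x})}d\mathbf{x}$, but this integral is usually intractable, and it is outside the scope of these notes.

From $p_\theta(\mathbf{x})\propto e^{-E_\theta(\mathbf{x})}$ we can see that if $\mathbf{x}$ is drawn from the data distribution (say the data are pictures of apples and $\mathbf{x}$ is also a picture of an apple), then $e^{-E_\theta(\mathbf{x})}$ should be relatively large; in other words, $E_\theta(\mathbf{x})$ should be relatively small. Conversely, an $\mathbf{x}$ that does not resemble the data should have a large energy.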
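To make the relationship between energy, unnormalized density, and $Z_\theta$ concrete, here is a minimal sketch in one dimension. The quadratic `energy` function is a hypothetical stand-in chosen for illustration (it is not a trained network), and the integral is only computable here because the problem is 1-D.

```python
import math

def energy(x, mu=0.0):
    """Hypothetical 1-D energy: small near mu (where the 'data' live), large far away."""
    return 0.5 * (x - mu) ** 2

def unnormalized_density(x):
    """exp(-E(x)); this does not integrate to 1."""
    return math.exp(-energy(x))

def partition_function(lo=-10.0, hi=10.0, steps=20000):
    """Approximate Z = integral of exp(-E(x)) dx with the midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(unnormalized_density(lo + (i + 0.5) * h) for i in range(steps))

Z = partition_function()
print(unnormalized_density(0.0) > unnormalized_density(3.0))  # low energy means high density
print(Z)                              # close to sqrt(2 * pi) for this quadratic energy
print(unnormalized_density(0.0) / Z)  # normalized density at x = 0
```

In high dimensions this kind of grid integration is not feasible, which is why the sampling methods in these notes work with the energy gradient and never need $Z_\theta$.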

Given such an energy function, we can draw samples from it with Langevin dynamics:

$$\tilde{\mathbf{x}}^{k}=\tilde{\mathbf{x}}^{k-1}-\frac{\lambda}{2}\nabla_\mathbf{x}E_\theta (\tilde{\mathbf{x}}^{k-1})+\omega^k, \quad \omega^k \sim \mathcal{N}(0,\lambda)$$

Here $\lambda$ is the step size and $\omega^k$ is Gaussian noise with variance $\lambda$.
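The Langevin update can be sketched in a few lines. This is an illustrative 1-D example under stated assumptions: `grad_energy` is the gradient of a hypothetical quadratic energy centered at `mu = 2.0`, and the step size and step count are arbitrary choices, not values from the paper.

```python
import math
import random

def grad_energy(x, mu=2.0):
    """Gradient of the hypothetical energy E(x) = 0.5 * (x - mu)^2."""
    return x - mu

def langevin_sample(grad_e, rng, x0=0.0, step=0.1, n_steps=300):
    """Run x^k = x^{k-1} - (step / 2) * grad E(x^{k-1}) + omega^k."""
    x = x0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, math.sqrt(step))  # omega^k ~ N(0, step); gauss takes a std dev
        x = x - 0.5 * step * grad_e(x) + noise
    return x

rng = random.Random(0)
samples = [langevin_sample(grad_energy, rng) for _ in range(1000)]
mean = sum(samples) / len(samples)
print(mean)  # should land near mu = 2.0, where the energy is lowest
```

Only the gradient of the energy appears in the loop, so the normalizing constant is never needed.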

Suppose we have a set of features $c_1,c_2,\ldots,c_n$ (for example facial appearance, skin color, hair color), and for each feature we have trained a corresponding energy function $E_\theta(\mathbf{x}|c_1),E_\theta(\mathbf{x}|c_2),\ldots,E_\theta(\mathbf{x}|c_n)$. To combine these features (conjunction), we can simply multiply the corresponding "probability distributions":

$$p(\mathbf{x}|c_1 \text{ and } c_2 \text{ and } \cdots \text{ and } c_n)=\prod_i p(\mathbf{x}|c_i) \propto e^{-\sum_i E(\mathbf{x}|c_i)}$$

The corresponding sampling update is therefore:

$$\tilde{\mathbf{x}}^{k}=\tilde{\mathbf{x}}^{k-1}-\frac{\lambda}{2}\nabla_\mathbf{x} \sum_i E_\theta (\tilde{\mathbf{x}}^{k-1}|c_i)+\omega^k, \quad \omega^k \sim \mathcal{N}(0,\lambda)$$
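As a rough illustration of conjunction, the sketch below sums the gradients of two per-concept energies and runs the same Langevin loop on the sum. The two quadratic energies (centered at 0.0 and 2.0) are hypothetical placeholders for trained models, and `make_grad` and `conjunction_grad` are names invented for this example.

```python
import math
import random

def make_grad(mu):
    """Gradient of a hypothetical per-concept energy E(x|c) = 0.5 * (x - mu)^2."""
    return lambda x: x - mu

def conjunction_grad(grads):
    """Gradient of the summed energy: sum_i grad E(x|c_i)."""
    return lambda x: sum(g(x) for g in grads)

def langevin(grad_e, rng, x0=0.0, step=0.1, n_steps=300):
    """Langevin dynamics on whatever energy gradient is passed in."""
    x = x0
    for _ in range(n_steps):
        x = x - 0.5 * step * grad_e(x) + rng.gauss(0.0, math.sqrt(step))
    return x

rng = random.Random(0)
grad_and = conjunction_grad([make_grad(0.0), make_grad(2.0)])
samples = [langevin(grad_and, rng) for _ in range(1000)]
mean = sum(samples) / len(samples)
print(mean)  # should land near 1.0, the point both concepts find acceptable
```

For these two toy energies the product of densities peaks midway between the two modes, which is what the samples reflect.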

Besides conjunction, there is also an "or" operation (disjunction):

\begin{align*}
&p(\mathbf{x}|c_1 \text{ or } c_2 \text{ or } \cdots \text{ or } c_n)\propto \sum_i p(\mathbf{x}|c_i)/Z(c_i) \\
&\sum_i p(\mathbf{x}|c_i) \propto \sum_i e^{-E(\mathbf{x}|c_i)} =e^{\text{logsumexp}(-E(\mathbf{x}|c_1),-E(\mathbf{x}|c_2),\ldots,-E(\mathbf{x}|c_n))} \\
&\tilde{\mathbf{x}}^{k}=\tilde{\mathbf{x}}^{k-1}+\frac{\lambda}{2}\nabla_\mathbf{x} \text{logsumexp}(-E(\mathbf{x}|c_1),-E(\mathbf{x}|c_2),\ldots,-E(\mathbf{x}|c_n))+\omega^k, \quad \omega^k \sim \mathcal{N}(0,\lambda)
\end{align*}

Since $p \propto e^{-E}$, the effective energy of the disjunction is $-\text{logsumexp}(-E(\mathbf{x}|c_1),\ldots,-E(\mathbf{x}|c_n))$, which is why the gradient of the logsumexp term enters the update with a plus sign.
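The disjunction energy and its gradient can be sketched as follows. The two quadratic energies are hypothetical stand-ins for trained models. The gradient of $-\text{logsumexp}(-E_1,\ldots,-E_n)$ works out to a softmax-weighted average of the individual energy gradients, which is what `disjunction_grad` computes.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def disjunction_energy(energies, x):
    """Effective energy of 'c_1 or ... or c_n': -logsumexp(-E_1(x), ..., -E_n(x))."""
    return -logsumexp([-e(x) for e in energies])

def disjunction_grad(energies, grads, x):
    """Gradient of the effective energy: softmax(-E)-weighted sum of grad E_i."""
    neg = [-e(x) for e in energies]
    lse = logsumexp(neg)
    weights = [math.exp(v - lse) for v in neg]  # softmax weights, they sum to 1
    return sum(w * g(x) for w, g in zip(weights, grads))

# Hypothetical concept energies with modes at -2 and +2.
energies = [lambda x: 0.5 * (x + 2.0) ** 2, lambda x: 0.5 * (x - 2.0) ** 2]
grads = [lambda x: x + 2.0, lambda x: x - 2.0]

print(disjunction_energy(energies, 2.0) < disjunction_energy(energies, 0.0))  # either mode is low energy
print(disjunction_grad(energies, grads, 0.3))
```

Near one concept's mode the softmax weight of that concept dominates, so the sampler behaves as if only that concept were active.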

And a "not" operation (negation):

\begin{align*}
&p(\mathbf{x}|(\text{not } c_1), c_2)\propto \frac{p(\mathbf{x}|c_2)}{p(\mathbf{x}|c_1)^\alpha} \propto e^{\alpha E(\mathbf{x}|c_1)-E(\mathbf{x}|c_2)} \\
&\tilde{\mathbf{x}}^{k}=\tilde{\mathbf{x}}^{k-1}-\frac{\lambda}{2}\nabla_\mathbf{x} \bigl(E(\mathbf{x}|c_2)-\alpha E(\mathbf{x}|c_1)\bigr)+\omega^k, \quad \omega^k \sim \mathcal{N}(0,\lambda)
\end{align*}

Since $p \propto e^{-E}$, the effective energy here is $E(\mathbf{x}|c_2)-\alpha E(\mathbf{x}|c_1)$: the sampler descends the energy of the concept to keep and ascends the energy of the concept to avoid.
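A small deterministic sketch shows the effect of negation. The energies are hypothetical quadratics (keep a concept centered at 2.0, avoid one centered at 0.0), and the noise term is dropped so that the loop simply finds the mode of the composed density. With these toy energies, `alpha` must stay below 1 or the composed energy is no longer bounded below.

```python
def grad_keep(x, mu=2.0):
    """Gradient of the hypothetical energy E(x|c_2) = 0.5 * (x - mu)^2 (concept to keep)."""
    return x - mu

def grad_avoid(x, mu=0.0):
    """Gradient of the hypothetical energy E(x|c_1) = 0.5 * (x - mu)^2 (concept to negate)."""
    return x - mu

def negation_grad(x, alpha=0.5):
    """Gradient of the effective energy E(x|c_2) - alpha * E(x|c_1)."""
    return grad_keep(x) - alpha * grad_avoid(x)

def descend(x0=0.0, step=0.1, n_steps=1000):
    """Noise-free version of the update, used here only to locate the mode."""
    x = x0
    for _ in range(n_steps):
        x = x - 0.5 * step * negation_grad(x)
    return x

mode = descend()
print(mode)  # converges to about 4.0: pushed past 2.0, away from the negated concept at 0.0
```

The mode moves from 2.0 (keep only) to 4.0 because the negated concept repels samples from its own mode.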

Composing Diffusion Models

In a diffusion model, the noise predictor $\epsilon_\theta(\mathbf{x}_t, t)$ plays a role similar to the energy gradient $\nabla_{\mathbf{x}} E_\theta(\mathbf{x}_t)$, so the denoising update

$$\mathbf{x}_{t-1} = \mathbf{x}_t - \epsilon_\theta(\mathbf{x}_t, t) + \mathcal{N}(0, \sigma_t^2 I)$$

has the same form as the Langevin update above.

The paper describes two operations here. The first is conjunction (AND):

$$p(\mathbf{x}|\mathbf{c}_1, \ldots, \mathbf{c}_n) \propto p(\mathbf{x}) \prod_{i=1}^n \frac{p(\mathbf{x}|\mathbf{c}_i)}{p(\mathbf{x})}$$

Since $\epsilon_\theta(\mathbf{x}_t,t)$ stands in for $\nabla_{\mathbf{x}} E_\theta(\mathbf{x}_t) = -\nabla_{\mathbf{x}} \log p(\mathbf{x}_t)$ (up to a scale factor), taking the gradient of the log of the expression above and putting a weight $w_i$ on each concept, we have:

$$\hat{\epsilon}(\mathbf{x}_t, t) = \epsilon_\theta(\mathbf{x}_t, t) + \sum_{i=1}^n w_i \bigl(\epsilon_\theta(\mathbf{x}_t, t | \mathbf{c}_i) - \epsilon_\theta(\mathbf{x}_t, t)\bigr)$$
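The AND composition is just arithmetic on the predicted noise vectors. The sketch below uses plain lists of floats in place of image tensors, and the function name `compose_and` and the numeric inputs are made up for illustration; in practice each vector would come from one forward pass of the noise predictor.

```python
def compose_and(eps_uncond, eps_conds, weights):
    """eps_hat = eps + sum_i w_i * (eps_i - eps), computed elementwise."""
    out = list(eps_uncond)
    for eps_c, w in zip(eps_conds, weights):
        for k in range(len(out)):
            out[k] += w * (eps_c[k] - eps_uncond[k])
    return out

eps_uncond = [0.0, 1.0]               # stand-in for eps_theta(x_t, t)
eps_conds = [[1.0, 1.0], [0.0, 3.0]]  # stand-ins for eps_theta(x_t, t | c_1) and (x_t, t | c_2)
weights = [2.0, 0.5]                  # w_1, w_2

eps_hat = compose_and(eps_uncond, eps_conds, weights)
print(eps_hat)  # [2.0, 2.0]
```

With a single concept the expression reduces to $\epsilon + w(\epsilon_c - \epsilon)$, the same form used by classifier-free guidance.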

The second is negation (NOT):

$$p(\mathbf{x}|\text{not }\tilde{\mathbf{c}}_j, \mathbf{c}_i) \propto p(\mathbf{x})\frac{p(\mathbf{x} | \mathbf{c}_i)}{p(\mathbf{x} | \tilde{\mathbf{c}}_j)}$$

$$\hat{\epsilon}(\mathbf{x}_t, t) = \epsilon_\theta(\mathbf{x}_t, t) + w \bigl(\epsilon_\theta(\mathbf{x}_t, t | \mathbf{c}_i) - \epsilon_\theta(\mathbf{x}_t, t | \tilde{\mathbf{c}}_j)\bigr)$$
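The NOT composition can be sketched the same way. Again the vectors are placeholder lists of floats rather than real model outputs, and `compose_not` is a name chosen for this example.

```python
def compose_not(eps_uncond, eps_keep, eps_avoid, w):
    """eps_hat = eps + w * (eps(.|c_i) - eps(.|c_j_tilde)), computed elementwise."""
    return [e + w * (k - a) for e, k, a in zip(eps_uncond, eps_keep, eps_avoid)]

eps_uncond = [0.0, 1.0]  # stand-in for eps_theta(x_t, t)
eps_keep = [1.0, 1.0]    # stand-in for eps_theta(x_t, t | c_i)
eps_avoid = [0.0, 2.0]   # stand-in for eps_theta(x_t, t | c_j_tilde)

eps_hat = compose_not(eps_uncond, eps_keep, eps_avoid, w=2.0)
print(eps_hat)  # [2.0, -1.0]
```

When the two conditional predictions agree, the correction vanishes and the result falls back to the unconditional prediction.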

Finally, each step of the iteration samples from:

$$p_{\text{compose}}( \mathbf{x}_{t-1}|\mathbf{x}_t) := \mathcal{N} \bigl(\mathbf{x}_t - \hat{\epsilon}(\mathbf{x}_t, t), \sigma_t^2 I \bigr)$$
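Putting the pieces together, a composed reverse process might look like the sketch below. Everything model-related is an assumption made for illustration: `toy_eps` is a hand-written linear function standing in for a trained noise predictor, the noise level `sigma` is held constant instead of following a schedule $\sigma_t$, and the data are 2-D points rather than images.

```python
import random

def toy_eps(x, t, mu=0.0, scale=0.1):
    """Hypothetical noise predictor: scale * (x - mu). Not a trained network; t is unused."""
    return [scale * (xi - mu) for xi in x]

def compose_and(eps_uncond, eps_conds, weights):
    """eps_hat = eps + sum_i w_i * (eps_i - eps), computed elementwise."""
    out = list(eps_uncond)
    for eps_c, w in zip(eps_conds, weights):
        for k in range(len(out)):
            out[k] += w * (eps_c[k] - eps_uncond[k])
    return out

def sample(T=200, sigma=0.05, weights=(1.0, 1.0), seed=0):
    """Iterate x_{t-1} ~ N(x_t - eps_hat(x_t, t), sigma^2 I) from t = T down to 1."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(2)]  # start from Gaussian noise
    for t in range(T, 0, -1):
        eps = toy_eps(x, t)                                      # unconditional prediction
        eps_conds = [toy_eps(x, t, mu=1.0), toy_eps(x, t, mu=3.0)]  # two toy concepts
        eps_hat = compose_and(eps, eps_conds, weights)
        x = [xi - ei + sigma * rng.gauss(0.0, 1.0) for xi, ei in zip(x, eps_hat)]
    return x

x_final = sample()
print(x_final)  # both coordinates end up near 4.0 for this toy setup
```

In this toy setup each concept shifts the prediction relative to the unconditional one (by 1 and by 3), and with unit weights the shifts add, so the samples settle around 4.0 rather than at either concept's own center.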

End.


