GAN

intro

GAN lecture, Hung-yi Lee (李宏毅)

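Why output a distribution? The same input can have several valid outputs, especially for tasks that need creativity, so the network should output a distribution rather than a single answer.
Generative adversarial network (GAN). Analogy: the dead-leaf butterfly, which keeps evolving against its predators; that is the "adversarial" part. Written as enemies, read as friends: the two networks compete, but each makes the other better. The generator is adjusted so that the discriminator's score for its output is as high as possible. Training alternates: fix one network and train the other. The discriminator learns to assign high scores to real objects and low scores to generated objects. The generator learns to “fool” the discriminator.
The machine can learn that interpolating between a face looking left and a face looking right gives a face looking straight ahead.
The discriminator gives a high score when it sees a real image and a low score when it sees a generated one.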
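The alternating scheme (fix one network, train the other) can be sketched on a toy 1-D problem. This is only an illustration under loud assumptions that are not from the lecture: the "generator" is a single shift `G(z) = z + b`, the "discriminator" is a logistic unit `D(x) = sigmoid(w * x + c)`, real data is drawn from N(4, 1), and the generator step uses the common "maximize log D(G(z))" form. The function name `train_toy_gan` is hypothetical.

```python
import math
import random

def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def train_toy_gan(steps=200, batch=64, lr=0.05, seed=0):
    rng = random.Random(seed)
    b = 0.0          # generator parameter: G(z) = z + b (assumed toy model)
    w, c = 0.0, 0.0  # discriminator parameters: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]
        fake = [rng.gauss(0.0, 1.0) + b for _ in range(batch)]

        # Discriminator step (generator fixed): gradient ascent on
        # mean log D(real) + mean log(1 - D(fake)).
        d_real = [sigmoid(w * x + c) for x in real]
        d_fake = [sigmoid(w * x + c) for x in fake]
        grad_w = (sum((1 - d) * x for d, x in zip(d_real, real))
                  - sum(d * x for d, x in zip(d_fake, fake))) / batch
        grad_c = (sum(1 - d for d in d_real) - sum(d_fake)) / batch
        w += lr * grad_w
        c += lr * grad_c

        # Generator step (discriminator fixed): gradient ascent on
        # mean log D(G(z)), i.e. raise the score of generated samples.
        fake = [rng.gauss(0.0, 1.0) + b for _ in range(batch)]
        grad_b = sum((1 - sigmoid(w * x + c)) * w for x in fake) / batch
        b += lr * grad_b
    return b, w, c
```

Calling `train_toy_gan()` returns the final `(b, w, c)`. Real GANs replace both models with deep networks and rely on automatic differentiation, but the two alternating update blocks keep the same shape.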

theory behind GAN

objective

The maximum objective value is related to JS divergence.
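For reference, the objective in its usual textbook form, written out here because the notes only name it. The symbols `V`, `P_data` and `P_G` are notation assumed for this sketch, not copied from the slides.

```latex
\begin{aligned}
V(G, D) &= \mathbb{E}_{x \sim P_{data}}\big[\log D(x)\big]
         + \mathbb{E}_{x \sim P_G}\big[\log\big(1 - D(x)\big)\big] \\
D^{*} &= \arg\max_{D} V(G, D), \qquad
G^{*} = \arg\min_{G} \max_{D} V(G, D) \\
\max_{D} V(G, D) &= -2\log 2 + 2\,\mathrm{JSD}\big(P_{data} \,\|\, P_G\big)
\end{aligned}
```

The last line is the sense in which the maximum objective value is related to the JS divergence: training the discriminator to its optimum measures, up to constants, how far `P_G` is from `P_data`.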
You can use whichever divergence you like.
GAN is difficult to train
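JS divergence is not suitable. When the two distributions do not overlap, the JS divergence is always log 2, so a generated distribution that is clearly closer to the real one (the middle case) looks no better than one that is far away (the left case).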
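The log 2 observation can be checked numerically. A minimal sketch, assuming small discrete distributions on five bins (made-up toy values, not data from the lecture):

```python
import math

def kl(p, q):
    # KL(p || q) for discrete distributions; terms with p_i = 0 contribute 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    # Jensen-Shannon divergence: average KL to the mixture m = (p + q) / 2.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

real = [1.0, 0.0, 0.0, 0.0, 0.0]
far  = [0.0, 0.0, 0.0, 0.0, 1.0]   # generated mass far from the real mass
near = [0.0, 1.0, 0.0, 0.0, 0.0]   # generated mass right next to it

js_far = js(real, far)
js_near = js(real, near)
```

Both `js_far` and `js_near` equal log 2, even though `near` is obviously the better generator. The divergence gives no signal until the supports overlap.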
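Wasserstein distance. Think of moving one distribution onto the other; when both are single points a distance d apart, the Wasserstein distance is d. In general there are many possible “moving plans”, and the Wasserstein distance is defined by the plan with the smallest average distance. The distances d0, d1, d2 get smaller as the generated distribution moves closer to the real one, so the improvement is visible.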
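A small sketch of why this distance is more informative, assuming 1-D discrete distributions on an evenly spaced grid with spacing 1. In that special case the cheapest moving plan has a closed form: the sum of absolute differences between the two cumulative distributions. The toy distributions are made up for illustration.

```python
from itertools import accumulate

def wasserstein_1d(p, q):
    # W1 on an evenly spaced 1-D grid (spacing 1):
    # sum over bins of |CDF_p - CDF_q|.
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))

real = [1.0, 0.0, 0.0, 0.0, 0.0]
far  = [0.0, 0.0, 0.0, 0.0, 1.0]   # all mass must travel 4 bins
near = [0.0, 1.0, 0.0, 0.0, 0.0]   # all mass must travel 1 bin

w_far = wasserstein_1d(real, far)
w_near = wasserstein_1d(real, near)
```

Here `w_far` is 4 and `w_near` is 1, so the distance decreases as the generated mass approaches the real mass, even without any overlap.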
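WGAN. The discriminator has to be smooth enough (1-Lipschitz), so the scores for real and generated samples cannot differ too much: it cannot push one very high and the other very low. With this constraint the computed value stays relatively small.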
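The smoothness requirement appears in the usual WGAN formulation as a constraint on the discriminator. The notation (`P_data`, `P_G`) is assumed for this sketch:

```latex
W(P_{data}, P_G) =
  \max_{D \,\in\, 1\text{-Lipschitz}}
  \Big\{ \mathbb{E}_{x \sim P_{data}}\big[D(x)\big]
       - \mathbb{E}_{x \sim P_G}\big[D(x)\big] \Big\}
```

Without the 1-Lipschitz constraint, the maximization could send `D` to plus infinity on real samples and minus infinity on generated ones, and the value would never converge.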
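The generator and discriminator need to match each other (棋逢敵手, evenly matched opponents). If the generator cannot fool the discriminator, or the discriminator cannot tell real from generated, the other one stops improving.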
Papers and tips worth consulting when training a GAN:
- Tips from Soumith: https://github.com/soumith/ganhacks
- Tips in DCGAN (guideline for network architecture design for image generation): https://arxiv.org/abs/1511.06434
- Improved techniques for training GANs: https://arxiv.org/abs/1606.03498
- Tips from BigGAN: https://arxiv.org/abs/1809.11096
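Generating a piece of text with a GAN is the hardest case.
When the decoder's parameters change slightly, the final output (the chosen token) does not change, so there is nothing to differentiate. (A CNN with max pooling can still be differentiated; think about why.)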
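One way to think about the max pooling question is to compare `argmax` (which picks a discrete token index) with `max` (which is what max pooling outputs) under a tiny perturbation. This is a finite-difference sketch with made-up logits, not code from the course.

```python
def argmax(xs):
    # Index of the largest element (what token selection returns).
    return max(range(len(xs)), key=xs.__getitem__)

logits = [1.0, 3.0, 2.0]
eps = 1e-3
bumped = [1.0, 3.0 + eps, 2.0]   # nudge the winning logit slightly

# The chosen index is a step function of the logits: it stays at 1,
# so its finite-difference slope is exactly 0.
d_argmax = (argmax(bumped) - argmax(logits)) / eps

# The max *value* moves with the winning logit: slope of about 1.
d_max = (max(bumped) - max(logits)) / eps
```

`d_argmax` is 0 while `d_max` is close to 1. Max pooling passes the maximum value forward, so a gradient still flows through the winning element; choosing a token passes only its index, which is flat almost everywhere.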
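Evaluating a GAN. (1) In hw6, run a face detection system and count how many of your generated faces it detects. (2) Run an image classification system on each generated image. The more concentrated the class probability distribution, the better the image: the classifier is confident. If it cannot tell whether the image is a cat or a dog, it is confused and the distribution looks odd. This approach can still run into mode collapse: the generator finds a blind spot of the discriminator that the discriminator cannot fix, and keeps fooling it there. This problem is still unsolved.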

conditional GAN

Multi-label Image Classifier = Conditional Generator

learning from unpaired data

Can we learn the mapping without any paired data? Unsupervised Conditional Generation
cycle GAN
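- Two generators: the first maps the input to the target domain and the second maps it back. The second generator's output should be as close to the original input as possible, which forces the first generator's output to stay related to the input.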
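The cycle idea can be sketched as a reconstruction loss. The two "generators" below are hypothetical scalar functions chosen only to make the arithmetic visible, and the L1 distance is an assumed choice of reconstruction measure.

```python
def cycle_loss(g_xy, g_yx, xs):
    # Mean L1 distance between each input x and its reconstruction
    # g_yx(g_xy(x)) after a round trip through both generators.
    return sum(abs(g_yx(g_xy(x)) - x) for x in xs) / len(xs)

xs = [0.0, 1.0, 2.5, -3.0]

# Toy generators (assumptions): X -> Y doubles and shifts, Y -> X inverts it.
g_xy = lambda x: 2.0 * x + 1.0
g_yx = lambda y: (y - 1.0) / 2.0

# A first generator that ignores its input cannot be inverted.
g_xy_ignoring = lambda x: 5.0

loss_consistent = cycle_loss(g_xy, g_yx, xs)
loss_ignoring = cycle_loss(g_xy_ignoring, g_yx, xs)
```

`loss_consistent` is 0 because the round trip recovers every input, while `loss_ignoring` is positive: a first generator that throws the input away is penalized, which is the purpose of the second generator.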

StarGAN

evaluation of generation

Diversity - Mode Collapse
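Diversity - Mode Dropping. Differences in skin color are too easy to detect. The same few images keep showing up, and after seeing enough of them you can tell they are generated.
So we need to measure whether the generated images are diverse enough.
Quality is checked per image: feed one image to the classifier and see whether its class distribution is concentrated. Diversity is checked over a set of images: average their class distributions, and the more uniform the average, the higher the diversity.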
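Both checks can be expressed with entropy: low entropy for each single image's class distribution, high entropy for the average over many images. A sketch with made-up classifier outputs over three classes (the numbers are illustrative, not real results):

```python
import math

def entropy(p):
    # Shannon entropy in nats; zero-probability classes contribute 0.
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def average(dists):
    # Class distribution averaged over a set of images.
    return [sum(col) / len(dists) for col in zip(*dists)]

# Hypothetical classifier outputs for two sets of generated images.
diverse = [[0.90, 0.05, 0.05],
           [0.05, 0.90, 0.05],
           [0.05, 0.05, 0.90]]
collapsed = [[0.90, 0.05, 0.05]] * 3   # every image is the same class

per_image = [entropy(p) for p in diverse]          # quality: want low
diversity_diverse = entropy(average(diverse))      # diversity: want high
diversity_collapsed = entropy(average(collapsed))
```

Each image in both sets is classified confidently, so per-image quality looks identical. Only the entropy of the averaged distribution separates them: it is near log 3 for `diverse` and much lower for `collapsed`.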
FID: a smaller distance means the generated images are closer to the real ones. FID requires a lot of computation.
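For reference, FID is usually computed as the Fréchet distance between two Gaussians fitted to classifier features of real and generated images. The symbols below (means `\mu`, covariances `\Sigma`, subscripts `r` for real and `g` for generated) are notation assumed for this sketch:

```latex
\mathrm{FID} =
  \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\Big( \Sigma_r + \Sigma_g
  - 2\,\big(\Sigma_r \Sigma_g\big)^{1/2} \Big)
```

Estimating the means and especially the covariance matrices reliably takes many samples, which is one reason the metric is expensive.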
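We don’t want a memory GAN. Sometimes the results look very good but are just images from the training set, whereas the generator is supposed to produce new ones. One suggestion is to compare similarity with the training images, but the generator may have simply flipped a training image, which makes this check hard too.
How to evaluate whether a generator is good is still an open problem.
A discriminator is added to check whether the image and its description match. With a GAN alone, the model may be too imaginative and create things that do not exist. A better approach is to combine GAN with supervised learning.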

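I think a good course not only tells you the what and the why, but also leads you to think about new things, to keep asking questions and then answering them. It makes me ask: what solution could I come up with, and why did I not think of this possible problem? It does not just pour knowledge into you.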