$$ \newcommand{\dint}{\mathrm{d}} \newcommand{\vphi}{\boldsymbol{\phi}} \newcommand{\vpi}{\boldsymbol{\pi}} \newcommand{\vpsi}{\boldsymbol{\psi}} \newcommand{\vomg}{\boldsymbol{\omega}} \newcommand{\vsigma}{\boldsymbol{\sigma}} \newcommand{\vzeta}{\boldsymbol{\zeta}} \renewcommand{\vx}{\mathbf{x}} \renewcommand{\vy}{\mathbf{y}} \renewcommand{\vz}{\mathbf{z}} \renewcommand{\vh}{\mathbf{h}} \renewcommand{\b}{\mathbf} \renewcommand{\vec}{\mathrm{vec}} \newcommand{\vecemph}{\mathrm{vec}} \newcommand{\mvn}{\mathcal{MN}} \newcommand{\G}{\mathcal{G}} \newcommand{\M}{\mathcal{M}} \newcommand{\N}{\mathcal{N}} \newcommand{\S}{\mathcal{S}} \newcommand{\diag}[1]{\mathrm{diag}(#1)} \newcommand{\diagemph}[1]{\mathrm{diag}(#1)} \newcommand{\tr}[1]{\text{tr}(#1)} \renewcommand{\C}{\mathbb{C}} \renewcommand{\R}{\mathbb{R}} \renewcommand{\E}{\mathbb{E}} \newcommand{\D}{\mathcal{D}} \newcommand{\inner}[1]{\langle #1 \rangle} \newcommand{\innerbig}[1]{\left \langle #1 \right \rangle} \newcommand{\abs}[1]{\lvert #1 \rvert} \newcommand{\norm}[1]{\lVert #1 \rVert} \newcommand{\two}{\mathrm{II}} \newcommand{\GL}{\mathrm{GL}} \newcommand{\Id}{\mathrm{Id}} \newcommand{\grad}[1]{\mathrm{grad} \, #1} \newcommand{\gradat}[2]{\mathrm{grad} \, #1 \, \vert_{#2}} \newcommand{\Hess}[1]{\mathrm{Hess} \, #1} \newcommand{\T}{\text{T}} \newcommand{\dim}[1]{\mathrm{dim} \, #1} \newcommand{\partder}[2]{\frac{\partial #1}{\partial #2}} \newcommand{\rank}[1]{\mathrm{rank} \, #1} $$

Fisher Information Matrix

An introduction to the Fisher Information Matrix and the intuition behind it.
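
As a quick preview, one common way to define it is as the expected outer product of the score, i.e. the covariance of the gradient of the log-likelihood, where $p(\vx \vert \theta)$ denotes the model's likelihood:

$$ \mathrm{F} = \E_{\vx \sim p(\vx \vert \theta)} \left[ \nabla_\theta \log p(\vx \vert \theta) \, \nabla_\theta \log p(\vx \vert \theta)^{\T} \right] . $$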

Introduction to Annealed Importance Sampling

An introduction and implementation of Annealed Importance Sampling (AIS).
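
In one common formulation (the notation here is generic, not necessarily the post's), AIS bridges a tractable initial distribution $\pi_0$ and the target $\pi_T$ with intermediate unnormalized densities $f_t(\vx) = f_0(\vx)^{1 - \beta_t} \, f_T(\vx)^{\beta_t}$, with $0 = \beta_0 < \dots < \beta_T = 1$. Starting from $\vx_1 \sim \pi_0$ and moving each $\vx_{t+1}$ with an MCMC kernel that leaves $\pi_t$ invariant, the importance weight accumulates as:

$$ w = \prod_{t=1}^{T} \frac{f_t(\vx_t)}{f_{t-1}(\vx_t)} . $$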

Gibbs Sampler for LDA

Implementation of a Gibbs sampler for inference in Latent Dirichlet Allocation (LDA).
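
As a flavor of what is involved, the collapsed Gibbs sampler (one common variant, assuming symmetric priors $\alpha$ and $\beta$) resamples each topic assignment $z_i$ from:

$$ p(z_i = k \mid \b{z}_{\neg i}, \b{w}) \; \propto \; \left( n_{d_i, k}^{\neg i} + \alpha \right) \frac{n_{k, w_i}^{\neg i} + \beta}{n_{k, \cdot}^{\neg i} + V \beta} , $$

where $n_{d,k}$ counts tokens in document $d$ assigned to topic $k$, $n_{k,w}$ counts how often word $w$ is assigned to topic $k$, $V$ is the vocabulary size, and $\neg i$ means the current token is excluded from the counts.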

Boundary Seeking GAN

Training a GAN by moving the generated samples toward the discriminator's decision boundary.
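
For continuous data, one way to write the boundary-seeking generator loss is:

$$ \min_G \; \frac{1}{2} \, \E_{\vz \sim p(\vz)} \left[ \big( \log D(G(\vz)) - \log (1 - D(G(\vz))) \big)^2 \right] , $$

which is zero exactly when $D(G(\vz)) = \tfrac{1}{2}$, i.e. when the generated samples sit on the decision boundary.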

Least Squares GAN

2017 is the year the GAN lost its logarithm. First it was Wasserstein GAN, and now it's LSGAN's turn.
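
Concretely, with the usual 0/1 coding for fake and real labels, the least-squares objectives replace the log-loss with:

$$ \min_D \; \frac{1}{2} \E_{\vx \sim p_{\text{data}}} \left[ (D(\vx) - 1)^2 \right] + \frac{1}{2} \E_{\vz \sim p(\vz)} \left[ D(G(\vz))^2 \right], \qquad \min_G \; \frac{1}{2} \E_{\vz \sim p(\vz)} \left[ (D(G(\vz)) - 1)^2 \right] . $$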

CoGAN: Learning joint distribution with GAN

The original GAN and the Conditional GAN learn the marginal and the conditional distribution of the data, respectively. But how can we extend them to learn a joint distribution instead?

Wasserstein GAN implementation in TensorFlow and Pytorch

Wasserstein GAN comes with the promise of stabilizing GAN training and abolishing the mode collapse problem.
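
As a taste of how compact the training step becomes, here is a minimal PyTorch-style sketch of the two updates (a simplification, not the post's exact code; `D`, `G`, their optimizers, and `z_dim` are assumed to be defined elsewhere):

```python
# Minimal WGAN update sketch: the critic is trained several times per generator step.
import torch

def critic_step(D, G, x_real, d_optimizer, z_dim=100, clip=0.01):
    """One critic update: ascend E[D(x)] - E[D(G(z))] (so minimize its negation)."""
    z = torch.randn(x_real.size(0), z_dim)
    d_loss = -(D(x_real).mean() - D(G(z).detach()).mean())
    d_optimizer.zero_grad()
    d_loss.backward()
    d_optimizer.step()
    # Weight clipping keeps the critic (roughly) K-Lipschitz
    for p in D.parameters():
        p.data.clamp_(-clip, clip)

def generator_step(D, G, g_optimizer, batch_size, z_dim=100):
    """One generator update: maximize E[D(G(z))], i.e. minimize -E[D(G(z))]."""
    z = torch.randn(batch_size, z_dim)
    g_loss = -D(G(z)).mean()
    g_optimizer.zero_grad()
    g_loss.backward()
    g_optimizer.step()
```

Note that, unlike the original GAN, there is no logarithm anywhere in the two losses.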

InfoGAN: unsupervised conditional GAN in TensorFlow and Pytorch

Adding a mutual information regularizer to a GAN turns out to have a very nice effect: it learns data representations and their properties in an unsupervised manner.
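
In equation form, the usual GAN value function $V(D, G)$ is augmented with a variational lower bound $L_I$ on the mutual information between the latent code $c$ and the generated sample:

$$ \min_{G, Q} \max_D \; V(D, G) - \lambda \, L_I(G, Q), \qquad L_I(G, Q) = \E_{c \sim p(c), \, \vx \sim G(\vz, c)} \left[ \log Q(c \mid \vx) \right] + H(c) \; \le \; I(c; G(\vz, c)) , $$

where $Q$ is an auxiliary network that approximates the posterior over codes.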

Maximizing likelihood is equivalent to minimizing KL-Divergence

We will show that doing MLE is equivalent to minimizing the KL-Divergence between the estimator and the true distribution.
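
The whole argument fits in one line. Writing $p_{\text{data}}$ for the true distribution and $p_\theta$ for the model:

$$ D_{\text{KL}} \left( p_{\text{data}} \, \Vert \, p_\theta \right) = \E_{\vx \sim p_{\text{data}}} \left[ \log p_{\text{data}}(\vx) \right] - \E_{\vx \sim p_{\text{data}}} \left[ \log p_\theta(\vx) \right] , $$

and since the first term does not depend on $\theta$, minimizing the KL over $\theta$ is the same as maximizing the expected log-likelihood, which MLE approximates with an average over the observed samples.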

Variational Autoencoder (VAE) in Pytorch

With all of the bells and whistles surrounding Pytorch, let's implement a Variational Autoencoder (VAE) using it.
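
For a flavor of what the implementation looks like, here is a minimal sketch (a simplification rather than the post's exact code; the 784-dimensional input, e.g. flattened MNIST, and the 20-dimensional latent space are assumptions):

```python
# Minimal VAE sketch in PyTorch: Gaussian encoder, Bernoulli decoder, reparameterization trick.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=784, h_dim=400, z_dim=20):
        super().__init__()
        self.enc = nn.Linear(x_dim, h_dim)
        self.mu = nn.Linear(h_dim, z_dim)      # mean of q(z|x)
        self.logvar = nn.Linear(h_dim, z_dim)  # log-variance of q(z|x)
        self.dec1 = nn.Linear(z_dim, h_dim)
        self.dec2 = nn.Linear(h_dim, x_dim)

    def encode(self, x):
        h = F.relu(self.enc(x))
        return self.mu(h), self.logvar(h)

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps, with eps ~ N(0, I)
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def decode(self, z):
        return torch.sigmoid(self.dec2(F.relu(self.dec1(z))))

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar

def vae_loss(x_recon, x, mu, logvar):
    # Negative ELBO: reconstruction term + KL(q(z|x) || N(0, I))
    recon = F.binary_cross_entropy(x_recon, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```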