What?

Enabling the reparameterisation trick to work for discrete random variables.

Why?

We want low-variance gradient estimators (such as the reparameterisation trick) to work with discrete random variables.

How?

[Two figures omitted; source: original paper]
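
Since the figures are not reproduced here, below is a rough sketch of one common way to reparameterise a categorical variable: the Gumbel-softmax relaxation. That this is the paper's exact method is my assumption, not something these notes confirm. The idea is that adding Gumbel(0, 1) noise to the logits and taking the argmax gives an exact categorical sample (the Gumbel-max trick); replacing the argmax with a temperature-controlled softmax gives a differentiable approximation. The function names, the temperature value, and the example probabilities are all hypothetical.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) sample via the inverse CDF: -log(-log(U))."""
    # rng.random() lies in [0, 1); clamp away from 0 so the inner log is finite.
    u = max(rng.random(), 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax(logits, temperature, rng):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    perturbed = [(l + sample_gumbel(rng)) / temperature for l in logits]
    m = max(perturbed)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_max(logits, rng):
    """Exact categorical sample: argmax of the noise-perturbed logits."""
    perturbed = [l + sample_gumbel(rng) for l in logits]
    return max(range(len(logits)), key=lambda i: perturbed[i])


rng = random.Random(0)
probs = (0.1, 0.6, 0.3)  # hypothetical class probabilities
logits = [math.log(p) for p in probs]

# A soft sample: a point on the simplex, close to one-hot at low temperature.
soft = gumbel_softmax(logits, temperature=0.5, rng=rng)

# Sanity check: hard samples should follow the original probabilities.
n = 5000
counts = [0, 0, 0]
for _ in range(n):
    counts[gumbel_max(logits, rng)] += 1
freqs = [c / n for c in counts]
```

In this sketch the randomness lives entirely in the Gumbel noise, so the soft sample is a deterministic, differentiable function of the logits, which is what lets gradients flow through it. Lower temperatures push the soft sample towards one-hot, at the cost of higher-variance gradients.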

And?


This note is part of my paper notes series.