I get asked this question pretty often when talking about VAEs with flexible base distributions, so I figured I'd make a simple example. If we think of diffusion models as hierarchical VAE ...