presumably that happens at training time?

then, once it's successfully trained, you get faster inference from just the diffusion model