Sounds good. I saw that you use the FP8 version of the model. Do you also quantize the KV cache?