Hacker News
Tepix 3 hours ago
Sounds good. I saw that you use the FP8 version of the model. Do you also quantize the KV cache?