Balinares, 8 hours ago
Isn't that exactly how draft models speed up inference, though? Validating a batch of tokens in one pass is significantly faster than generating them one at a time.
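For anyone who hasn't seen the draft-model trick spelled out, here is a rough sketch of one greedy variant, under loud assumptions: `toy_target`, `toy_draft`, `speculative_step`, and `generate` are hypothetical names made up for illustration, the "models" are plain functions from a token prefix to the next token, and the verification loop stands in for what a real transformer would do in a single batched forward pass.

```python
from typing import Callable, List, Tuple

Token = int
# Hypothetical stand-in for a model: maps a token prefix to its greedy next token.
NextToken = Callable[[List[Token]], Token]


def speculative_step(prefix: List[Token], draft: NextToken,
                     target: NextToken, k: int) -> List[Token]:
    """Run one draft-then-verify round and return the tokens it yields."""
    # 1. The cheap draft model proposes k tokens, one after another.
    proposal: List[Token] = []
    for _ in range(k):
        proposal.append(draft(prefix + proposal))

    # 2. The target checks every proposed position. A real transformer would
    #    score all k positions in one batched pass; the loop is for clarity.
    accepted: List[Token] = []
    for i, tok in enumerate(proposal):
        expected = target(prefix + proposal[:i])
        if expected != tok:
            # First mismatch: keep the target's own token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # 3. Everything matched, so the same pass also gives one extra token.
    accepted.append(target(prefix + proposal))
    return accepted


def generate(prefix: List[Token], draft: NextToken, target: NextToken,
             k: int, n_new: int) -> Tuple[List[Token], int]:
    """Generate n_new tokens; also return the number of verification rounds."""
    out = list(prefix)
    rounds = 0
    while len(out) - len(prefix) < n_new:
        out += speculative_step(out, draft, target, k)
        rounds += 1
    return out[:len(prefix) + n_new], rounds


# Toy models (assumptions for illustration): the target counts upward mod 10,
# and the draft agrees with it except right after a 4.
def toy_target(seq: List[Token]) -> Token:
    return (seq[-1] + 1) % 10


def toy_draft(seq: List[Token]) -> Token:
    return 0 if seq[-1] == 4 else (seq[-1] + 1) % 10


tokens, rounds = generate([1], toy_draft, toy_target, k=4, n_new=8)
print(tokens, rounds)
```

With these toy models the output matches plain greedy decoding from the target, but the eight new tokens arrive in two verification rounds rather than eight sequential target steps. The speedup depends entirely on how often the draft agrees with the target.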