Y
Hacker News
new
|
past
|
comments
|
ask
|
show
|
jobs
points by
angarrido
1 day ago
|
hide
|
0 comments
must people think it’s just GPU cost. In practice it’s coordination: model latency variance + queueing + retries under load. You don’t scale linearly, you get cascading slowdowns.
add comment