Hacker News

points by Rohinator 1 day ago | hide | 0 comments

Using Claude as a benchmark for its own quality is pretty funny. If we think the quality has declined, wouldn't that also apply to the benchmarking process itself?

You'd think so, if its quality has gone down, then its ability to know that is also decreased.