Hacker News
I ran Opus 4.7 vs. Old Opus 4.6 vs. New Opus 4.6 on 28 Zod tasks (stet.sh)
2 points by bisonbear 10 days ago
Coding evals are broken. CI is green while AI code quality goes unmeasured (stet.sh)
1 point by bisonbear 12 days ago
Agents.md is the highest-leverage code you're not testing (stet.sh)
1 point by bisonbear 17 days ago
Your AI coding benchmark is hiding a 2x quality gap (stet.sh)
3 points by bisonbear 45 days ago