I ran a few experiments adding 0, 1, or 2 "write better code" follow-up prompts to aider's benchmarking harness, using a modified version of aider's polyglot coding benchmark [0] with DeepSeek V3.
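The setup can be sketched roughly like this. The names here (`run_task`, `send`) are illustrative stand-ins, not aider's actual API; the real harness drives an editing loop against the repo, but the shape of the experiment is the same: take the model's first attempt, then append N generic "write better code" prompts and keep only the final answer.

```python
def run_task(send, task_prompt, n_followups):
    """Hypothetical driver: send(prompt) -> model reply.

    Sends the benchmark task once, then appends n_followups generic
    "write better code" prompts, returning the model's final answer.
    """
    reply = send(task_prompt)
    for _ in range(n_followups):
        # Each follow-up asks the model to revise its previous answer.
        reply = send("write better code")
    return reply
```

With `n_followups=0` this is the baseline benchmark run; 1 and 2 are the variants tested.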
It appears that blindly asking DeepSeek to "write better code" significantly harms its ability to solve the benchmark tasks. It turns working solutions into code that no longer passes the hidden test suite.
Here are the results:
[0] https://aider.chat/docs/leaderboards/