As I was reading this article, a similar thought occurred to me: "I wonder if that's better or worse than a human?" Unfortunately, there was no human baseline in this study. That said, there are studies that compare LLM and human performance. Usually, humans perform much better (like 5-7x better) at long-running tasks.
In other words, a human would probably do better than an LLM on this task.
Humans lose to LLMs in narrow, well-specified text/symbolic reasoning tasks where the model can exploit breadth, speed, and search. Usually the LLM performed ~15% better than humans, though I've seen studies where the gap was as high as 80%. To my surprise, those studies were usually about "soft skills" like creativity and persuasion.
At some point I will (the story is pretty crazy) but for now I’d like to keep certain details of my life private, so I avoid blogging. I used to think I’d need a blog to attract consulting clients, but I’ve had no problem without one so far, and thus I haven’t gotten around to it. My personal website is basically Lorem Ipsum lol.
If you’re really interested, shoot me an email. I’m happy to talk one on one.
> Given that you've at minimum doubled the code (and doubled the bugs), it seems like a really bad long-term trade off.
I'd say you've at maximum doubled the code. The tests ensure you write only what you need to get the test to pass. Without them, devs get distracted and wander until the feature works, usually committing tons of YAGNI violations along the way. In my experience, untested code bases have a ton of unnecessary code.
I don't understand how it would double the bugs. The article has references saying it reduces them. But, even thinking about it, I don't see why you'd say that.
Maybe it's only difficult to test end-to-end? I would assume there's code in there that's algorithm-like. Give it these inputs, it should return these outputs. If so, it should be possible to isolate that code and test it.
But, I dunno, I haven't looked at the source code. It might be very difficult to maintain.
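To make that concrete: if there really is algorithm-like code in there, it can usually be lifted into a pure function and checked directly, with no end-to-end setup. A minimal sketch in Java (the function and its behavior are entirely made up for illustration):

```java
// Hypothetical example: "algorithm-like" logic isolated from the rest of an app.
public class PriceCalc {
    // Applies a percentage discount, truncating to whole cents.
    static long discountedCents(long cents, int percentOff) {
        return cents * (100 - percentOff) / 100;
    }

    public static void main(String[] args) {
        // Give it these inputs, it should return these outputs -- no I/O,
        // no framework, no setup required.
        if (discountedCents(1000, 10) != 900) throw new AssertionError();
        if (discountedCents(999, 50) != 499) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Once logic is shaped like this, whether the surrounding app is hard to test end-to-end stops mattering for this piece of it.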
> Why does this matter? It matters for the exact same reason why memory leaks are bad in general: They consume more memory than necessary.
Thing is, this doesn't usually matter. I have never gotten an out of memory error from a leak in Java. Now compare that to all the development time I've saved by not having to deal with pointer arithmetic. I consider it a huge win. It's all about the type of apps you're making.
When I play Minecraft on my notebook, I first shut down all nonessential system services (I have a handy line in my shell history for that). This gets me around 45 minutes of Minecraft (instead of 30 minutes) before it gets killed by the OOM killer.
Hmm, weird. I've never had Minecraft killed for OOM, and I've been playing since the alpha days. In fact, I can watch the memory pool slowly grow, then see a drop in framerate and an increase in free memory when the GC is triggered every 30 seconds or so.
I don't do anything special to play; my sessions used to run into multiple hours, during my peak years of play.
I suspect that it's not fair to blame Java for whatever problem you were having, though.
I've used tons of programs that have problems and crash for various reasons. Is this an argument against the language? I don't think there'd be any left to use with this line of thinking.
Besides, there are too many variables in your anecdote. Is it a laptop from 1995? Is the OOM from a bug that could/should be fixed?
The notebook is from 2012 and has 4GB RAM. Minecraft stands out because most other programs of similar complexity (e.g. Portal 2) work fine.
I've heard a story that Minecraft's RAM consumption got a lot worse after Notch stepped down. The new developers refactored the code for OOP best practices (such as passing a 3D coordinate as an object rather than "int x, int y, int z"), which tremendously increased the number of allocations and thus GC pressure and memory usage. So IMO it's fair to blame this on the language. When following good practices leads to consequences like this, that's terrible design.
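I can't verify the Minecraft story, but the mechanism is easy to sketch. The coordinate type below (BlockPos) is a made-up stand-in; the point is that the refactor described trades zero-allocation primitive parameters for one short-lived object per call:

```java
public class GcPressureSketch {
    // Primitive version: no heap allocation per call.
    static int manhattan(int x, int y, int z) {
        return Math.abs(x) + Math.abs(y) + Math.abs(z);
    }

    // Object version: every call site that builds a BlockPos allocates.
    record BlockPos(int x, int y, int z) {}
    static int manhattan(BlockPos p) {
        return Math.abs(p.x()) + Math.abs(p.y()) + Math.abs(p.z());
    }

    public static void main(String[] args) {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            // One million short-lived BlockPos objects for the GC to sweep;
            // the primitive overload would allocate nothing at all.
            sum += manhattan(new BlockPos(i, -i, 1));
        }
        System.out.println(sum); // prints 1000000000000
    }
}
```

In a hot path that runs thousands of times per frame, those per-call allocations are exactly what drives GC pressure, even though each individual object is tiny (and even though escape analysis can sometimes, but not always, optimize them away).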
Just because IntelliJ is using 2GB of RAM doesn't mean they have a memory leak. As long as that 2GB profile is stable, it isn't going to bring down the system.
> Place your test files next to the implementation.
Java background is probably biasing me, but I don't like this. It interferes with my ability to find code because the file list in every directory is twice as large.
If you are creating a separate directory for each "module" then it would just be one more file in each directory.
Component
- index.js
- Component.js
- component.scss
- Component.spec.js
This structure has worked really well for me, you have everything you need in a single directory so you don't have to jump around to another directory to find the test or styling or whatever.
Putting them together is AMAZING. When I used to do Java development, I thought the organization was great, but every time I had to find a unit test I had to dig through a parallel folder structure.
Then I tried Go. Go puts them together. I was amazed at how such a simple convention could keep my code so much better organized. I could immediately open both the code and the unit tests for said code. Just don't put a ton of files in any one directory (if a directory you're actively coding in has a lot of files, I'd argue your file structure is not optimal).
Now when I do JavaScript development I have adopted 3 extensions:
- .aspec.js is for unit tests that should work in all environments
- .nspec.js is for unit tests that only work in node.js
- .cspec.js is for unit tests that only work on a client like a web browser
I think it probably depends on what your definition of "unit" is. As I stress functional units that encapsulate logic and data model but don't permute state, my units are probably significantly larger (and simpler) than those of people writing more imperative/traditional-OO units (where something under TDD may encapsulate many units due to mounting complexity/complication and so require further decomposition).
Right. In that case, you still write all of the tests before the code. If you're objecting only to the rhetoric of a "wall" of tests, that just depends on the size of your units. Think of it more as hurdles of tests. :)
Everyone is interpreting this as, "write 10 tests then try to get them all to pass at once". That is not how you TDD. You write one, then get it to pass, then write another.
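A sketch of that loop, using a made-up slugify example and no test framework: check 1 was written first (and failed, since slugify didn't exist), just enough code was written to pass it, and only then was check 2 added.

```java
public class TddLoopSketch {
    // The minimal code needed to satisfy the tests written so far --
    // nothing speculative, nothing the tests didn't demand.
    static String slugify(String s) {
        return s.trim().toLowerCase().replace(' ', '-');
    }

    public static void main(String[] args) {
        // Test 1 -- written first, made to pass first.
        check(slugify("Hello World").equals("hello-world"));
        // Test 2 -- added only after test 1 was green.
        check(slugify("  Trim Me  ").equals("trim-me"));
        System.out.println("all tests pass");
    }

    static void check(boolean ok) {
        if (!ok) throw new AssertionError("test failed");
    }
}
```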
Maybe you mean, "write the test before the code", but when you say "write all tests before the code", it's not interpreted the same way.