> "All programs were launched using the release mode if available. Other options were left default."
So it's likely these numbers simply reflect choices in the default configuration, not the ultimate limit with tuning.
I'm starting to dislike these low effort ChatGPT articles. Getting an answer from it is not the same thing as actually learning what's going on, but readers will walk away with the answers uncritically.
Lately I’m seeing a lot of this baked-in assumption that ChatGPT writes good/idiomatic/performant code, as if the code it generates is ~The Answer~. I guess it’s the modern version of copy/pasting from Stack Overflow. The code I’ve had it write isn’t universally terrible but it almost always has some low-hanging performance fruit (in JS for instance, it seems to love spreading arrays). I’m not sure how much I trust it to write good benchmark code!
Adding to the others here, yeah - it has to loop over the whole array, and it allocates a new one. The allocation itself costs something and it also creates work for the garbage collector (freeing the old one). So it’s a lot of work compared to a .push()! Sometimes you do need the immutability aspect though, because a lot of things use referential equality for change detection (React is a big one).
Spreading allocates a new array. Most things you can do with spread you can do with normal functions that either just directly access an entry or mutate an array directly. [first, ...rest] is nice, but will always be slower than direct access.
Spreading an array requires looping over all the array elements, so it's O(n), compared to something like .push(), which is amortized O(1). Spreading has other benefits, though, like copying rather than mutating.
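The cost the commenters describe can be sketched with a small (hypothetical, names mine) comparison: building an array with push appends in place, while building it with spread allocates a fresh copy of everything so far on every iteration.

```javascript
// Sketch of the point above: push appends in amortized O(1),
// while spread copies every existing element (O(n) per call).
// Function names are illustrative, not from the original post.

function buildWithPush(n) {
  const out = [];
  for (let i = 0; i < n; i++) {
    out.push(i); // appends in place, amortized O(1)
  }
  return out;
}

function buildWithSpread(n) {
  let out = [];
  for (let i = 0; i < n; i++) {
    // Allocates a brand-new array each iteration: O(n) per step,
    // O(n^2) total, plus n intermediate arrays for the GC to reclaim.
    out = [...out, i];
  }
  return out;
}

console.log(buildWithPush(5));   // [0, 1, 2, 3, 4]
console.log(buildWithSpread(5)); // [0, 1, 2, 3, 4]
```

Both produce the same result; the spread version just does quadratic work and generates garbage along the way, which is why it only makes sense when you actually need a new array (e.g. for referential-equality change detection).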
I've been seeing a lot more "ChatGPT only produces junk code" comments (case in point, this entire thread), and that hasn't really been the case either.
For the majority of those benchmarks it produced the right code on the first go. It struggled more with Java virtual threads, probably because there are fewer such programs in the training set. It also had a tendency to overcomplicate things (adding unnecessary code), so a few iterations were needed, plus some hand edits.
To play Devil’s Advocate, this is a good test of “how many concurrent tasks will an uncritical developer relying on ChatGPT end up achieving with low effort”.
Right, realistically your product will not primarily be used by the most talented experts, who are optimising as best as possible given intimate knowledge of internals. People will do what basically works, and it's arguably a disservice when "benchmarks" avoid also showing us the naive performance without tuning. The tuned performance is worth knowing, but realistically most customers will get the naive performance; ain't nobody got time to learn to tune everything.
I think this applies all the way down to hardware features and programming languages. For example, the naive use of sort() in C++ will go faster than in Rust (C++'s is an unstable sort; Rust's is stable), but it may astonish naive programmers who don't know what an unstable sort is, or didn't realise that's what the C++ function does. An opposite example: Rust's Vec::reserve is actively good to call in code that adds a bunch of things to a Vec, since it never destroys amortized growth, but C++'s std::vector::reserve does destroy amortized growth, so you should avoid calling it unless you know the final size of the std::vector.
That is a fair point. I don’t think it saves this particular analysis, of course. Probably what’s important to know is something we might call the “Pareto naive performance”; with just a little bit of effort (the first 20% or so), you can get some significant performance improvement (the first 80% or so). Even the naive programmer just banging together something that works in Elixir is going to quickly find out how to raise Erlang’s default 260k process limit when they hit that, after all.
$ /usr/bin/time -v elixir --erl "-P 10000000" main.exs 1000000
08:42:56.594 [error] Too many processes
** (SystemLimitError) a system limit has been reached
“This limit can be adjusted with the `+P` flag when starting the BEAM… To adjust the limit, you could start the Elixir application using a command like the following:
elixir --erl "+P 1000000"
“If you are getting this error not because of the BEAM limit, but rather because of your operating system limit (like the limit on the number of open files or the number of child processes), you will need to look into how to increase these limits on your specific operating system.”
Yes, again ChatGPT was better than just random search. If writing this blog post taught me anything, it's to use ChatGPT more, not less ;)
BTW, I fixed the 1M benchmark and Elixir is included now.
I’m glad you saw the irony in my reply. I was a little worried it would come off acerbic, like “since you use ChatGPT, you only deserve ChatGPT”. The real intent was double-edged, of course - to demonstrate that ChatGPT isn’t just dumb-codemonkey-as-a-service, it can also quite easily solve a lot of the problems it is purported to create. You have been handling mean comments here (including my own mean comments) with grace, so I took the risk.
> Even the naive programmer just banging together something that works in Elixir is going to quickly find out how to raise Erlang’s default 260k process limit when they hit that, after all.
Sure. Will redo this part. I doubt it would allow Elixir to get a better result in the benchmark, though, as it was already losing significantly at 100k tasks. Any hints on how to decrease Elixir memory usage? Many people criticize using defaults in the comments, but don't suggest any concrete settings that would improve the results. And part of the blogging experience is to learn things, also from the author's perspective ;)
Curious what build configuration changes would help with this. Specifically with python/Go because that’s what I’m most familiar with, but any insight into tweaking build configs to increase performance here would be really interesting to me.