It correlates to performance, speed to iterate, security, and design complexity,...

8note · on Jan 29, 2022

I'm not convinced that it correlates to iteration speed or design complexity.

Actually, performance too.

dralley · on Jan 29, 2022

More compact code fits in caches better.

Zababa · on Jan 29, 2022

Any hard data on any of that?

goodpoint · on Jan 29, 2022

Unless a fat binary embeds pictures and some music, it's all CPU instructions.

Tens of megabytes of CPU instructions is complexity.

This kind of bloat is the number one enemy of security, as any security engineer could confirm.

0xbadcafebee · on Jan 29, 2022

Sure, lots, go look for studies on estimation of defects based on LOC and project size/complexity (they go back to the 1970s). But you don't need to look, the principles are simple.

Unless an application is filled with JPEGs or uncompressed arbitrary data files, its size reflects lines of code (machine code, interpreted code, etc). The bigger the app, the more lines of code.

Every line of code has a non-zero bug probability. Every new line of code increases probability. More lines of code, higher probability. Bugs include security bugs; higher probability of bugs, higher probability of security bugs.
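
A rough sketch of that arithmetic, assuming (unrealistically) that every line has the same independent chance of containing a bug. The function name and the 0.1% per-line figure are made up for illustration, not measured data:

```python
def p_at_least_one_bug(lines: int, p_per_line: float) -> float:
    """Probability that at least one of `lines` lines is buggy,
    assuming each line is independently buggy with probability p_per_line."""
    return 1.0 - (1.0 - p_per_line) ** lines


if __name__ == "__main__":
    # Hypothetical per-line defect probability, chosen only for illustration.
    p = 0.001
    for n in (100, 1_000, 10_000):
        print(f"{n} lines: {p_at_least_one_bug(n, p):.4f}")
```

Under that model the probability only ever goes up as lines are added, which is the whole point.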

CPU cache is finite. Only so many lines of code can be cached or optimized. Larger size takes up more room in memory, which when combined with a lot of other gigantic apps, means less memory for heap space, disk cache, etc. Larger size also takes up more room on disk, which adds up when you don't delete old builds on disk and loop over a build process. Since larger size means more lines of code, that means longer compile times, which means a longer wait every time you change a line and need to recompile, copy an artifact somewhere, retest.

More lines of code means more code executed. If you have 10 lines of code in a function, and you add 100 lines to it, the compiler doesn't just optimize away all 100 new lines, it's going to add more machine code and code paths. Unless you only ever add new code paths, some of that new code will extend existing code paths or add instructions, and that means more CPU cycles to complete execution. (Same concept for interpreted code)

More lines of code means more code paths. More code paths increases complexity. The more code paths, the longer and more difficult testing gets to the point you can't even develop enough tests to cover all the code paths, so it's impossible to even find all the bugs. More complexity leads to difficulty in humans understanding and working with the codebase, and difficulty in understanding leads to slower and more error-prone development.
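
To put a number on the code-path point: a function with n independent two-way branches in sequence has up to 2**n distinct paths. A minimal sketch under that simplifying assumption (real code has loops and correlated conditions, and the helper names here are hypothetical):

```python
from itertools import product


def count_paths(independent_branches: int) -> int:
    # Each independent if/else doubles the number of possible paths.
    return 2 ** independent_branches


def enumerate_paths(independent_branches: int) -> list:
    # One path = one True/False outcome per branch.
    return list(product((False, True), repeat=independent_branches))


if __name__ == "__main__":
    for n in (3, 10, 30):
        print(f"{n} branches: {count_paths(n)} paths")
```

Enumerating every path is only practical for small n, which is why full path coverage stops being feasible fairly quickly.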

Larger means more network bandwidth, meaning file transfers take longer, which slows iteration and produces worse UX. If people download your app every 10 minutes in their CI/CD pipeline, larger size means more network bandwidth used. "Free" CDNs have limits; the larger a project gets, the more file size affects network performance, reliability, and cost. If you pay for bandwidth, a 100MB file costs 100x as much as a 1MB file.
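
The bandwidth math is just linear scaling. A sketch, assuming a flat per-GB price with no tiers or free allowance; the price and the 30-day month are placeholder assumptions, not any real provider's rates:

```python
def monthly_transfer_cost(size_mb: float, downloads: int, price_per_gb: float) -> float:
    """Cost of serving `downloads` copies of a `size_mb` file at a flat per-GB price."""
    total_gb = size_mb * downloads / 1000  # decimal gigabytes
    return total_gb * price_per_gb


if __name__ == "__main__":
    # One download every 10 minutes over a 30-day month: 6 * 24 * 30 downloads.
    downloads = 6 * 24 * 30
    price = 0.05  # hypothetical dollars per GB
    for size_mb in (1, 100):
        cost = monthly_transfer_cost(size_mb, downloads, price)
        print(f"{size_mb} MB: ${cost:.2f}/month")
```

Whatever price you plug in, the ratio between the 100MB and 1MB case stays at 100.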

The more big apps you use, the more every one of these effects increases. One big app you might not notice. 100 big apps lead to noticeable slowness, bugs, less memory, less disk space.