I guess the author is trying to simplify, but it's way more complex than that. Assuming just a few layers of cache misses all the other layers that have effects, starting with: cache lines, RAM read-vs-write turnaround, DRAM pages, the number of open DRAM pages, other CPUs contending for the same RAM channel, remote NUMA nodes, and probably some I'm forgetting. All of this is very similar to secondary-storage access rules (even for SSDs)...
Sure, but I think the point is that big-O notation fails miserably at analyzing certain algorithms because it has no way to represent the locality of their memory accesses.
Put another way, all the little "constants" thrown away in the analysis may not actually be constants, and their non-constantness may be enforced by actual physics. As the article says, the idea that storage access times are constant is nonsense; due to physical limitations this is insurmountable, not just a side effect of a particular architecture. So it's quite possible that for certain algorithms the "constant" factors end up being the dominant terms in the analysis.
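To make that concrete, here's a minimal sketch (my own illustration, not from the article): two loops that perform exactly the same number of array accesses, so they're identical in big-O terms, but one walks memory sequentially while the other jumps around at random. On typical hardware the random walk is substantially slower because nearly every access misses cache, which is exactly the "constant" that the analysis throws away.

```python
import array
import random
import time

N = 1 << 22  # ~4M 8-byte ints, large enough to spill out of most caches
data = array.array('q', range(N))

# Same set of indices either way; only the *order* differs.
indices_seq = list(range(N))
indices_rand = indices_seq[:]
random.shuffle(indices_rand)

def checksum(idx):
    # O(n) accesses regardless of the order of idx.
    s = 0
    for i in idx:
        s += data[i]
    return s

t0 = time.perf_counter()
s_seq = checksum(indices_seq)
t1 = time.perf_counter()
s_rand = checksum(indices_rand)
t2 = time.perf_counter()

assert s_seq == s_rand  # identical work, identical asymptotic cost
print(f"sequential: {t1 - t0:.3f}s  random: {t2 - t1:.3f}s")
```

The effect is much more dramatic in a compiled language (Python's interpreter overhead dilutes it), but the shape of the result is the same: identical O(n) loops, very different wall-clock times, purely because of locality.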