
> Excel has this legacy (but extremely powerful) core with very few people left that knows all of it.

Would love to hear more about this. Especially history and comparison to Lotus etc.


So the first thing that's important to understand is that Excel is the product of another era. One where resources like memory were very constrained and compilers and optimizers weren't as good as they are today (so much so that the Excel team at one point wrote their own because MSVC sucked, but I digress)...

And so, a lot of the core code is shaped by that. Cell formatting data, for example, is super tightly packed in deeply nested unions to ensure that as little memory as possible is used to store that info. If something only needs 3 bits, it'll only use 3 bits. The calc engine compiles all of your formulas to its own (IIRC variable-instruction-width) bytecode to make sure that huge spreadsheets can still fit in memory and calc fast.
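To make the bit-packing idea concrete, here's a minimal sketch in Python of packing several small format fields into one integer, the way C code might use bit-fields in a union. The field names and widths are invented for illustration and have nothing to do with Excel's actual layout:

```python
# Hypothetical example: pack three small cell-format fields into a
# single integer, LSB first. Widths are made up for illustration.
FONT_BITS, ALIGN_BITS, BORDER_BITS = 6, 3, 4

def pack_format(font_idx: int, align: int, border: int) -> int:
    """Pack three small values into one int; each field uses only
    as many bits as it needs."""
    assert font_idx < (1 << FONT_BITS)
    assert align < (1 << ALIGN_BITS)
    assert border < (1 << BORDER_BITS)
    return font_idx | (align << FONT_BITS) | (border << (FONT_BITS + ALIGN_BITS))

def unpack_format(packed: int) -> tuple[int, int, int]:
    """Recover the three fields by masking and shifting."""
    font_idx = packed & ((1 << FONT_BITS) - 1)
    align = (packed >> FONT_BITS) & ((1 << ALIGN_BITS) - 1)
    border = (packed >> (FONT_BITS + ALIGN_BITS)) & ((1 << BORDER_BITS) - 1)
    return font_idx, align, border
```

Thirteen bits total instead of three machine words; multiply that by millions of cells and the payoff of this style becomes obvious.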

And a lot of it still carries the same coding and naming practices it started with in the 80s: Hungarian notation, terse variable and file names, etc. Now, IMO, Hungarian notation by itself is pretty harmless (and maybe even useful in the absence of an IDE), but it seemed to encourage programmers to strip any useful information out of variable names, requiring you to have more context to understand "why" something is happening. Like, cool, I have a pszxoper now (pointer to zero-terminated string of XOper), but why?

So the code is tight, has a lot of optimization baked in and assumes you know a lot about what's happening already.

But more importantly, a lot of the "why" information also just lives in people's heads. Yes, some teams had documentation in an ungodly web of OneNote notebooks, or spread across SharePoint pages (which had the least useful search functionality I've ever witnessed), but finding anything you wanted was hard. That didn't use to matter, since the core team had been there for a long time, so you could just ask them questions.

That being said, I joined MSFT in 2012 and started working on Excel closer to 2014. At that point, heavyweights like DuaneC (who wrote like 10% of Excel, and I don't think I'm exaggerating) had already retired, and while other people were very knowledgeable in some areas, nobody seemed to have a good cross view of the whole thing.

You have to understand that I was in the Office Extensibility team. We were building APIs for the whole thing. I had to touch the calc system, cells and ranges, formatting, tables, charts and images (the whole shared OArt system was interesting), etc. Answering "How do I do X" was always a quest because you would usually:

- Find 3 different ways of achieving it

- One of them was definitely the wrong way and could make the app crash in some situations (or leak memory)

- All the people on the "blame" had left

- One of them was via the VBA layer which did some weird stuff (good ol' pbobj)

- Be grateful that this wasn't Word because their codebase was much worse

And so, a lot of the API implementation was trial and error and hunting down someone who understood the data structures. The fact that a full sync and rebuild took about 6 hours (you ran a command called `ohome` and then you went home) meant that experimenting was sometimes slow (at least incremental builds were acceptably fast). The only lifeline back then was this tool called ReSearch2 that allowed you to search the codebase efficiently.

But the thing is, once you got things to work, they worked really well. The core was solid and performant. Just slightly inscrutable at times, and not the kind of code you're used to reading outside of Excel.


> Trace scheduling replaces block-by-block compaction of code with the compaction of long streams of code, possibly thousands of instructions long...

"Enormously Long Instruction Words" is very interesting, but for me the spotlight here is on "trace scheduling".


> Interest per Second - General - The U.S. pays $31,688/second in debt interest — $1,901,285 every minute. Your share of that: $3893.33 [plugged in "normal" amount], gone before it bought anything.

...I thought I was already sufficiently terrified by the debt numbers...


Love it - great application of publicly-available data. Also ref https://news.ycombinator.com/item?id=47420307 US govt provided public debt resources.

By the way, the 1040 instructions have a pie chart like this (ref https://www.irs.gov/pub/irs-pdf/i1040gi.pdf, page 122). Not that most people do taxes themselves, or have a reason to read to page 122 of instructions for a single form. But still it's there and perhaps a nice gesture by the IRS.

Breaking it out into pie charts etc like this can be really helpful. In my view the real kicker with taxes is the opaqueness. Kinda like a meal card versus paying for every meal, or like using a credit card versus paying with cash, it's hard for humans to really grasp what's going on unless they're involved.

Of course it would be impractical to pay taxes separately to every waiting hand in government bureaucracy. But on the other hand maybe the number one goal shouldn't be ease of use, either. Maybe a little friction when paying for public services could be a good thing for citizens who are interested in a healthy country - my opinion.


Very painful to see Corporate income taxes @ 8% vs. Personal income taxes @ 36% on the income pie chart above.

However, not at all surprised. That stat would arguably make the most material difference to voters, if only they knew about it.

Edit: Adds another degree of pain when you consider that the CEO-to-worker pay ratio reached 281-to-1 in 2024.


Why should it be painful that all personal federal income taxes produce 4x the revenue of the federal taxes on all corporations?

And I thought this was a reference to a Win95 problem https://www.slashgear.com/1414245/jennifer-aniston-matthew-p...

Yeah, block-level dedupe has been an industry standard for decades. Tracking file hashes? Why?

And I see above that this is a self-hosted platform, and I still don't get it. I was running terabytes of ZFS with dedup=on on cheap Supermicro gear in 2012.


File hashes are great to get two systems to work together to dedupe themselves. I have a Windows backup that sends hashes to a backup server, so we don't back up crud we already have.
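A minimal sketch of that hash-first protocol, with invented names (this is not how any particular backup product works): the client sends content hashes, the server replies with the ones it doesn't already have, and only those files cross the wire.

```python
# Hypothetical hash-based dedupe between a client and a backup server.
import hashlib

def file_hash(data: bytes) -> str:
    """Content hash used as the dedupe key."""
    return hashlib.sha256(data).hexdigest()

class BackupServer:
    def __init__(self):
        self.store = {}  # hash -> content already backed up

    def missing(self, hashes):
        """Return only the hashes the server has never seen."""
        return [h for h in hashes if h not in self.store]

    def upload(self, data: bytes):
        self.store[file_hash(data)] = data

def backup(files, server):
    """Send hashes first; upload only what the server lacks."""
    by_hash = {file_hash(d): d for d in files}
    needed = server.missing(list(by_hash))
    for h in needed:
        server.upload(by_hash[h])
    return len(needed)  # how many files actually got transferred
```

Block-level dedupe on the server (à la ZFS) saves disk once the data arrives; the hash exchange saves the bandwidth of sending it at all.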

> You could always write code to help doing those things direct from disk, but you know what you have just written if you do so? A database!

Yes, but that's my point. Why is this not part of the standard library / a typical package, with very little friction with the rest of the code, instead of a separate program that the standard library / typical packages merely interface with in an attempt to reduce the friction?

Or are you making the general point that databases already existed prior to the standard libraries etc, and this is just a case of interfacing with an existing technology instead of rebuilding from scratch?


Because a reasonably well optimized database with support for indexes, data integrity enforcement, transactions, and all the other important things we expect from a good (relational) database is complex enough that it takes a rather large codebase to do it reasonably well. It’s not something you slap together out of a handful of function calls.

ETA: look at SQLite for an example — it’s a relatively recent and simple entrant in the field and the closest you’ll find in the mainstream to a purely filesystem based RDBMS. How would you provide a stdlib that would let you implement something like that reasonably simply? What would be the use case for it?
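Worth noting that Python's standard library actually ships SQLite bindings, which is about as close as a mainstream stdlib gets to the grandparent's ask: indexes, transactions, and a relational engine with no separate server process.

```python
# SQLite via Python's stdlib: a full RDBMS in a single file (or memory).
import sqlite3

conn = sqlite3.connect(":memory:")  # or a plain file path on disk
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_name ON users(name)")

with conn:  # transaction: commits on success, rolls back on error
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("ada",), ("grace",)])

rows = conn.execute("SELECT name FROM users ORDER BY name").fetchall()
```

But that illustrates the point rather than refuting it: the stdlib doesn't give you primitives to build your own engine, it embeds a large, battle-tested codebase wholesale.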


> Importantly, the primary metabolites also block adenosine receptors.

Biochemistry is rarely a one-and-done event it would seem.


how is "tranZPuter" not "transputer" (https://en.wikipedia.org/wiki/Transputer)

and elsewhere on the page, ZPU (FPGA-based microprocessor) sounds a lot like "ZipCPU" FPGA-based microprocessor https://zipcpu.com/


> I’ve noticed a common misconception: spec driven development is a return to a waterfall style of software development. [It] isn’t about pulling designs up-front, it’s about pulling designs up. Making specifications explicit, versioned, living artifacts that the implementation of the software flows from, rather than static artifacts.

This seems like a straw man argument: agile wasn't without specs and waterfall wasn't without some flexibility. What is truly the difference?

> The implementation is then derived from this specification, reflecting iterative changes in the specification, by AI alone or human developers working with AI. Increasingly, these tasks are done autonomously end-to-end by AI agents.

Given that he is heavily involved in Kiro (which touts "specs" as a guided flow within the tool [0]), this is starting to make sense. I read it as "take some ideas from Agile and from Waterfall, but set them up for the purpose of AI assistance."

[0] https://kiro.dev/docs/specs/

