The whole software infrastructure was a scary house of cards. They were afraid that there was unknown code that might be depending on it, for example RESTful services in other departments that were not under our immediate control.
Yes, it was exactly like that: lots of code had grown around the "bug", and it was not immediately obvious what other software had come to depend upon it. "Little hairs", as Joel might say.
Obviously you have never worked on a code base with 1000s of developers. If you edit almost every file then basically everyone needs to stop writing new code while the change is made. Otherwise the merges others have to do are going to be a disaster.
Well, then the problems with such a process are pretty well documented[1]. For comparison, at Google global refactorings are pretty common and usually painless; there are even custom tools to support such changes (push them through code review, ensure no tests are broken, etc.).
I know the Microsoft process all too painfully. RI,FI,RI,FI,RI,FI,RI,RC,RTM,Ship It,Repeat.
But as to the "usually painless" at Google: when does the pain happen?
Can you take me through the following scenarios: change a variable name, change a base class name that lots of people extend from, file renames?
How do you go about refactoring? Do everything at once? Break it into pieces? Do the file renames first, then the variable and base class renames? Or the smallest piece at a time?
Once the refactoring is complete, how do you communicate the changes to others so that when they merge the code in they don't get too messed up? Or worse, undo something in the refactoring. (Also, a follow-up: is it better to do the big refactoring so there is one big merge, or a bunch of little refactorings and lots of little merges across the spectrum?)
I guess the code change isn't the problem. Making a big change and getting people on the same page is much harder, especially when there are varying degrees of skill and experience on a project. And it's this stuff that is painful and leads to not wanting to do big refactorings at a lot of shops.
Hacker News has a short attention span and this probably won't be seen by many, but I'll try to answer your question nevertheless. There are several factors I'd like to mention.
1. Most importantly, the version control head is always the point of reference, and the burden of merging is on people who keep long-lived pending changes. This means that conflicts are resolved as soon as possible by a person who actually knows the context, instead of being postponed until a dreaded merge window. Ultimately, a programmer pursuing refactoring is only responsible for making sure it works on the head, and should announce the change so that others are prepared for merging.
2. There are some huge code bases at Google, but nowhere near the size of Windows. On the other hand, I'm sure that even Windows has to be separated into more or less decoupled components. When I doubted that you work on the same code with thousands of other programmers I was thinking in terms of components, not final products.
3. The cultural aspect shouldn't be disregarded. Code hygiene is encouraged at Google, and some people volunteer their 20% time to help with that. Moreover, there are some custom tools that make global refactorings much easier and safer.
EDIT: Eh, apologies if this sounded like chastising -- I didn't mean to. As a developer who's been trapped in "Windows-mindset" for many years, I wanted to try to inspire other Windows devs to try to use *nix-based solutions even if their only option is Windows development. Cygwin is in a very good spot right now -- it's achieved so much acceptance that even the most hardened institutions now allow it to be installed.