
One of the reasons I got out of network engineering was how frequently the work I was required to do would cause unintended consequences. You can do all your due diligence, get your work blessed by vendor support, and still get blown up by a bug or undocumented behaviors on a regular basis. The conspiratorial part of my brain says these network device makers intentionally provide unreliable software and terrible documentation to bolster their support contract profits. I was just the guy typing in the commands and getting all the blame.


I remember the first time I got access to an employer's production Cisco router. It’s pretty scary how easy it is to majorly fuck something up.

There isn’t a concept of a transaction or a rollback. You just enter a command, press enter and it’s live.

To counter this, we’d write out all the commands we planned to execute and have them peer reviewed. Nothing was to be done “on the fly” (at least in theory).

In short, coming from a developer perspective with ample version controls and gated releases… networking is a very wild ride.


> There isn’t a concept of a transaction or a rollback.

Yeah, Cisco gear is bonkers.

Mikrotik has "Safe Mode", which undoes all commands since you entered "Safe Mode" if the connection that created the shell gets interrupted. It has saved my bacon on several occasions, but there are several obvious situations in which you can get yourself locked out.
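
For anyone who hasn't used it, the idea can be sketched as an undo journal that gets replayed when the session drops. This is only a toy model in Python, not how RouterOS implements it; `SafeModeSession`, `connection_lost` and the config keys are all made-up names for illustration.

```python
class SafeModeSession:
    """Toy model of a safe-mode shell: changes go live at once, but are journaled."""

    def __init__(self, config):
        self.config = config   # the live config dict, mutated immediately
        self.undo_log = []     # one (key, existed_before, old_value) entry per change

    def set(self, key, value):
        # Record how to undo the change, then apply it right away.
        self.undo_log.append((key, key in self.config, self.config.get(key)))
        self.config[key] = value

    def exit_cleanly(self):
        # Leaving safe mode on purpose keeps the changes and drops the journal.
        self.undo_log.clear()

    def connection_lost(self):
        # The session died: undo every journaled change, newest first.
        while self.undo_log:
            key, existed_before, old_value = self.undo_log.pop()
            if existed_before:
                self.config[key] = old_value
            else:
                self.config.pop(key, None)


live = {"ether1": "up"}
session = SafeModeSession(live)
session.set("ether1", "down")          # oops, that was the uplink
session.set("vlan10", "10.0.0.1/24")
session.connection_lost()              # the shell's connection drops
print(live)
```

The lockout cases come from the trigger: the undo only fires if the session that made the change actually dies.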

Juniper gear has "commit confirmed $NUMBER_OF_MINUTES", which will roll back everything since your last commit if you don't do a "commit" within $NUMBER_OF_MINUTES. It will also apply all of the changes you've staged all at once (and do configuration sanity checking before it performs the commit).
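
A rough Python sketch of that flow, for readers coming from the dev side: changes are staged in a candidate config, checked, applied all at once, and reverted if no confirming commit arrives in time. This is a simplified model, not Junos internals. The `Device` class, the `mgmt_ip` sanity check, the explicit `now` clock and the choice to reset the candidate after a rollback are all assumptions of the sketch.

```python
class Device:
    """Toy model of staged config with a confirm-or-roll-back commit."""

    def __init__(self, active):
        self.active = dict(active)      # what the box is actually running
        self.candidate = dict(active)   # staged edits, not live yet
        self.previous = None            # saved config to fall back to
        self.deadline = None            # time (seconds) by which to confirm

    def set(self, key, value):
        self.candidate[key] = value     # staged only, nothing changes on the wire

    def _sanity_check(self):
        # Hypothetical check: refuse a config without a management address.
        if not self.candidate.get("mgmt_ip"):
            raise ValueError("candidate config has no mgmt_ip")

    def commit_confirmed(self, minutes, now):
        self._sanity_check()
        self.previous = dict(self.active)
        self.active = dict(self.candidate)   # all staged changes apply together
        self.deadline = now + minutes * 60

    def commit(self):
        self._sanity_check()
        self.active = dict(self.candidate)
        self.previous = None                 # confirmed: cancel the pending rollback
        self.deadline = None

    def tick(self, now):
        # Called as time passes; rolls back if the confirm window expired.
        if self.deadline is not None and now >= self.deadline:
            self.active = self.previous
            self.candidate = dict(self.previous)  # simplification of this sketch
            self.previous = None
            self.deadline = None


router = Device({"mgmt_ip": "192.0.2.1", "mtu": 1500})
router.set("mtu", 9000)
router.commit_confirmed(5, now=0)   # live now, but on probation for 5 minutes
router.tick(now=300)                # nobody confirmed, so the change is reverted
print(router.active)
```

Passing `now` in by hand instead of using a real timer keeps the sketch deterministic.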

I have no idea how Juniper's rollback works when multiple users are doing simultaneous config editing... maybe don't do that?


> I have no idea how Juniper's rollback works when multiple users are doing simultaneous config editing... maybe don't do that?

You get a warning:

    Users currently editing the configuration:
      bob terminal p0....

But the failure here is actually SSHing to a network switch in the first place.

Some Cisco kit has RESTCONF, which is better for automation, but it's buggy.


Modern router operating systems have this.

It’s been a long time since I’ve touched IOS-XE (Cisco enterprise gear), but Cisco IOS-XR, Junos, Arista EOS and the Nokia SRs all support some combination of configuration transactions with rollback and commit confirm on a timer.

This definitely doesn’t stop you shooting yourself in the foot, similar to how you can still push broken config to a k8s controller, but it’s some level of protection for certain types of changes.


>"There isn’t a concept of a transaction or a rollback. You just enter a command, press enter and it’s live."

This hasn't been true for a very long time. Juniper routers have rollbacks, commits and revisions:

https://www.juniper.net/documentation/us/en/software/junos/c...

and

https://www.juniper.net/documentation/us/en/software/junos/c...

Cisco has similar:

https://www.cisco.com/c/en/us/td/docs/ios/ios_xe/fundamental...


Except Cisco doesn’t have a commit feature in any of their OSes, and the rollback feature is not implemented everywhere either: NX-OS doesn’t have it, for example. Still, it’s better than the ‘reload in 5’ that we had to use back then.


That's not entirely true: you can roll back a change on modern switches/routers, either via a rollback command or with a revert timer (configure terminal revert timer X). The timer matters because the new configuration might have made the router unreachable, so you're never sure you'll be able to roll back manually if you're working remotely.


Interesting. There's also some stuff in Cisco that can't be done both atomically and remotely, so you may have to push a change as a file to the router and then source the file into the running config with some permutation of `copy`.


Hadn't thought about it from the perspective of support contract profits, but they also have their friendship stick firmly planted in technicians via the semi-required training since, as you indicate, the manuals are deficient.

At some point network vendors switched manuals from engineers documenting features whitebox to educated techs documenting features blackbox.

There's a clear transition for docs produced after 2008, prior to which more care went into tech notes and interpreting technologies -- after that, you're lucky to even get a complete set of steps and caveats without having to cross-reference bugs, release notes, old manuals, new manuals, draft manuals, reference manuals, licensing manuals, the inevitable errors that appear in the logs, and of course the configuration guide where this should all be in the first place.

In short, yes, this.


> The conspiratorial part of my brain says these network device makers intentionally provide unreliable software and terrible documentation to bolster their support contract profits.

As a dev who has worked at one of the major networking vendors, I can assure you that is not the case. You’d be surprised by how major bugs are handled internally, especially if the bug affects “important” customers.


Networking and storage changes are always butt-clenching affairs. Way more stressful than anything else in IT due to their blast radius if something shits the bed.



