Principles for SOLID Systems

We are the lessons we’ve learned along the way, and we try to apply them at different levels of our experience.

One of my favorite lessons as a software developer was the SOLID principles from object-oriented programming. They were never canonical rules about how to write code, but they made me a better Rubyist, not because they told me what to do, but because they made me think about why I was doing it.

Today, I find myself thinking in the same terms, but at a deeper layer of the stack: the operating system.

Some of the original SOLID names have been adapted to fit this context, and, just like before, these are not laws, only principles I personally find valuable when building robust, secure, and reproducible systems.

SRP - Single Responsibility Principle

A Single-Purpose Operating System (SPOS) has one responsibility, in contrast to general-purpose OSes like the one I’m using to write this post, which I also use for software development, video calls, entertainment, online shopping, and so on.

A SPOS is more like your ISP’s router or a smart appliance. Historically, these systems only existed in embedded environments, but we no longer have to limit ourselves to that model thanks to the advancements in containerization.

To achieve this, we slim the image down to only what is necessary. The benefits: the attack surface shrinks, audits become easier, and updates carry less risk of unintended side effects.

Smell: You maintain a single generic image that gets heavily configured at runtime for different roles. Instead, build slim, purpose-built images or extend them cleanly (see OCP).

OCP - Open for Extension, Closed for Modification

Once your base system has a single responsibility, it should remain closed for modification but open for extension.

Avoid touching the system definition; instead, extend the system through mechanisms like system extensions or bundles. Ideally, pick mechanisms that can be verified or measured.
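As a minimal sketch of what "verified" could mean here, the snippet below only accepts an extension when its SHA-256 digest matches an allowlist. The `TRUSTED_EXTENSIONS` table, the extension name, and the payload are hypothetical; a real system would more likely read the digests from a signed manifest.

```python
import hashlib
import hmac

# Hypothetical allowlist: extension name -> expected SHA-256 digest.
# Inlined here for illustration; assume it ships with (or is signed for) the base image.
TRUSTED_EXTENSIONS = {
    "firmware-nic-a": hashlib.sha256(b"nic-a firmware blob").hexdigest(),
}

def verify_extension(name: str, payload: bytes) -> bool:
    """Return True only if the payload matches the digest recorded for `name`."""
    expected = TRUSTED_EXTENSIONS.get(name)
    if expected is None:
        return False  # unknown extensions are rejected, not trusted by default
    actual = hashlib.sha256(payload).hexdigest()
    return hmac.compare_digest(actual, expected)

def select_extensions(candidates: dict[str, bytes]) -> list[str]:
    """Names of the candidate extensions that pass verification, sorted."""
    return sorted(
        name for name, payload in candidates.items()
        if verify_extension(name, payload)
    )
```

The base image never changes in this model: only the set of verified extensions layered on top of it differs from machine to machine.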

Smell: Two machines serve the same purpose but require different firmware, so you bake all firmware into the base image. Instead, keep the base sealed and extend only the firmware required for that specific system.

LSP - Lifecycle Stability Principle

Every SOLID system should maintain a stable lifecycle: creation, deployment, upgrade, and rollback should all be predictable and reversible. The system should never fail to boot after a lifecycle operation; if it does, it should automatically fall back to the last system that ran successfully.

In addition, there should be a factory reset mechanism that restores the base image and clears runtime state without re-provisioning the machine from scratch. This operation is irreversible by design. There should also be a recovery mode for debugging, but treat it like an airplane’s black box: something you know is there only for the very worst-case scenarios.

When those guarantees hold, you can trust the system even in an edge location where you cannot easily go in and investigate the state of the system.
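One way to picture the automatic fallback is a boot selector with two slots and a retry budget. This is only a sketch under assumptions of mine: the slot names, `MAX_ATTEMPTS`, and the `BootState` type are invented for illustration and do not describe any particular bootloader.

```python
from dataclasses import dataclass

MAX_ATTEMPTS = 3  # assumed retry budget before falling back

@dataclass
class BootState:
    active: str                        # slot we want to boot, e.g. "B" after an upgrade
    last_good: str                     # slot that last booted successfully
    attempts_left: int = MAX_ATTEMPTS  # remaining tries for an unconfirmed slot

def choose_slot(state: BootState) -> str:
    """Pick the slot for this boot, spending one attempt on an unconfirmed slot."""
    if state.active == state.last_good:
        return state.active  # already confirmed, nothing to count
    if state.attempts_left > 0:
        state.attempts_left -= 1
        return state.active
    # Budget exhausted: roll back to the last system that is known to work.
    state.active = state.last_good
    return state.last_good

def confirm_boot(state: BootState) -> None:
    """Called once the system reports healthy; marks the active slot as good."""
    state.last_good = state.active
    state.attempts_left = MAX_ATTEMPTS
```

If the upgraded slot never calls `confirm_boot`, the selector ends up on the previous slot without anyone having to visit the machine.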

Smell: You constantly find yourself entering recovery mode because upgrades leave the system in a broken state.

ISP - Immutable System Principle

To achieve LSP, provisioning and upgrading need to be atomic operations: either they complete or they don’t happen at all, with no in-between state. This is hard to achieve with a general-purpose OS (GPOS), where an upgrade can fail partway through a list of packages and the system still boots. While that might sound like a positive thing, it is not, because all of a sudden you find yourself in an unknown state. Whether your applications work with this set of packages is a matter of luck, not something you have previously tested. You are likely to have this kind of problem even if you’re using a configuration management system.

Speaking of configuration management systems (CMS), we also need to address the elephant in the room. While these systems are meant to follow a recipe that leaves every system looking the same, they have a design flaw: they depend on the upstream source not changing, which we know is not the case. In practical terms, declaring that you want a certain package present in your system is not SOLID, because running the process at different moments might give you different results. Even if you pin the version number, the dependencies could change at the source. The worst part is how they have been sold as the final solution for managing “cattle”, leaving companies thinking they are secure and auditable when this is just smoke and mirrors.

The only way out of this rat race is immutability.
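The all-or-nothing upgrade described at the start of this section can be approximated in a few lines: prepare the new image next to the old one, then flip a single pointer. The sketch below assumes a POSIX filesystem and a hypothetical layout where `current` is a symlink to the active image directory; the directory names are made up.

```python
import os
import tempfile

def activate_image(root: str, image_name: str) -> None:
    """Atomically point `root/current` at `image_name`.

    The new symlink is created under a temporary name and then renamed over
    the old one. On POSIX the rename is atomic, so readers see either the old
    target or the new one, never a half-written state.
    """
    link = os.path.join(root, "current")
    tmp = os.path.join(root, f".current.tmp.{os.getpid()}")
    if os.path.lexists(tmp):
        os.remove(tmp)          # clean up a leftover from an interrupted run
    os.symlink(image_name, tmp)
    os.replace(tmp, link)       # the single step that makes the switch visible

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for name in ("image-v1", "image-v2"):  # hypothetical image directories
            os.mkdir(os.path.join(root, name))
        activate_image(root, "image-v1")
        activate_image(root, "image-v2")
        print(os.readlink(os.path.join(root, "current")))
```

If the process dies before `os.replace`, `current` still points at the old image; if it dies after, it points at the new one. There is no third outcome to debug.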

Smell: After provisioning a fleet, you find that at least one of the machines differs, even if only slightly, from the others.
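A rough way to detect this smell is to fingerprint what each machine actually runs and compare the fingerprints. The sketch assumes each host can report a package-to-version manifest; the host names, the package versions, and the "majority wins" rule are hypothetical choices for illustration.

```python
import hashlib
from collections import defaultdict

def image_fingerprint(manifest: dict[str, str]) -> str:
    """SHA-256 over the manifest, sorted by package name so order does not matter."""
    digest = hashlib.sha256()
    for name in sorted(manifest):
        digest.update(f"{name}={manifest[name]}\n".encode())
    return digest.hexdigest()

def find_drift(fleet: dict[str, dict[str, str]]) -> list[str]:
    """Hosts whose fingerprint differs from the most common one in the fleet."""
    groups: dict[str, list[str]] = defaultdict(list)
    for host, manifest in fleet.items():
        groups[image_fingerprint(manifest)].append(host)
    majority = max(groups, key=lambda fp: len(groups[fp]))
    return sorted(
        host for fp, hosts in groups.items() if fp != majority for host in hosts
    )

fleet = {
    "edge-1": {"openssl": "3.0.13", "busybox": "1.36.1"},
    "edge-2": {"busybox": "1.36.1", "openssl": "3.0.13"},
    "edge-3": {"openssl": "3.0.14", "busybox": "1.36.1"},  # drifted dependency
}
print(find_drift(fleet))
```

With immutable images the fingerprint is simply the image digest, and this check should never report anything.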

DIP - Dependency Inversion Principle

The immutable core should depend on stable abstractions, not on external, changing details.
A change at the outer layer, say, configuration, shouldn’t require rebuilding the entire image.

Immutability isn’t about freezing everything, it’s about controlling where change is allowed to happen. When a system’s base layer is immutable, changes are isolated to clearly defined areas (data partitions, overlays, or extensions). Overlay partitions can even reset drift automatically—any unauthorized change disappears on reboot because the base image is reapplied. And because updates land through fresh images, frequent reboots become an asset: there is no prize for the longest-running, unpatched node when every restart brings you back to a known-good state. This separation keeps the core reproducible and secure while still allowing flexibility where it’s needed.

The system layout should reflect that principle: rebuild only when base components change (kernel, security), while higher-level changes happen through configuration or declarative overlays.
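The layering can be modeled in miniature with plain mappings: a read-only base, a declarative configuration layer, and a volatile overlay that is thrown away on every boot. The keys and values below are hypothetical, and a real system would do this with partitions or overlay filesystems rather than dictionaries.

```python
from collections import ChainMap
from types import MappingProxyType

# Baked into the image: a read-only view, so writes raise TypeError.
BASE = MappingProxyType({
    "image": "base-v1",
    "ssh": "disabled",
    "log_level": "info",
})

def boot(declared_config: dict[str, str]) -> ChainMap:
    """Build a fresh view of the system: volatile overlay, then config, then base.

    Lookups fall through the layers from left to right, and writes only ever
    land in the first (volatile) layer, which starts empty on every boot.
    """
    return ChainMap({}, declared_config, BASE)

declared = {"log_level": "debug"}   # environment-specific, no image rebuild needed
system = boot(declared)
system["ssh"] = "enabled"           # runtime drift lands in the volatile overlay
drifted = system["ssh"]

system = boot(declared)             # "reboot": the overlay is discarded
print(drifted, "->", system["ssh"])
```

Changing `declared` changes behavior without touching `BASE`, which is the dependency direction this principle asks for.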

Smell: You tweak a config file and suddenly need to rebuild the whole OS image. That’s an inversion failure: the dependency direction is wrong.

Conclusion

By reinterpreting SOLID through this lens, we can design systems that are stable at their core yet flexible at their edges, systems we can automate through our pipelines just like we already do with code, and that remain robust enough to trust in production, whether in a datacenter or at the edge.

To put the principles into practice:

  • Start by isolating the single purpose of each image, even if it means producing more artifacts.
  • Lock the base layer, then expose controlled extension points for firmware, drivers, or workloads.
  • Make lifecycle guarantees explicit: document the upgrade path, rollback trigger, and recovery tool.
  • Keep mutability quarantined to known surfaces so audits and drift detection stay tractable.
  • Route environment-specific details through declarative inputs instead of image rebuilds.

Just like SOLID once made our codebases easier to maintain, these principles can do the same for our systems, helping us build a world of software and infrastructure that’s truly SOLID—in name and execution.
