Do You Really Need a Staging Environment?
"Works on my machine" might not be the devil, after all
A staging environment is one of those things in software development that we take for granted. Of course you need a reliable, production-like environment where changes can be pushed and safely tested without affecting live customers.
Here’s the problem though. A “reliable, production-like environment” is fantastically hard to achieve. And that’s not even the worst part. If you get there, it only takes a couple of semi-major changes pushed to it, and the rot begins. Things start to diverge from production in ways that make it hard to predict whether what you’re testing will actually work in production.
“Oh, but we have a rock-solid automated process that keeps everything in sync”, is sometimes the response to this point.
This may be true, but I still have to ask: how expensive (in terms of time and money) is it to set up such a process? And is it even warranted within the context of a startup?
There’s no getting around it. A perfect staging environment (even if it exists):
Is expensive to maintain
Slows down continuous deployment
Encourages big bang releases
I say “even if it exists” because I don’t take it for granted that it can be achieved. There are, for example, industries where, for regulatory or other reasons, there’s no way to replicate the exact experience of a production workload. You just have to do that final test in production if you want to be absolutely sure.
What happens when you challenge the received wisdom of Staging environments? What’s an even more effective way of testing updates without negatively affecting live customers?
Here are a few ideas:
1. Embrace trunk-based development
Stop creating big branches, working on them for days on end, and then finally making a big bang release with tons of changes (and things that could go wrong).
Instead, break the task at hand into as many small, discrete sub-tasks as possible. Work on them one at a time. Ideally, each sub-task should be doable within a working day. When you’re done, push your changes directly to the main/master branch (i.e. the trunk).
This has the benefit of making each change more testable and easier to revert if something goes wrong.
2. Use feature flags
Feature flags don’t need to be complicated. There are many cool platforms out there that will help you with feature flags at scale.
But to begin with, it can be as simple as adding a condition in your code:
if (condition) {
    newBehavior();
} else {
    oldBehavior();
}

The vast majority of customer-facing changes can be safely tested directly in production under a feature flag.
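To make the bare `if` a little more concrete, here is a minimal sketch of a homegrown flag with a kill switch, an allow list, and a percentage rollout. Everything in it is an assumption made for illustration: the `FeatureFlag` shape, the `isEnabled` and `hashToBucket` helpers, and the `new-checkout` flag are hypothetical names, not the API of any feature-flag platform.

```typescript
// Hypothetical flag definition; the field names are illustrative only.
interface FeatureFlag {
  name: string;
  enabled: boolean;       // global kill switch
  rolloutPercent: number; // 0-100: share of users who get the new behavior
  allowList: string[];    // user IDs that always get it (e.g. your own team)
}

// Simple polynomial string hash, so a given user always lands in the same bucket.
function hashToBucket(input: string): number {
  let hash = 0;
  for (let i = 0; i < input.length; i++) {
    hash = (hash * 31 + input.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash % 100; // bucket in [0, 99]
}

function isEnabled(flag: FeatureFlag, userId: string): boolean {
  if (!flag.enabled) return false;                       // flag switched off for everyone
  if (flag.allowList.indexOf(userId) !== -1) return true; // internal testers first
  return hashToBucket(`${flag.name}:${userId}`) < flag.rolloutPercent;
}

// Usage: the same if/else as before, with the condition supplied by the flag.
const newCheckout: FeatureFlag = {
  name: "new-checkout",
  enabled: true,
  rolloutPercent: 5,
  allowList: ["internal-qa"],
};

function newBehavior(): string { return "new checkout"; }
function oldBehavior(): string { return "old checkout"; }

function handleRequest(userId: string): string {
  return isEnabled(newCheckout, userId) ? newBehavior() : oldBehavior();
}
```

Hashing the flag name together with the user ID means each user gets a stable answer across requests, and different flags don't all roll out to the same 5% of users. Setting `enabled` to `false` is the escape hatch if the new behavior misbehaves in production.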
3. Use a CI/CD pipeline
A robust CI/CD pipeline automates testing and deployment, ensuring that each small change is validated quickly and safely before reaching production. This continuous validation reduces the need for a dedicated staging environment by integrating quality checks throughout the development lifecycle.
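Real pipelines are normally described in the configuration format of whatever CI tool you use, but the gating logic underneath is simple enough to sketch. The following is a toy model under stated assumptions, not any CI product's API: `Stage`, `PipelineResult`, and `runPipeline` are hypothetical names, and each check is reduced to a function that returns pass or fail.

```typescript
// Hypothetical model of a pipeline: ordered checks, then a deploy step.
interface Stage {
  name: string;
  run: () => boolean; // true = check passed
}

interface PipelineResult {
  deployed: boolean;
  failedStage: string | null; // name of the first failing check, if any
  ran: string[];              // checks that were actually executed
}

// Runs checks in order and deploys only if every one of them passes.
function runPipeline(checks: Stage[], deploy: () => void): PipelineResult {
  const ran: string[] = [];
  for (const stage of checks) {
    ran.push(stage.name);
    if (!stage.run()) {
      // Stop at the first failure: the change never reaches production.
      return { deployed: false, failedStage: stage.name, ran };
    }
  }
  deploy();
  return { deployed: true, failedStage: null, ran };
}

// Usage with placeholder checks; real ones would shell out to your test runner.
const pipelineResult = runPipeline(
  [
    { name: "lint", run: () => true },
    { name: "unit tests", run: () => true },
  ],
  () => { /* hypothetical: trigger the real deployment here */ }
);
```

The point of the sketch is the early return: because each change is small, a failed check blocks exactly one small change, which is far cheaper to diagnose than a failed big bang release.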
(bonus) 4. Experiment with Chaos Engineering
The most famous example of Chaos Engineering comes from Netflix, which built Chaos Monkey to randomly terminate instances in its production AWS infrastructure and simulate failures.
It’s a powerful, simple idea. But it’s SCARY to most people. It means building your system so that it can withstand being partially broken. And then, you proceed to actually break some of its parts, to ensure that your resilience assumptions stand up to reality.
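Chaos Monkey works at the infrastructure level, but you can try the same idea at a much smaller scale inside a single service. The sketch below is an illustration of application-level fault injection, not how Chaos Monkey itself is implemented: `withChaos` and `withFallback` are hypothetical helpers, and the recommendations call is a made-up dependency.

```typescript
// Wraps a dependency call so that it fails for a configurable share of calls.
// `random` is injectable so the behavior can be made deterministic in tests.
function withChaos<T>(
  call: () => T,
  failureRate: number, // 0 = never fail, 1 = always fail
  random: () => number = Math.random
): () => T {
  return () => {
    if (random() < failureRate) {
      throw new Error("chaos: injected failure");
    }
    return call();
  };
}

// The resilience assumption under test: callers degrade instead of crashing.
function withFallback<T>(call: () => T, fallback: T): () => T {
  return () => {
    try {
      return call();
    } catch {
      return fallback;
    }
  };
}

// Usage: break a (made-up) dependency 20% of the time and check the page still renders.
const fetchRecommendations = (): string[] => ["item-1", "item-2"];
const flakyRecommendations = withChaos(fetchRecommendations, 0.2);
const safeRecommendations = withFallback(flakyRecommendations, []);
const recommendations = safeRecommendations(); // either the real list or the empty fallback
```

If wrapping a dependency like this takes the whole request down, you have found a resilience gap before a real outage found it for you.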
In conclusion
Challenging the need for a staging environment isn’t just about saving time and money. It’s about embracing a more dynamic and efficient way of building software. By focusing on small, incremental changes, using feature flags to de-risk deployments, and leveraging a solid CI/CD pipeline, you can ship features faster and with more confidence.
So, before you spin up another complex, production-like environment, ask yourself: do you really need it? Or could you be moving faster and more safely without it?

