Martin Fowler says in his article:
Release flags are a useful technique and lots of teams use them. However, they should be your last choice when you're dealing with putting features into production.
So what’s wrong with feature flags? Among other things:
They give PMs an excuse to not make hard decisions, such as completely removing a feature.
The codebase becomes more complex and harder to maintain.
Testing becomes harder (and lower quality), because you have to figure out which combinations of feature flags need to be supported.
Before I go on, the usual disclaimer: It's just the opinion of one guy, based on my own experience.
What are Feature Flags / Feature Toggles
Feature flags started as a way to separate feature releases from code deployment. As a developer, you can push unfinished or untested code to the main/master/trunk branch without any effect on the customer. Later on, once the code is ready, you can just toggle the setting and make the feature live.
In a very simplified version, it can look like:
if (isNewUIFeatureFlagEnabled) {
  return newUIComponent;
} else {
  return oldUIComponent;
}
Pete Hodgson wrote an AMAZING technical article about feature flags, on Martin Fowler’s website. Any quote below is from that article.
Feature Flags as a tool for PMs
I’ll use the terms flag and toggle interchangeably: I’m used to saying feature flags, but feature toggles is also common.
As we saw, the simplest use of Feature Flags is as Release Toggles - to “turn on” code that is hidden behind the toggle.
I wish it stayed that simple. Once you introduce the Feature Flags capability, PMs will come up with other ideas for using it:
Why should we rely on developers for the configuration change? Let’s move it to somewhere the PMs can access, and then we’ll be able to do it without bothering anyone.
While we are at it, let’s also make it adjustable per user! That way, the PMs can safely release the feature for a couple of users to gather feedback and test it themselves, and only afterwards release it to everyone.
This is similar to a Canary Release. The difference between them is that a Canary Released feature is exposed to a randomly selected cohort of users while here, the feature is exposed to a specific set of users.
And why limit ourselves to using it for turning features on? Let’s put the most resource-heavy feature behind one, and let the Ops people turn it off in case of an unusual overload.
Oh and we have those premium users, let’s have some features enabled only for them!
One last thing - you can put everything under a feature flag, right? So let’s not risk any changes without it, please hide any bug fix under a flag, in case the fix needs to be reverted.
Thus you find yourself in deep shit, with hundreds of active feature flags.
The suggestions above are not ALL bad; most of them make sense. The problem starts when feature flags become the answer to everything.
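The difference between exposing a feature to specific users and a Canary Release, mentioned above, can be sketched in a few lines of TypeScript. This is only an illustration: every name here (`isEnabledForAllowlist`, `isEnabledForCanary`, `bucketOf`) is made up for the example and is not the API of any feature flag product. The canary cohort is picked with a simple deterministic hash, which is one assumed way to get a stable pseudo-random selection.

```typescript
// Hypothetical names throughout; not taken from any particular library.
type User = { id: string };

// Targeted rollout: the feature is on only for an explicit set of users.
function isEnabledForAllowlist(user: User, allowlist: ReadonlySet<string>): boolean {
  return allowlist.has(user.id);
}

// Small deterministic string hash (FNV-1a style), mapped to a bucket from 0 to 99.
function bucketOf(userId: string): number {
  let hash = 2166136261;
  for (let i = 0; i < userId.length; i++) {
    hash ^= userId.charCodeAt(i);
    hash = Math.imul(hash, 16777619);
  }
  return (hash >>> 0) % 100;
}

// Canary-style rollout: roughly `percent`% of users, chosen by hash rather than by name.
// The same user always lands in the same bucket, so their experience stays stable.
function isEnabledForCanary(user: User, percent: number): boolean {
  return bucketOf(user.id) < percent;
}

const betaTesters = new Set(["alice", "bob"]);
console.log(isEnabledForAllowlist({ id: "alice" }, betaTesters)); // true
console.log(isEnabledForAllowlist({ id: "carol" }, betaTesters)); // false
console.log(isEnabledForCanary({ id: "carol" }, 10)); // depends on carol's bucket
```

Note how the allowlist version needs someone to maintain the list of users, which is exactly the kind of configuration that tends to move from developers to PMs.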
What’s wrong with the feature flags addiction
Feature Flags have a tendency to multiply rapidly, particularly when first introduced. They are useful and cheap to create and so often a lot are created. However toggles do come with a carrying cost. Knight Capital Group's $460 million dollar mistake serves as a cautionary tale on what can go wrong when you don't manage your feature flags correctly (amongst other things).
So what’s that ‘carrying cost’?
1. The codebase becomes more complex
Trying to understand a code file with multiple feature flags is almost impossible, particularly for newer developers. You usually don’t have the context of why each feature flag exists and what it represents, so you are left wandering around and making guesses.
Debugging also becomes harder. You get a ticket for some bug, and can’t reproduce it no matter what. After a day, you remember that you should have turned on that old feature flag the user has…
2. You waste time on dead code
Remember the initial goal?
Release Toggles allow incomplete and un-tested codepaths to be shipped to production as latent code which may never be turned on.
You develop a feature that was not released, and your code stays latent forever.
Developers end up supporting that code. I’ve seen a case where a developer spent a day refactoring code (as part of a bigger refactor) under a feature flag that was never released!
3. Testing becomes harder (and lower quality)
A single feature flag doubles the testing burden - for each change, you need to test the feature both with and without the feature flag (assuming you don’t have 100% e2e test coverage).
Ideally, each feature flag is independent and lives in one place, where it either hides or shows a feature. In practice, flags often collide. So with multiple flags in the same file, you might have ‘a combinatoric explosion of possible toggle states’.
In reality, there is no need to test every possible combination, but sometimes it’s not that easy to figure out which combinations you SHOULD test.
…So feature flags become a substitute for testing - “Well no worries, if we find any problems, we’ll just turn it off and fix it afterwards”.
Which results in an awful experience for the customers.
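To make the ‘combinatoric explosion’ mentioned above concrete, here is a minimal TypeScript sketch that enumerates every on/off combination for a list of flags. The function name `allFlagStates` and the flag names are hypothetical, chosen only for this example. With n independent boolean flags in the same code path there are 2^n possible states.

```typescript
// Enumerate every on/off combination for a list of flag names.
function allFlagStates(flagNames: string[]): Record<string, boolean>[] {
  const states: Record<string, boolean>[] = [];
  const total = 2 ** flagNames.length;
  for (let mask = 0; mask < total; mask++) {
    const state: Record<string, boolean> = {};
    flagNames.forEach((flag, i) => {
      // Bit i of the mask decides whether flag i is on in this state.
      state[flag] = (mask & (1 << i)) !== 0;
    });
    states.push(state);
  }
  return states;
}

const exampleFlags = ["newUI", "fastCheckout", "betaSearch"]; // hypothetical flag names
console.log(allFlagStates(exampleFlags).length); // 8
console.log(2 ** 10); // 1024 states for ten flags in the same code path
```

You rarely need to test all of them, but a list like this can be a starting point for deciding which combinations are actually reachable and worth covering.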
4. Hard decisions are not taken
Not everything should be under a feature flag. If you do a bug fix, it’s ok to just test and release it… If it creates problems - that’s why we have source control, it should take a few minutes at most to revert a PR.
Yes, this can happen sometimes: [image: Pablo Escobar meme]
The same goes for old features that no longer make sense to support. Before feature flags, you would just delete them. Now, it’s too easy to turn that feature flag off, and ‘someday’, if nobody complains, you’ll finally be allowed to delete it (I’ll spare you the second Pablo Escobar meme of ‘a developer waiting for permissions to delete a feature’).
How can you deal with it
It can be tempting to lump all feature toggles into the same bucket, but this is a dangerous path. The design forces at play for different categories of toggles are quite different and managing them all in the same way can lead to pain down the road.
First, align your organization with the fact that feature flags have different categories!
Hodgson talks about 4 main ones:
Release toggles
Ops toggles
Experiment toggles
Permissions toggles
Each type of toggle should live for a different period of time, and has different requirements for changing it:
1. Release toggles
This is the one we started with: it allows developers to ship unfinished code to production and turn it on once it’s ready.
Who is responsible for toggling it: Developers. These toggles can live in a configuration file.
When should you delete it: A week or two after the release of the feature.
2. Ops toggles
Remember the resource-heavy feature we wanted to let our Ops team control?
We might introduce an Ops Toggle when rolling out a new feature which has unclear performance implications so that system operators can disable or degrade that feature quickly in production if needed.
Who is responsible for toggling it: your Ops team - probably through a dedicated tool.
When should you delete it: After 1-2 weeks, once you gain confidence in the new feature. In rare cases, you might keep some “Kill Switches”, to stop non-critical functionality when the system is under unusually high load.
3. Experiment toggles
This type of toggle allows you to perform A/B testing. You divide the users into groups, each getting a different experience. While the toggle is active, you collect enough data to make the final decision.
Who is responsible for toggling it: It varies, but mostly PMs.
When should you delete it: After a few days or weeks, once you have enough data to make a final decision.
4. Permissions toggles
I left the most dangerous for last. This type of toggle allows you to change the features and experiences specific users have on your website. It may be through having “premium” features for paying customers, or “alpha” / “beta” features, for internal users / beta customers.
This type is dangerous because it needs to be very long-lived, sometimes years. Treating it the same way as the other toggles, as something that will be deleted soon, is a critical and common mistake.
Who is responsible for toggling it: PMs
When should you delete it: You will probably not be the one deleting it… Some poor developer will do it in a few years.
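One way to make the four categories above explicit is to tag each toggle with its category and owner, and derive the expected lifetime from the category. The TypeScript below is a minimal sketch under stated assumptions: all type and field names (`ToggleDefinition`, `liveSince`, `isOverdue`) are hypothetical, and the day limits are a rough reading of the lifetimes above (the 30 days for experiments in particular is an assumption, standing in for "a few days/weeks").

```typescript
type ToggleCategory = "release" | "ops" | "experiment" | "permission";

interface ToggleDefinition {
  key: string;
  category: ToggleCategory;
  owner: "developers" | "ops" | "pm";
  // Date from which the clock runs, e.g. the release date of the feature (assumption).
  liveSince: Date;
}

// Rough maximum lifetimes in days; null means the toggle is expected to be long-lived.
const maxLifetimeDays: Record<ToggleCategory, number | null> = {
  release: 14,
  ops: 14,
  experiment: 30, // assumption, see the lead-in
  permission: null,
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True when a toggle has outlived the lifetime expected for its category.
function isOverdue(toggle: ToggleDefinition, now: Date): boolean {
  const limit = maxLifetimeDays[toggle.category];
  if (limit === null) return false;
  const ageDays = (now.getTime() - toggle.liveSince.getTime()) / MS_PER_DAY;
  return ageDays > limit;
}

const toggles: ToggleDefinition[] = [
  { key: "new-ui", category: "release", owner: "developers", liveSince: new Date("2024-01-01") },
  { key: "premium-reports", category: "permission", owner: "pm", liveSince: new Date("2022-01-01") },
];
const today = new Date("2024-03-01");
// Only "new-ui" is reported: the permission toggle is allowed to live for years.
console.log(toggles.filter((t) => isOverdue(t, today)).map((t) => t.key));
```

The useful part is less the code than the fact that the category is written down somewhere both developers and PMs can see it.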
Manage different toggles differently
After everyone is aligned on the categories, discuss how each of them should be managed in your company. Remember that as a feature goes through its lifecycle, it can also move between categories!
Initially, we’ll place it under a Release Toggle while it is under development. Then, we might move it to be behind an Experiment Toggle to measure the effect on the revenue. Finally, once we decide to go with it, we might move it behind an Ops Toggle so that we can turn it off when we're under extreme load.
Hodgson suggests:
…From a feature flag management perspective, these transitions absolutely should have an impact. As part of transitioning from a Release Toggle to an Experiment Toggle the way the toggle is configured will change, and likely move to a different area - perhaps into an Admin UI rather than a YAML file in source control.
Product folks will likely now manage the configuration rather than developers. Likewise, the transition from Experiment Toggle to Ops Toggle will mean another change in how the toggle is configured, where that configuration lives, and who manages the configuration.
But even if you manage all of them in the exact same way - it’s important to acknowledge the existence of different types, if only to align expectations between developers and PMs.
When a developer knows that a feature flag is a Permissions Toggle, and will need to stay forever, they will design it very differently from a Release Toggle that should be quickly deleted.
How to make sure feature flags are removed
This is the biggest challenge of working with feature flags. If you manage to keep the number under control, you’ve solved 80% of the problem - and it’s YOUR job to solve it.
The common approach (which rarely works) is to add a ticket to remove the feature flag as soon as you introduce one.
A better approach is to hold yourself accountable for it. From the article:
Some teams put "expiration dates" on their toggles.
Others go as far as creating "time bombs" which will fail a test (or even refuse to start an application!) if a feature flag is still around after its expiration date.
Another option is placing a limit on the number of feature flags a system is allowed to have at any one time. Once that limit is reached, if someone wants to add a new toggle they will first need to do the work to remove an existing flag.
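The "expiration date", "time bomb" and flag limit ideas above could be combined into one small check. The following TypeScript is a sketch, not how any specific team or tool does it: `ExpiringFlag`, `checkFlags`, `assertFlagsHealthy` and the limit of 20 are all hypothetical and only illustrate the technique.

```typescript
interface ExpiringFlag {
  key: string;
  expiresOn: Date; // agreed removal date for the flag
}

const MAX_ACTIVE_FLAGS = 20; // hypothetical budget, pick your own

// Returns a list of problems; an empty list means the flag set passes.
function checkFlags(
  activeFlags: ExpiringFlag[],
  now: Date,
  maxActive: number = MAX_ACTIVE_FLAGS
): string[] {
  const problems: string[] = [];
  for (const flag of activeFlags) {
    if (now.getTime() > flag.expiresOn.getTime()) {
      const date = flag.expiresOn.toISOString().slice(0, 10);
      problems.push(`Flag "${flag.key}" expired on ${date}`);
    }
  }
  if (activeFlags.length > maxActive) {
    problems.push(`${activeFlags.length} flags registered, limit is ${maxActive}`);
  }
  return problems;
}

// The "time bomb": call this from a test, or at startup if you want to be strict.
function assertFlagsHealthy(activeFlags: ExpiringFlag[], now: Date = new Date()): void {
  const problems = checkFlags(activeFlags, now);
  if (problems.length > 0) {
    throw new Error(problems.join("; "));
  }
}

const registeredFlags: ExpiringFlag[] = [
  { key: "new-ui", expiresOn: new Date("2024-01-15") },
  { key: "fast-checkout", expiresOn: new Date("2024-06-01") },
];
// Prints one problem, for "new-ui", which expired before the date being checked.
console.log(checkFlags(registeredFlags, new Date("2024-03-01")));
```

Failing a test is the gentler option. Refusing to start the application is a much stronger statement, so agree on it with the team before you wire it in.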
Final Words
Contrary to the clickbait title - I think that feature flags are a GOOD thing. They are just another powerful tool at our disposal, and we need to make sure it’s not overused.
A note on implementing a feature flags system: don’t reinvent the wheel. There is the OpenFeature standard, which became a CNCF incubating project a few months ago.