Using flags to ease new feature development

Greg Slovacek

Commit early and often. It’s an oft-recited adage. Ideally, every large change can be broken down into smaller changes that can each be committed quickly: at least once per day, and preferably several times. Large, uncommitted changes in one’s personal workspace cause lots of well-known pain. [1]

However, there are often too many interdependent moving parts that need to be modified and tested together to allow you to commit something working and complete into the trunk on a reasonable time scale. Even when using a sophisticated source control tool like git, dealing with remote branches is cumbersome and can create hellish merge conflicts.

At Asana, we have found that a fairly simple device can go a long way toward alleviating this pain: flags.

Flags at Asana

A flag system allows customization of the application’s overall behavior, the way command-line flags do for Unix commands. Like a switch that turns portions of the code on and off, a boolean flag gives you a control mechanism to enable or disable new features as you write them. For a complex client-server application like Asana, both server and client code need to know the values of flags for the running session.

There are many benefits from a properly-designed flag system. For Asana, the flag system makes the following things easier:

  • No remote branches. We can commit as often as we like to the main line if the code we’re changing is behind a flag. The feature doesn’t have to be complete—it doesn’t even have to work!—it just has to pass whatever tests we wrote.
  • Staged rollout. We can turn flags on for individual users or companies. We start with ourselves (“dogfooding”), then roll out to a handful of customers, then launch to all users. This is a powerful tool for staged rollouts and lower-risk experiments of new behavior.
  • Instant rollback. If we discover a problem with a recently launched feature, we can turn it off and be confident everything still works.
  • Testing. We run automated tests hourly, once with all flags off and once with them on. This gives unlaunched, unfinished features the same resilience to breakage as the rest of our code.
  • A/B testing. Our flags are a natural fit for this.
  • Collaboration. If a teammate needs our code we can share it, behind a flag, without needing a separate branch. Our designer can enable that flag and see how the feature is progressing, providing early feedback.
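To make the staged rollout and instant rollback items concrete, here is a minimal sketch of how a flag value might be resolved for one session. It is not Asana's implementation: `resolveFlag`, the `flagConfig` layout, and the user and company IDs are all hypothetical names chosen for illustration. The assumed rule is that a per-user override wins over a per-company override, which wins over the global default.

```javascript
// Hypothetical sketch, not Asana's implementation.
// Assumed precedence: user override > company override > global default.
const flagConfig = {
  enable_attachments: {
    defaultValue: false,          // not yet launched to all users
    companies: { asana: true },   // dogfooding: on for our own company
    users: { "user-42": true }    // a hand-picked early customer
  }
};

function hasOwn(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Return the value of flag `name` for the given session.
function resolveFlag(config, name, session) {
  if (!hasOwn(config, name)) {
    throw new Error("Unknown flag: " + name);
  }
  const flag = config[name];
  if (hasOwn(flag.users, session.userId)) {
    return flag.users[session.userId];
  }
  if (hasOwn(flag.companies, session.companyId)) {
    return flag.companies[session.companyId];
  }
  return flag.defaultValue;
}

const employee = { userId: "user-1", companyId: "asana" };
const earlyCustomer = { userId: "user-42", companyId: "acme" };
const everyoneElse = { userId: "user-99", companyId: "acme" };

for (const session of [employee, earlyCustomer, everyoneElse]) {
  console.log(session.userId, resolveFlag(flagConfig, "enable_attachments", session));
}

// Launch to all users by flipping the default...
flagConfig.enable_attachments.defaultValue = true;
// ...and roll back the launch by flipping it off again.
flagConfig.enable_attachments.defaultValue = false;
```

In this sketch, launching and rolling back are both a single data change, with no code deploy involved. That is the property the list above relies on.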

Usage

Declaring flags is easy! We just need to include a statement like the following in any one of our source files:

// Boolean flag to enable the attachments feature
FlagSystem.defineBool({
  name: "enable_attachments",
  help: "Show an attachments section in the property sheet.",
  is_launched: false
});

And using the flag in code couldn’t be simpler—it’s as easy as:

if (Flags.enable_attachments) {
   ...
}
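For readers curious how a declaration and a lookup like the ones above could fit together, here is a minimal sketch. The post does not show the internals of Asana's flag system, so everything below is an assumption: the `definitions` map, treating `is_launched` as the flag's default value, and the `setAllForTesting` and `reset` helpers are hypothetical. The loop at the end mimics the "once with all flags off and once with them on" test runs described earlier.

```javascript
// Hypothetical sketch, not Asana's implementation.
const definitions = new Map(); // flag name -> { help, is_launched }
const Flags = {};              // current values, read as Flags.<name>

const FlagSystem = {
  // Register a boolean flag. Assumption: is_launched doubles as the default value.
  defineBool({ name, help, is_launched }) {
    if (definitions.has(name)) {
      throw new Error("Flag defined twice: " + name);
    }
    definitions.set(name, { help, is_launched });
    Flags[name] = is_launched;
  },

  // Assumed helper: force every flag to the same value for a test run.
  setAllForTesting(value) {
    for (const name of definitions.keys()) {
      Flags[name] = value;
    }
  },

  // Assumed helper: restore every flag to its declared default.
  reset() {
    for (const [name, definition] of definitions) {
      Flags[name] = definition.is_launched;
    }
  }
};

FlagSystem.defineBool({
  name: "enable_attachments",
  help: "Show an attachments section in the property sheet.",
  is_launched: false
});

// Example feature code guarded by the flag (section names are made up).
function propertySheetSections() {
  const sections = ["description", "comments"];
  if (Flags.enable_attachments) {
    sections.push("attachments");
  }
  return sections;
}

// Exercise the same code with all flags off, then with all flags on.
for (const value of [false, true]) {
  FlagSystem.setAllForTesting(value);
  console.log("flags " + (value ? "on" : "off") + ":", propertySheetSections().join(", "));
}
FlagSystem.reset();
```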

Conclusion

Flags aren’t the right fit for every large change. Sweeping changes that touch huge amounts of the code (like a visual redesign) result in too many places to introduce conditionals. Changes involving data migrations must be handled with care.

But in general, flags have been a very useful tool at Asana, enabling us to stay nimble in our development and avoid the many problems with large changes. Sometimes there are little solutions to big problems.

Have any interesting problems you’ve leveraged your flag system to solve? Feel free to leave a comment!

  [1] For unfamiliar readers, here are some of the pain points of large changes:

    • Increased liability. Uncommitted code is “invisible.” It creates assumptions about the code that the rest of the team is unaware of. Teammates can break those assumptions with their changes, causing you pain in various forms: merge conflicts, test breakage, naming mismatches, and more subtle defects, all of which further delay the commit and facilitate yet more breakage.
    • Missed opportunities for collaboration. Your teammates cannot help with, review, or leverage any intermediate work you’re doing if they have to wait until it’s all done before seeing it.
    • Snowballing change size. Refactorings and tangential fixes you make during your change will depend on the newer code. You can’t commit these unless you commit everything, so your change snowballs.
    • Difficulty of review. For organizations that employ code reviews, a large review requires someone keep a lot of state in their head, obscures the thinking and motivation behind details of the code, and can be very daunting. A single two-hundred-line review is more difficult than two one-hundred-line reviews, and either productivity or review quality suffers.
    The list goes on and on.
  1. Jack Stahl, Asana Team Member

    As with all techniques, this is something we’ve adopted from our past experiences at other companies and known best practices. For example, Flickr has a very similar blog post here: http://code.flickr.com/blog/2009/12/02/flipping-out/.

    One thing that’s nice about our Flag system (that it doesn’t look like Flickr’s system had, at least at the time of their blog post) is that we can actually set Flags on a per-company or per-user basis very easily from within the product itself. As Greg alluded to, this makes things like staged rollouts or A/B testing seamless with the development cycle.

    1. Greg Slovacek, Asana Team Member

      It is true that flags leave a little bit of cruft in the system. Once a feature is confidently launched and stable, we consider removing its flag. Often the flags are not too intrusive, so we just do a quick sweep of the codebase periodically (every several months or so) and clean out any old ones.

  2. David E. Weekly

    We’ve been using this technique very effectively at PBworks for nearly half a decade. You do need to periodically revisit unused flags and tear out those codepaths, but it’s a great way to keep everyone on one branch and do “selective rollouts” for early customers.

    1. Greg Slovacek, Asana Team Member

      Excellent question, Charlie. This can be a tricky problem, and the solutions are highly dependent on the makeup of your datastore. In general, we solve it by organizing the work such that we can write the necessary migrations first, in a way that’s compatible with old and new code. This may mean two-phase migrations, i.e. migrating first to an intermediate representation that’s compatible with both, then after you’ve fully switched on the flag and everything is stable, migrate to a cleaner / more optimal representation.

      It’s helpful to have a flexible schema, so that for example old code can deal properly with fields it doesn’t care about, either by ignoring them or setting default values for them as appropriate.

      You may also solve this by taking on some temporary additional complexity in the code, so that the code without the flag performs operations in a way that’s compatible with code both with and without the flag. For example, let’s say your app allows users to delete objects in the system, and you achieve this by nulling out all pointers to them so they become inaccessible. Suppose you want to change the app to instead set a “deleted” field on the deleted object and leave the pointers intact, checking for deletion each time you traverse a pointer to an object. This would require a schema change (adding some kind of “deleted” field to the object). Here’s just one possible way you could organize your work:

      1. Run a migration to add the “deleted” field on all deletable objects, which defaults to false and is never set. Ship it!
      2. Change existing code to null out pointers only if a new “enable_deletion” flag is NOT set, but set the “deleted” field in all cases. Ship it!
      3. Add some new code, behind the “enable_deletion” flag, to handle deleted objects explicitly instead of relying on pointers to them being null. Ship it!
      4. Turn on the flag!
      5. At some point, clean up all the code that would be run if the flag is not set.

      This strategy lets you test and ship your code at various stages in the process, and it covers a migration both in the schema and in the way the application interprets that schema.
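Steps 1 through 4 above can be sketched in a few lines. This is an illustration only: the in-memory `store`, the `taskIds` pointer list, and the `deleteTask` and `visibleTasks` functions are hypothetical stand-ins for a real datastore and application code.

```javascript
// Hypothetical sketch of steps 1-4; the store layout and names are assumptions.
const Flags = { enable_deletion: false };

// Step 1: every deletable object already carries a "deleted" field (default false).
const store = {
  project: { id: "project", deleted: false, taskIds: ["t1", "t2"] },
  t1: { id: "t1", deleted: false },
  t2: { id: "t2", deleted: false }
};

// Step 2: always set the field; null out pointers only when the flag is NOT set.
function deleteTask(taskId) {
  store[taskId].deleted = true;
  if (!Flags.enable_deletion) {
    const project = store.project;
    project.taskIds = project.taskIds.map((id) => (id === taskId ? null : id));
  }
}

// Step 3: traversal still skips nulled pointers (old behavior) and, behind the
// flag, also skips objects whose "deleted" field is set (new behavior).
function visibleTasks() {
  return store.project.taskIds
    .filter((id) => id !== null)
    .map((id) => store[id])
    .filter((task) => !(Flags.enable_deletion && task.deleted));
}

deleteTask("t1");                              // flag off: pointer nulled, field set
console.log(visibleTasks().map((t) => t.id));  // only t2 remains visible

Flags.enable_deletion = true;                  // Step 4: turn on the flag
deleteTask("t2");                              // flag on: pointer kept, field set
console.log(store.project.taskIds);            // pointer to t2 is still present
console.log(visibleTasks().map((t) => t.id));  // nothing remains visible
```

One caveat of this particular sketch: objects deleted while the flag is on keep their pointers, so they would become visible again if the flag were turned back off. Whether that matters depends on how long the flag is expected to stay switchable.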

      Does that help answer your question?

  3. William Johnson

    Why not use a pre-processing parsing system to implement flags? What I mean is, instead of having coded if statements, could you define comment-style meta-tags that serve the same purpose (similar to pre-processor directives in C and C++)?

  4. William Johnson

    By the way, I do realize that a metadata-comment-based solution (similar to comment-style unit tests in Python) is not as dynamic, because it requires recompiling in statically compiled languages or re-invoking the interpreter in scripting languages. Asana would also have to build a parsing system on top of a particular language, but looking at the code above, it looks as if “FlagSystem” is a library or package that Asana provides, so providing an interpreter or compiler hook doesn’t seem far-fetched.

    I realize my idea is less dynamic, since enabling a flag would cost a restart. In exchange, it would make for much cleaner, more readable, and more manageable code (for example, code marked with flag tags could be stripped automatically). Also, other than in automated testing, I assume that flags aren’t enabled or disabled very often.

  5. Greg Slovacek, Asana Team Member

    Interesting idea, William. We do like to enable / disable some flags at runtime during unit tests, for example. We could write some of those tests differently such that the code only runs when the flag is set to a desired value, but then we’d have to make sure that our test harness always ran tests with the flags set in that particular way. It seems far clearer, when possible, for the test to just set the value(s) it wants.

    Other benefits of using the runtime itself are that you don’t need a parser (as you point out), and you can freely intermingle flag logic with other application logic without cumbersome syntax.
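As an illustration of a test setting the flag values it wants at runtime, here is a minimal sketch. The `withFlags` helper is hypothetical (it is not described in the post), and `Flags` is a plain object standing in for a real flag system. The helper applies overrides, runs the test body, and restores the previous values even if the body throws.

```javascript
// Hypothetical test helper; Flags is a plain object standing in for a real flag system.
const Flags = { enable_attachments: false };

// Run fn with the given flag overrides applied, then restore the old values.
function withFlags(overrides, fn) {
  const names = Object.keys(overrides);
  // Validate first so a bad name cannot leave flags half-applied.
  for (const name of names) {
    if (!(name in Flags)) {
      throw new Error("Unknown flag: " + name);
    }
  }
  const saved = {};
  for (const name of names) {
    saved[name] = Flags[name];
    Flags[name] = overrides[name];
  }
  try {
    return fn();
  } finally {
    Object.assign(Flags, saved); // restore even when fn throws
  }
}

// Code under test (made up for this example).
function sectionCount() {
  return Flags.enable_attachments ? 3 : 2;
}

const countWithFeature = withFlags({ enable_attachments: true }, sectionCount);
console.log(countWithFeature, sectionCount()); // 3 inside the override, 2 after it
```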

  6. William Johnson

    Greg,

    Yes, I understand your point. I read the link above on Flickr’s Feature Flags and Feature Flippers. It seems as if Asana’s flags combine what Flickr does with Feature Flags and Feature Flippers into a single concept.

    I do agree with the run-time benefits of your solution. I guess I was thinking of your solution as two separate concepts, similar to the distinction Flickr makes. What I mean is that flags can either serve the purpose of rapid development of features or prototypes, or be an actual application feature for turning certain modules or features on and off. I could be wrong, but it seems Asana uses flags for both purposes.

    My idea was solely to address the use of flags (Flickr’s Feature Flags) for rapid development of features or prototypes, where an automated, metadata-driven system would drive the delivery or removal of features based on a flag. I was not addressing the dual use of flags, or even the benefits of run-time flags for testing.

    Long story short, I was just thinking out loud to you great developers.

    Thanks,

    William
