Security > Stability > Performance > Functionality

When I was interviewing at Asana, I had the question that many security practitioners have: does Asana really care about security? For many of us, this isn’t an appeal to some abstract ideal, but rather a question about our daily work. It’s hard to show up every day when your work isn’t appreciated, or when you’re overworked because the powers that be think your department is a cost center rather than a key aspect of the business.

The answer I got during my interview surprised me: we have detailed guidance in our highest level strategy documents that explicitly values security first amongst stability, performance, and functionality. This means that if a team is trying to ship a new Asana product feature that isn’t secure, stable, or performant, it likely won’t ship until it is.

What follows is a very lightly edited version of our real internal guidance on how to think about security, stability, and performance tradeoffs when building Asana. We think this strikes the right balance, and we’re proud to share it with you.

The framework

When prioritizing between different pieces of work, you should apply our product prioritization principles, which include prioritizing Security > Stability > Performance > Functionality.

This is because:

  • Without Security and Stability, Asana loses trust, which is difficult to regain and could cause folks to stop using Asana as a whole. All things being equal, it’s more difficult to regain trust after a security incident, so security should be prioritized first.
  • Without Performance, your feature will be hard to adopt and folks won’t use it. 
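
As a minimal sketch of how this ordering could act as a tie-breaker, the Python below sorts work items by severity first and uses the Security > Stability > Performance > Functionality order only among items of comparable severity. This is one possible reading of the framework, not Asana's actual tooling: the names `Category`, `WorkItem`, and `default_order`, and the numeric severity scale, are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Category(IntEnum):
    # Lower value sorts first: Security > Stability > Performance > Functionality.
    SECURITY = 0
    STABILITY = 1
    PERFORMANCE = 2
    FUNCTIONALITY = 3


@dataclass(frozen=True)
class WorkItem:
    title: str
    category: Category
    severity: int  # hypothetical scale: 1 is most severe (think "P1")


def default_order(items):
    """Order items by severity, breaking ties with the category ordering.

    Assumption: the category order only decides between items of comparable
    severity ("apples to apples"); it is a starting point for discussion,
    not a replacement for judgment.
    """
    return sorted(items, key=lambda item: (item.severity, item.category))
```

With three severity-1 items, one each in Security, Performance, and Functionality, `default_order` would list the security item first and the functionality item last, while a severity-2 stability item would sort after all three.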

When should I use this framework?

While anyone can use this framework, the intended audience is leaders who are making decisions about prioritization. We intend this framework to be used:

  • In day-to-day work as bugs are discovered, and
  • When doing sprint and/or project planning, especially in spec review and during design doc iteration. 

When applying this framework, note that any action carries some risk. For example, any feature we launch in the API could have a scaling bug we missed, causing stability problems. Eliminating all risk is impossible, and even eliminating almost all risk would require sizable trade-offs against velocity.

This framework is most useful when you’re comparing apples to apples across different aspects. If an important customer asks for a new feature, but that feature introduces a P1 security bug, it’s probably more important to mitigate the security risk than it is to launch the feature.

This doesn’t just apply during project planning, but also in day-to-day work. If your team is required to help in an incident, consider whether the incident’s security, stability, or performance impact is more important than your scheduled project work.

When shouldn’t I use this framework?

When you’re thinking about prioritization, the answer is almost always: it depends. For example:

  • How big a stability risk are we talking about? 
  • How long will that security bug remain open? 
  • How popular is the feature that will have a performance degradation?
  • How long will it really take to fix? (including new tech debt)
  • Are there other critical factors, like data loss, problems with payment, or very public problems?

As pointed out above, this framework is most useful when we’re talking about apples to apples comparisons on some very particular axes, but often that’s not the case. Ultimately we’re trying to drive value for our customers, and unfortunately that’s a really hard problem. Any prioritization framework, including this one, shouldn’t be applied blindly – if the result of prioritization isn’t “more customer value” then something somewhere has gone wrong.

Why this framework, and how do I use it?

Security and Stability are about trust

To see why, consider what would happen if someone exploited a bug in Asana that leaked customer data, or if we had severe downtime. Customers would reconsider whether Asana is a tool they can trust with their important work data and as part of their company’s infrastructure, and they might move to another tool as a result.

We want Asana to be very secure. As a result, consistently earning and growing customer trust is a top-level objective for the company, with security and stability forming a large part of that. We place security above stability because it’s harder to regain trust after a major security incident than after downtime [1], but we think the two are rarely in conflict.

When thinking about security and stability during project planning, consider, for example:

  1. The risk. How likely is it that this will cause a security incident or downtime? While you want to weigh these risks heavily, we can’t reduce this risk down to zero. Instead, make sure that you’re engaging your partners in infrastructure and security, using them to help evaluate the risk, and making sure your actions don’t unnecessarily open Asana to security incidents or downtime. 
  2. How this works “at scale”. This includes especially thinking about our largest customers: if a feature won’t work for them, we lose enterprise trust.
  3. The worst case behavior. What happens if someone uses this feature through the API in a heavy way? What happens if someone doesn’t supply the input you expect?
  4. What happens if something unexpected goes wrong. How will you detect if your feature starts causing stability issues? How will you detect if someone starts abusing your feature?

Performance means folks can use a feature

If a feature is not performant, folks can’t adopt it, especially our largest customers. Your feature can be awesome, but, if it can’t be used, its awesomeness is moot. This is less important than making sure we retain customer trust, but more important than launching new functionality, as without performance our existing functionality won’t be used.

Some aspects of usability to consider (not exhaustive):

  1. Typical / exceptional performance. Consider both the median case (p50) and the users for whom this might be slowest (p95/p99).
  2. Performance for our Enterprise domains: How will your feature scale for a project with tens (or hundreds) of thousands of tasks, or in a domain with hundreds of thousands of users? What about millions?
  3. Overall app performance: If your feature is fast, but it causes page load to be slow, that’s not good for the product overall.
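
To make the p50/p95/p99 distinction concrete, here is a minimal sketch in Python that summarizes a list of latency samples using the nearest-rank percentile method. The function names `percentile` and `latency_summary` are hypothetical, and nearest-rank is just one common percentile definition, chosen here for simplicity; monitoring tools may interpolate differently.

```python
import math


def percentile(samples, p):
    """Return the p-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers p percent of the samples.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def latency_summary(samples_ms):
    """Summarize latencies (in milliseconds) at the median and the tail."""
    return {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}
```

For example, with 98 requests at 10 ms plus one at 2000 ms and one at 5000 ms, the median and p95 are both 10 ms while p99 is 2000 ms, which is why looking only at the typical case can hide the experience of the slowest users.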

New functionality uses the above as a foundation

Obviously we shouldn’t spend all our time on security, stability, or performance to the detriment of improving the product. We might decide, for example, to build out a feature that is not as performant at launch to solve an important enterprise need in the short term. In general, we think that this framework helps frame these non-functional requirements in the right order to maximize user value, but it’s up to teams to balance these.

How Asana has applied this framework

There are two main ways we use this framework: actively and passively.

The most obvious way is when we actively invoke this framework when making decisions. We’re all reasonable people and we usually make good, informed choices that maximally benefit Asana. We try hard not to focus only on a narrow outcome like a single launch or an abstract notion of security. But in situations where we can’t come to an agreement, we can use the framework to break the tie. This is pretty rare!

Most importantly, having this framework at all passively communicates our agreed-upon prioritization to all Asanas. It seeps into every team’s roadmap a little bit. Every engineer thinks about it a little bit. Every PM accounts for this time a little bit. All of those little bits add up to a lot more security. This is the biggest impact of this framework.

What do you think?

So there you have it, a peek behind the curtain of how Asana prioritizes security, stability, and performance. It’s helped Asana do the right thing in many cases where the tradeoffs were large and the outcomes unclear.

Footnotes

[1] Some rationale for this: it’s easy to accumulate risk by having bad security hygiene, by, for example, believing your change won’t be the thing that causes an incident. But as this risk accumulates, the chance of it being discovered and exploited does too. When it’s exploited it’s much worse than e.g. a bad crash: we can’t just roll back code and apologize to users, we have to notify them that their data may have been accessed.