Feature Flagging

A proposal to establish a service or tool for toggling boolean flags (true/false) that can be assigned rules and strategies, in order to hide and show UI elements in staging, demo and production environments without redeployment.

This RFC discusses the issues with the current approach, the requirements that a feature flagging solution needs to meet, and potential solutions.

Problem Description

Current feature flagging mechanisms such as Django Waffle have shown their limitations: enabling UI features according to valid business rules is not possible without a custom implementation.

With Django Waffle, custom and complex strategies have to be implemented in code, which costs development and testing time. It also increases technical debt and logic complexity, which can lead to unpredictable results for existing and new flags as more flags and logic are added.

Setting up custom and complex strategies requires redeploying services and packages, as the logic behind the feature flagging needs to be included within the code base. This prevents active exclusion or inclusion of features in real-time.

Use cases:

  • Enable feature A for contacts where the client was activated after 01.01.2022 who speak a specified language and are not in a list of defined client-identifiers
  • Enable feature A for all clients from a specified country who speak a specified language
  • Enable feature B for 50% of all Spanish clients if their language is configured to Spanish or English
  • Customer X has been struggling with a new UI feature that is being trialed, and has requested to have the previous UI feature returned.
  • Test a new feature A against feature B to see if it improves the user experience, activating it for only 20% of the user base, with the other 80% being the control group
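The use cases above combine attribute conditions, exclusion lists and percentage rollouts. As a rough illustration of what such a rule involves, the following minimal Python sketch evaluates one targeting rule against a user's traits. All names (Traits, Rule, bucket, is_enabled) are hypothetical and do not come from Django Waffle or any of the candidate tools; the hash-based bucketing is one common way to keep a percentage rollout stable per client, assumed here for illustration only.

```python
import hashlib
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Traits:
    """User attributes sent for evaluation (field names are illustrative)."""
    client_id: str
    language: str
    country: str
    activated_on: date


@dataclass(frozen=True)
class Rule:
    """One targeting rule; conditions left at their defaults are ignored."""
    languages: frozenset = frozenset()
    countries: frozenset = frozenset()
    activated_after: Optional[date] = None
    excluded_clients: frozenset = frozenset()
    rollout_percent: int = 100  # share of matching clients that get the feature


def bucket(flag_name: str, client_id: str) -> int:
    """Map a client to a stable bucket in 0..99 so the rollout is sticky per flag."""
    digest = hashlib.sha256(f"{flag_name}:{client_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, rule: Rule, traits: Traits) -> bool:
    """Return True only if every configured condition matches the traits."""
    if traits.client_id in rule.excluded_clients:
        return False
    if rule.languages and traits.language not in rule.languages:
        return False
    if rule.countries and traits.country not in rule.countries:
        return False
    if rule.activated_after and traits.activated_on <= rule.activated_after:
        return False
    return bucket(flag_name, traits.client_id) < rule.rollout_percent
```

With this shape, the first use case would be a Rule with languages, activated_after and excluded_clients set, and the A/B test would be a Rule with rollout_percent=20. In the proposed solution these rules would live in the flagging service rather than in application code, so changing them would not require a deployment.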

Requirements:

  1. Ability to toggle flags on/off to show/hide/switch UI features or components without redeploying the flagging service or the affected service.

    • Use case: In convert and pay, we want to temporarily show a message to a small group of users in Brazil over the weekend. This needs to be scheduled to switch on and off at a particular date and time.
  2. Ability to assign groups of users to flags based on user and environment attributes, e.g.: client-identifier, contact-identifier, dealer, country, language, client activation date, locale

    • Use case: A set of users have indicated they would like to see a new feature that assists them in creating a new beneficiary. By assigning their client-identifiers and dealer to the flag, we can show the feature to them.
  3. Ability to manage a limited rollout based on complex strategies

    • Use case: A new beneficiary section to add, edit and delete beneficiaries is being rolled out, but support for certain countries and languages is not yet complete. We do not want to hold back the rollout, so we assign users based on country and language, while excluding certain users who have indicated they still need certain features of the previous version.
  4. Client library that integrates with the languages used in the current code base, e.g.: JavaScript, Python, TypeScript

    • Use case: The EBO frontend runs on Vue.js version 2 and will move to Vue.js version 3 in the future, so support for Vue with JavaScript is important for implementation. Python is also important, as it is the backend for EBO and flags may be needed there in the future.
  5. User-friendliness of the service: the UI should be intuitive, to reduce training time and confusion about using, creating and editing flags. (Support should be able to toggle flags in production & Ebury Mass Payments environments)

    • Use case: A user no longer wants to be in a testing group for a new feature and has asked to be removed from it. Support receives the request and can easily navigate to the flag and add the user's client-identifier to the exclusion list.
  6. Permissions for different environments (ONL should be in full control of staging, but prod/demo should be accessed by support)

    • Use case: A flag shows a new date picker that is being tested and does not yet support German date and time formats. We want to ensure this flag does not get toggled by accident in production, so only a limited set of users can access the flag.
  7. Audit logging for flag changes (security requirement)

    • Use case: It was discovered that a flag was enabled that showed a description for a feature still in development, confusing users. Using the audit log, we can see why and when the flag was changed and understand the root cause.
  8. Feature staleness reporting

    • Use case: An alert has indicated that a certain feature flag has not been used for a few weeks, so we should investigate the flag and its use.
  9. Minimal impact on the performance of the application it is integrated or used in. The service needs to be optimised so that it does not impact resources or affect the latency of calls.

    • Use case: If the feature flagging service is sluggish and slows down responses, the user may become frustrated and leave the application, or it may have other unforeseen consequences. A fast, reliable service is therefore needed.
  10. Caching of flags (frontend and backend) to improve performance and to provide reliability if the flagging service is unavailable. This cache should be invalidated by a mechanism when flags are updated.

    • Use case: Using a cache on the frontend and the backend will improve the performance of calls and add reliability: if the flagging service becomes unavailable, the flags will still be available for the end user.
  11. A flag should not restrict normal operation or the functionality expected by the end user. If the flag is unavailable, the user should still be able to do what they intended to do.

    • Use case: A user who expects to convert currency on EBO should still be able to do so, irrespective of the flag status or whether the flag is unavailable.
  12. The main application should not be aware of the current flagging configuration. If the flagging service returns no data or is unavailable, the flags should be disabled by default.

    • Use case: A feature that shows a new date picker is behind a feature flag. If the flagging service is unavailable, no assumptions should be made: the feature should be disabled and the original date picker should be available, if one exists.
  13. SSO with Google to comply with Security and RBAC.
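Requirements 10 to 12 together describe a client that caches flags, can be invalidated when flags change, and treats unknown flags as disabled. The following Python sketch shows one way this could look, under stated assumptions: CachedFlags is a hypothetical wrapper, the fetch callable stands in for whatever call the chosen tool's client library provides, and the time-to-live is an arbitrary example value.

```python
import time
from typing import Callable, Dict, Optional


class CachedFlags:
    """Caches flag values and falls back to 'disabled' when nothing is known."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, bool]],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # hypothetical call to the flagging service
        self._ttl = ttl_seconds
        self._clock = clock
        self._flags: Dict[str, bool] = {}
        self._fetched_at: Optional[float] = None

    def invalidate(self) -> None:
        """Force a refresh on the next lookup, e.g. after a flag was updated."""
        self._fetched_at = None

    def _refresh_if_stale(self) -> None:
        now = self._clock()
        if self._fetched_at is not None and now - self._fetched_at < self._ttl:
            return  # cache is still fresh
        try:
            self._flags = dict(self._fetch())
            self._fetched_at = now
        except Exception:
            # Service unavailable: keep the last known values (possibly none)
            # and try again on the next lookup.
            pass

    def is_enabled(self, name: str) -> bool:
        """Unknown or unavailable flags are reported as disabled."""
        self._refresh_if_stale()
        return self._flags.get(name, False)
```

In this sketch the application only ever asks is_enabled("new_date_picker") and shows the original date picker when the answer is False, so an outage of the flagging service never blocks the user. A production version would likely add a back-off so that a failing service is not retried on every lookup.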

Nice to have:

  • Slack integration to allow for alerts when a flag is toggled; alternatively, this could be done through email.
  • Open source licence
  • Easy to maintain: only limited development time should be needed to ensure the execution of flags and the maintenance of the service.

Background

What is a feature flag

A feature flag is defined as an immediate boolean toggle that can be assigned a set of conditions; when toggled, it enables or disables a feature for the subgroup that matches those conditions.

It differs from settings, constants and environment variables, as changing those requires a redeployment of the affected services. It also differs from permissions: permissions are assigned to roles based on the factors that define a role, whereas a feature flag can span multiple roles or target a very small subset within a role.

Django Waffle

What is Django Waffle?

Waffle is a feature flipper for Django. With it you can define the conditions for which a flag should be active, and use it in a number of ways.

On demo and production environments, access to Django Waffle is limited to the support team, who enable and disable feature flags as required.

Limitations of Django Waffle

Waffle requires custom logic to define strategies, and EBO must be redeployed whenever these strategies need to change.

Notable limitations include:

  1. Custom code must be written to implement flags, including tests for the flags
  2. Flags share the same attributes and logic in determining whether a flag is active or not
    • For example, a flag that is only interested in a viewer's country and not their language will still have language as an attribute that an admin user can set and that the flag will validate against.
  3. Audit logging lacks detail: it records only which attribute was changed, not the values it was changed from/to, making it unhelpful in the case of a rollback.
  4. No statistics on how often a flag is triggered, reducing the ability to track A/B testing or general use of a new feature that is being tested.
  5. Only usable in Django-based environments

Comparison to permissions

Whilst similar in action, the distinction is the domain they control. Feature flags relate to features, whereas user permissions relate to security.

When are feature flags currently used

Feature flags are short-lived: they control the rollout of new features to certain demographics and user groups, and once the feature is considered completely rolled out, the flag is removed.

Solution

The proposed solution is a Request for Proposal (RFP) for a feature flagging tool/solution that can be used by multiple channels to support the business.

Tools/solutions that will be considered in the RFP:

  1. Flagsmith
  2. FeatureHub
  3. Unleash
  4. Flagr

Alternatives

One alternative is the internal development of a feature flagging solution that can support the business. This increases developer costs for developing, supporting and maintaining the service.

A second alternative is to continue with Django Waffle, adding custom implementation and features to meet the requirements of the business. However, this has various pitfalls, including:

  • Development time required to implement custom strategies and features required
  • Maintaining existing flags and strategies that have been written.
  • Accepting that deployments will be necessary for new strategies to be implemented
  • Lack of audit and oversight of features being toggled and used
  • Limited reach across the business as it is based on Django.

Caveats

Use of a third party service provider has its trade-offs:

  1. Monthly or annual costs
  2. The risk that the service is no longer provided and becomes unavailable
  3. Changes to the service that no longer align with our needs and requirements, resulting in a new service provider needing to be found

For example, the cost and features of a provider such as Flagsmith are:

Flagsmith Enterprise Self-Hosted License: $12,000/year

  • Up to 20 Users
  • Enterprise On-Boarding, Quarterly Upgrade Support and General Chat and Email Support
  • 12 Hour Response SLA

Operation

This will be run on-premise using Kubernetes for staging and production. For demo & EMP environments, the potential solution/tool can be run as a regular Docker container. Running it ourselves ensures that we can keep the service available without depending on a third-party hosted solution that may become unavailable.

The ONL team will be in charge of operating and maintaining it, with roles granted to SPP to allow them to control the impact of flags in demo and production environments in order to support our clients.

The service will only be available internally and not available to the public.

Security Impact

Authentication and authorization

Access to the service can be restricted to a set group of users with SAML/SSO and role-based permissions. The service can enforce a 4-eyes flag approval to ensure that flags and their usage are double-checked. Audit logs are also available to keep track of every action the team makes. Access is to be provided through the relevant channels.

API and Tokens

The service communicates with the client side using a token key. This token is used when sending the attributes of the user session that are deemed important for creating the segmentation needed by the flags.

User Data

Data that is stored is symbolic, to allow segmentation of users for feature flags; this data can be as extensive or as restrictive as required. Personally identifying data would not be useful for flags due to its specificity, so the risk of such data being used is greatly reduced. Data that would be included are traits such as client-identifier, contact-identifier, dealer, country, language, client activation date and locale. This data would be stored on premise in a database that only we control and have access to.

Parsing of user-data

Data that is sent to the feature flagging service is used to determine whether it matches the requirements for a flag to be set; the service then returns a boolean value with that outcome. The data is not parsed or checked on the client side.

Unused and old flags

Old flags that are no longer used should be removed both from the code and from the flagging service. Old flags pose a risk: they can be repurposed for a newer or similar feature, and toggling them can then have side effects such as disabling existing features or enabling features that should not be changed.
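Finding such flags is what the staleness reporting requirement is meant to support. As a minimal sketch, assuming the flagging service can report when each flag was last evaluated (the stale_flags helper, its input format and the two-week threshold are all hypothetical), a report could be produced like this:

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def stale_flags(
    last_evaluated: Dict[str, Optional[datetime]],
    now: datetime,
    max_idle: timedelta = timedelta(weeks=2),
) -> List[str]:
    """Return the names of flags never evaluated or idle longer than max_idle."""
    return sorted(
        name
        for name, seen in last_evaluated.items()
        if seen is None or now - seen > max_idle
    )
```

The resulting list would be a starting point for investigation rather than an automatic deletion list, since a flag may be idle for legitimate reasons.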

Performance Impact

As this service is used to determine whether a user can see or use a particular feature, it can be called multiple times in a user's session, especially if the user's key traits that are deemed important for segmentation, such as country or language, are changing.

However, if the user's key traits do not change, the service needs to be called only once per session to determine the relevant flags. These can then be used across the application where needed, minimising the performance impact and avoiding needless API calls.
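The behaviour described above (fetch once per session, fetch again only when a key trait changes) can be sketched as follows. SessionFlags is a hypothetical helper, and the fetch callable is an assumed stand-in for the client library call of whichever tool is selected.

```python
from typing import Callable, Dict, Mapping, Optional, Tuple

Traits = Mapping[str, str]


class SessionFlags:
    """Holds the flags for one user session (hypothetical helper)."""

    def __init__(self, fetch: Callable[[Traits], Dict[str, bool]]) -> None:
        self._fetch = fetch  # stand-in for the call to the flagging service
        self._traits_key: Optional[Tuple[Tuple[str, str], ...]] = None
        self._flags: Dict[str, bool] = {}

    def is_enabled(self, name: str, traits: Traits) -> bool:
        key = tuple(sorted(traits.items()))
        if key != self._traits_key:
            # First lookup in the session, or a key trait changed: ask again.
            try:
                self._flags = dict(self._fetch(traits))
                self._traits_key = key
            except Exception:
                # Service unavailable: treat every flag as disabled, retry later.
                self._flags = {}
                self._traits_key = None
        return self._flags.get(name, False)
```

As long as the traits stay the same, every lookup after the first is answered from memory, so the number of API calls per session depends on how often traits such as country or language change rather than on how many flags are checked.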

As the service uses its own database no interference with other databases will occur.

Developer Impact

Developers would be required to set up the initial integration with the flagging service within the application, where they set the user and their traits, and retrieve and store the flags the user has access to. Once the flags have been retrieved, they would be checked with simple logic wherever a feature needs to be flagged.

Keeping track of where flags are used, and removing older unused flags, would be important to minimise side effects and technical debt.

References:

Django Waffle https://waffle.readthedocs.io/en/stable/types/flag.html

Flagsmith

FeatureHub

Unleash

Flagr