Routing Rules Refactor

This RFC gives an overview of the design decisions behind a new service for routing our payments, which extracts this logic from FXS.

Problem Description

The scope is to reproduce the current routing rules (MVP) and to improve our routing strategy in the medium to long term, taking into account the integration of the Ebury Mass Payments (EMP, also known as FrontierPay) flow.

This service must be able to process a high volume of transactions to support Ebury's growth and the Ebury Mass Payments (EMP, also known as FrontierPay) integration.

We need a single source of truth for routing payments: a scalable, observable and robust service to manage payment routing rules at Ebury.

Definitions

A payment scheme is a set of rules, agreed upon with a regulator, for executing transactions through a specific payment instrument.

A payment instrument is an (electronic) instrument with which end users of payment systems transfer funds between accounts at financial institutions or payment service providers (PSPs).

A payment system is a (market) infrastructure that processes transactions in line with the rules defined in a payment scheme. This includes the institutions, instruments, people, rules, procedures, standards, and technologies that make the exchange possible.

A bank scheme intermediary is a financial institution through which a payment service provider (PSP) or another financial institution gets access to the payment system. In the payment industry, the bank scheme intermediary is also known as the PSP or sponsor.

A payment router is a service that uses one or more metrics to determine the optimal path along which payment traffic should be forwarded. In the payment industry, this process is called payment way determination (PWD).

A payment route is the path taken by a payment instrument to be released through a certain bank scheme intermediary.

An automatic payment is a payment where the payment route can be determined by automated logic, the source account can be selected automatically, and the payment message can be created and released automatically to the bank scheme intermediary.

A semi-automatic payment is a payment where the payment route can be determined by automated logic, the source account can be selected automatically, and the payment message can be created automatically in a folder but cannot be released automatically to the bank scheme intermediary. The payment message still needs to be pushed/downloaded manually by a user to the bank intermediary platform.

A manual payment is a payment for which each step of the execution process remains under the control of a user: selecting the payment route and the source account, and instructing the payment via a banking platform, all require manual action.

Ebury Entity is the Ebury entity to which the client belongs from a regulatory and T&Cs perspective.

Ebury Branch is an extension of an Ebury Entity with its own regulatory requirements.

Source Account means the settlement account in Ebury's name that we have designated as the account from which funds are used for a fund transfer under a bank scheme intermediary. It is also known as a settlement account: an account containing money and/or assets, held with a central bank, central securities depository, central counterparty or any other institution acting as a settlement agent, which is used to settle transactions between participants or members of a commercial payment system.

Background

When a payment in our system reaches maturity and fulfils all the conditions to be released, it is sent to FXS. At this point, FXS runs a routing process that decides which payment instrument and bank scheme intermediary will release the payment. The idea is to extract this logic from FXS into a separate service.

We currently have 3 possibilities to route payments in our system:

  • SCT (disabled)
  • Faster Payment (FPS)
  • Swift

Each payment instrument can have more than one bank scheme intermediary for releasing a payment. An SCT payment could be released via Arkea (disabled), ABN or Santander (not implemented yet). For Faster Payments (FPS), we use Barclays UK as the bank scheme intermediary. Once the routing selects the payment instrument, the source account received from BOS is overridden by the one used for that payment instrument.

For Swift payments, the system looks up the bank scheme intermediary in a separate routing table, which in FXS is called PaymentSchemeIntermediary. If the system finds a source account in this table for the payment currency, the beneficiary bank country and the Ebury Entity received, the bank scheme intermediary is set to Citibank UK, Barclays DE or Barclays UK accordingly, and the source account received from BOS is overridden with the one in this table. If the system cannot find a match, the bank scheme intermediary is Barclays UK and the payment is released via our old flow.
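As an illustration only, the Swift lookup described above could be sketched as follows. The row fields, the sample rows and the function name are hypothetical and are not the real PaymentSchemeIntermediary schema. The sketch also assumes that, when no match is found, the source account received from BOS is kept; the text above does not state this.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class IntermediaryRow:
    """Hypothetical shape of one PaymentSchemeIntermediary row."""
    currency: str
    beneficiary_bank_country: str
    ebury_entity: str
    intermediary: str
    source_account: str


# Illustrative rows only; real data lives in the FXS routing table.
TABLE: List[IntermediaryRow] = [
    IntermediaryRow("USD", "US", "EBURY_UK", "Citibank UK", "ACC-CITI-USD"),
    IntermediaryRow("EUR", "DE", "EBURY_BE", "Barclays DE", "ACC-BARC-DE-EUR"),
]

DEFAULT_INTERMEDIARY = "Barclays UK"


def select_swift_route(
    currency: str, country: str, entity: str, bos_source_account: str
) -> Tuple[str, str, bool]:
    """Return (intermediary, source_account, uses_old_flow)."""
    for row in TABLE:
        key = (row.currency, row.beneficiary_bank_country, row.ebury_entity)
        if key == (currency, country, entity):
            # Match: the source account from BOS is overridden by the table's.
            return row.intermediary, row.source_account, False
    # No match: fall back to Barclays UK and the old flow.
    # Keeping the BOS source account here is an assumption of this sketch.
    return DEFAULT_INTERMEDIARY, bos_source_account, True
```

Merging this lookup with the payment instrument selection, as proposed in this RFC, would amount to adding the payment instrument to the same rule rows.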

We need to merge these two processes (payment instrument selection and bank scheme intermediary selection) into one. Our new routing service will return this information in a single step. The new service should also allow the operations team to configure the routing rules on demand.

For FPS and SCT payments, apart from the routing rules, we need to check which payment instruments the beneficiary allows for releasing the payment. To do this, we call Apply Financial (a third party). We would like to include all our routing rules in the rules engine, but we cannot do so with the Apply Financial information. We already have a cache in FXS with the payment instruments allowed per BIC or per country and bank ID. This cache does not contain every BIC or country and bank ID combination, so we need to call Apply Financial for new accounts. The information can also change, so cache entries expire and must be renewed. We still need to analyze whether we could obtain a copy of the Apply Financial database and create a new Static Data Service with this information, but for now we need to extract this logic into a separate step.
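The cache behaviour described above (call the third party on a miss, renew entries once they expire) could look roughly like the sketch below. The class name, the `fetch` callable standing in for the Apply Financial call, and the injectable clock are all assumptions made for illustration; this is not the existing FXS code.

```python
import time
from typing import Callable, Dict, FrozenSet, Hashable, Iterable, Tuple


class InstrumentCache:
    """Caches allowed payment instruments per key (BIC, or country + bank ID)."""

    def __init__(
        self,
        fetch: Callable[[Hashable], Iterable[str]],  # hypothetical third-party call
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry time, allowed instruments)
        self._entries: Dict[Hashable, Tuple[float, FrozenSet[str]]] = {}

    def allowed_instruments(self, key: Hashable) -> FrozenSet[str]:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit: no third-party request
        # Miss or expired entry: ask the third party and store the answer.
        instruments = frozenset(self._fetch(key))
        self._entries[key] = (now + self._ttl, instruments)
        return instruments
```

Injecting the clock keeps the expiry logic testable without waiting for real time to pass.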

Today, Ebury's core business has different connections in place for routing payments. We already know that Ebury Mass Payments has, and needs, different capabilities.

Because this approach needs to be configurable, the operations team needs an administration console in which to configure the payment routes. Payment routing configuration is sensitive, as it can have a big impact on the execution of our payments and also on our cash management.

Solution

We would like to build this new process following the new Ebury architecture vision.

As we need to cover Mass Payment in this new routing process, we would like to extract the routing logic from FXS.

The following image shows the proposed solution:

L3_ROUTING_P1

The diagram shows 3 different flows: the first covers the creation and update of payment routing rules (A.1-A.3), the second covers how we will route our payments (B.1-B.6), and the last covers re-routing payments (C.1-C.11).

Routing Rules Console (A.1-A.3)

ROUTING_RULES_A1_A3

Because this approach needs to be configurable, we need an administration console in which to configure the payment routes. The routing rules will be based on the payment information received (currency, beneficiary bank country, Ebury entity, purpose code, etc.). They need to define a preferred order among routes, because a payment might meet all the criteria for several routes (e.g. a GBP payment could be released via FPS or via Swift through Barclays UK):

ROUTING_RULES

This routing service should allow the operations team to enable/disable a payment instrument or bank scheme intermediary and to define settings such as the daily max amount for a specific payment instrument:

ROUTING_RULES

The routing rules defined will select the payment instrument and the bank scheme intermediary for releasing a payment.
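A minimal sketch of what such a rule could look like, assuming a flat rule record with a numeric priority. The field names and the `candidate_routes` helper are hypothetical, and a real rule would match on more criteria than the currency.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional


@dataclass(frozen=True)
class RoutingRule:
    """Hypothetical routing rule; lower priority value means preferred route."""
    priority: int
    currency: str
    payment_instrument: str
    intermediary: str
    enabled: bool = True
    daily_max_amount: Optional[Decimal] = None  # None means no daily limit


def candidate_routes(rules: List[RoutingRule], currency: str) -> List[RoutingRule]:
    """Return the enabled rules matching the currency, in preferred order."""
    matching = [r for r in rules if r.enabled and r.currency == currency]
    return sorted(matching, key=lambda r: r.priority)


# Example from the text: a GBP payment can go via FPS or Swift (Barclays UK).
gbp_rules = [
    RoutingRule(2, "GBP", "Swift", "Barclays UK"),
    RoutingRule(1, "GBP", "FPS", "Barclays UK", daily_max_amount=Decimal("1000000")),
]
preferred = candidate_routes(gbp_rules, "GBP")[0]  # the FPS rule
```

Disabling a rule removes it from the candidates, so the next rule in priority order takes over.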

To build this new rules engine, we will create a new service called Payment Scheme Static Data Service. Users will access a BOS interface connected to this service to create and update routing rules (and source account information in a future project). Each time a change is saved in this service, an event will be published to our Event Bus.

Apply Financial Validation

As mentioned above, the system needs to validate the scheme of the selected beneficiary with Apply Financial. To do this, we will create a new Bank Scheme Gateway connected to Apply Financial to retrieve this information.

How to route a payment (B.1-B.6)

Because we need to extract the routing logic from FXS, we will create a new service, called Outgoing Payment Command Service, which will manage our outgoing payment flow in the future. It is a command-side service managing the full life cycle of a payment (creation, screening, pre-execution, routing, execution, reconciliation, etc.). It will include all business logic related to a payment.

This service will contain a database which will be updated whenever a routing rule changes in the Payment Scheme Static Data Service (it will be subscribed to the event described above). This database exists for the performance and data resilience of the routing process: if the Payment Scheme Static Data Service is down, the system will continue releasing payments in our platform.
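The local copy of the rules could be kept up to date along these lines. This is a sketch under assumptions: the event shape (`rule_id`, `version`, `rule`) is invented for illustration, and the per-rule version number is an assumption added so that duplicated or out-of-order events do not overwrite newer data.

```python
from typing import Any, Dict, Optional, Tuple


class RoutingRuleReplica:
    """In-memory stand-in for the Outgoing Payment Command Service database."""

    def __init__(self) -> None:
        # rule_id -> (version, rule payload)
        self._rules: Dict[str, Tuple[int, Dict[str, Any]]] = {}

    def apply(self, event: Dict[str, Any]) -> bool:
        """Apply a rule-changed event; return False if it is stale or a duplicate."""
        rule_id, version = event["rule_id"], event["version"]
        current = self._rules.get(rule_id)
        if current is not None and current[0] >= version:
            return False
        self._rules[rule_id] = (version, event["rule"])
        return True

    def get(self, rule_id: str) -> Optional[Dict[str, Any]]:
        """Read from the local copy only; the static data service is never called."""
        entry = self._rules.get(rule_id)
        return None if entry is None else entry[1]
```

Because routing reads only from this replica, an outage of the Payment Scheme Static Data Service delays rule updates but does not block routing.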

ROUTING_A_PAYMENT

When a payment is ready to be executed in BOS, BOS will publish an event (B.1) to our Event Bus. A process subscribed to this event (B.2) will retrieve the possible payment instruments and bank scheme intermediaries from the Outgoing Payment Command Service database (B.3) and the payment instruments allowed for the payment beneficiary from the Bank Account Validation Service (B.4). This process will use this information to decide the routing and will publish the result to our Event Bus (B.5).
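Steps B.3 to B.5 could be combined roughly as in the sketch below. The function and parameter names are hypothetical. Following the rest of this RFC, only FPS and SCT are checked against the beneficiary data; calling the validation service lazily, and at most once, is an assumption made to avoid unnecessary third-party requests.

```python
from typing import Callable, Iterable, List, Optional, Set, Tuple

Route = Tuple[str, str]  # (payment instrument, bank scheme intermediary)

# Instruments that require a check against the beneficiary's bank data.
VALIDATED_INSTRUMENTS = {"FPS", "SCT"}


def decide_route(
    candidates: List[Route],
    fetch_allowed: Callable[[], Iterable[str]],  # hypothetical call for step B.4
) -> Optional[Route]:
    """Pick the first candidate (already in priority order) the beneficiary allows."""
    allowed: Optional[Set[str]] = None
    for instrument, intermediary in candidates:
        if instrument in VALIDATED_INSTRUMENTS:
            if allowed is None:
                allowed = set(fetch_allowed())  # fetched only when needed
            if instrument not in allowed:
                continue  # beneficiary not reachable via this instrument
        return instrument, intermediary
    return None  # no route found
```

The returned pair is the payload that would be published in step B.5.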

A process subscribed in FXS (B.6-1) will check whether the selected payment instrument is an automatic one; if so, the payment will be created in FXS and will follow our current flow for sending payments to the different schemes.

A process subscribed to this event in BOS (B.6-2) will update the payment information (payment instrument and bank scheme intermediary) and, in the case of Mass Payments, the payment will be sent to a new manual queue. We are still waiting for the Mass Payments requirements before defining this new flow, but following our new Ebury architecture we will certainly need a new service (B.6-3) for generating manual "SIF" files (D.1), like the ones we will have for managing the different payment schemes:

PAYMENT_SCHEMES_GATEWAY

How to re-route a payment (C.1-C.11)

FXS currently has a capability that this RFC should also cover as part of the routing process. The operations team can switch off a payment instrument in the routing process, so that the routing process won't release payments to a disabled payment instrument. The operations team can also stop the sending process to individual payment instruments. This flow is only valid for the FPS and SCT payment instruments.

When an operations team member receives an email because the FPS account does not have enough balance for releasing FPS payments, they switch off the sending process in FXS (C.1):

PAYMENT_SCHEMES_FXS

If they cannot move more money to the FPS account, they will disable the scheme on the routing rules panel (C.2). Our new Payment Scheme Static Data Service will publish an event with this routing rule update (C.3), and the Outgoing Payment Command Service will consume this event and update its database (C.4):

RE_ROUTING_PAYMENT

The operations team member can then select the payments that have the FPS payment instrument selected but have not been executed, and pass them through the routing process again (C.5). FXS will then publish an event for each payment (C.6), and a process subscribed to this event in the Outgoing Payment Command Service will consume it (C.7). When the routing process has validated the payment data (C.8-C.9), an event will be published with the payment information required for releasing the payment (C.10). A process subscribed in FXS (C.11) will update the payment with the new routing information and will release the payment following our current flow. A process subscribed in BOS will update the payment information in that service.

We should take into account that, in phase 1, BOS will only publish an event when the payment ready to be executed is an automatic one. We first need to know the Mass Payments routing conditions and flow in order to verify whether this is the correct scenario, or whether we should include a pre-routing process in our Outgoing Payment Command Service that either sends payments to a manual queue or passes them through the routing rules previously defined. This is quite critical: we could select a payment to be executed via an automatic payment instrument and, after that payment instrument is switched off, the routing could decide that it is a manual payment, leaving a data inconsistency between BOS and FXS. Once we move our payment flow from FXS to the Outgoing Payment Command Service, we won't have this problem. We should also analyze whether we want to send all our payments to the new routing process (intra payments, cover payments, etc.).

This RFC only covers the routing rules process. We've included the Source Account rules as part of the Payment Scheme Static Data Service in our diagrams because they are closely related to the payment routing process. When we write the RFC for source account selection, we will analyze whether this is the best solution or whether we should split it into a different service; at this point we think the best option is to include source account selection in the Payment Scheme Static Data Service.

Daily Limit Integration (D.1-D.4 / E.1-E.11)

For the FPS and SCT payment instruments, we need to include an extra validation related to the allowed daily limit. This validation must be taken into account before selecting these payment instruments. The value could change in the future depending on the payment instrument rules, so we need to include a new setting in our Payment Scheme Static Data Service, which will be propagated to our Outgoing Payment Command Service following the normal flow for creating/updating routing rules (D.1-D.4).

DAILY_LIMIT

When our Outgoing Payment Command Service receives a payment to be routed (E.1-E.4) and the routing decides that the payment instrument is FPS/SCT, an update to the daily limit should be sent to the database (E.5). If the system can decrease this value by the payment amount, the payment initiation requested event will be published with the payment scheme (SCT/FPS) and bank scheme intermediary selected (E.6).

If the system cannot decrease this value, the daily limit has been reached, so the routing rules will select the next possible payment instrument following the routing priority in the rules engine. If the system cannot find any route for the payment, it should publish an event indicating that the payment cannot be routed, and the payment is discarded in BOS, following product requirements.
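The daily limit check and the fallback to the next instrument could be sketched as below. The `DailyLimit` class and `route_with_limits` function are hypothetical names. The counter is held in memory here for brevity; in the real service, the check and the decrement would presumably need to be a single atomic database update so that concurrent payments cannot overspend the limit.

```python
from decimal import Decimal
from typing import Dict, List, Optional, Tuple, Union

Route = Tuple[str, str]  # (payment instrument, bank scheme intermediary)


class DailyLimit:
    """Remaining daily amount for one payment instrument (in-memory stand-in)."""

    def __init__(self, remaining: Union[str, Decimal]) -> None:
        self.remaining = Decimal(remaining)

    def try_consume(self, amount: Union[str, Decimal]) -> bool:
        """Decrease the remaining limit if the amount fits; otherwise leave it."""
        amount = Decimal(amount)
        if amount > self.remaining:
            return False
        self.remaining -= amount
        return True


def route_with_limits(
    candidates: List[Route],
    limits: Dict[str, DailyLimit],
    amount: Union[str, Decimal],
) -> Optional[Route]:
    """Return the first candidate whose daily limit (if any) accepts the amount."""
    for instrument, intermediary in candidates:
        limit = limits.get(instrument)  # instruments without a limit always pass
        if limit is None or limit.try_consume(amount):
            return instrument, intermediary
    return None  # caller publishes the "cannot route" event
```

A `None` result corresponds to the case where the payment cannot be routed and is discarded in BOS.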

Pro

  • Use a different data model on Payment Scheme database
  • Total independence from the Payment Scheme Service
  • We could turn off the Payment Scheme Service, as we have a replica in the Payment Service, which would decrease the cost of keeping this service running. The routing rules would not change a lot.
  • We could aggregate data from other services and create new rules with the information obtained (check "Add new rules in the future for SCT Payment Instrument" section).

Con

  • Cost of having 2 databases (pending analysis)
  • Harder implementation than having a Cache (data sync, new data model, etc.)

Rollout in phases

We've decided to deploy this project in 6 phases, testing each new piece of functionality in production as it is delivered instead of activating the whole project at the end of development.

Phase 1: Payment creation in FXS via events

In this first deployment, we will change the way payments are generated in FXS. In our current flow, a payment is created in FXS via API. We will change this so that the payment is created via events.

L3_ROUTING_P1_PHASE1

In this first phase, BOS will publish an event to our event bus when a payment is ready to be executed. FXS will be subscribed to this event and will create a payment when the event is received. In a future phase, we will need to change this so that the payment is created when the event contains the payment instrument and the bank scheme intermediary (after the routing process).

Phase 2: Build engine rules

In this phase, we will build the Payment Scheme Static Data Service with the rules engine for routing payments, and the BOS interface through which the ops team will manage the rules. We will need to build an API in FXS for using these new rules instead of the rules hardcoded in FXS:

L3_ROUTING_P1_PHASE2

Phase 3: Build Bank Account Validation Service

The purpose of this development is to build the new service that connects to Apply Financial and to start using it. We will build the connection with our third party and an API in FXS for using this new service, deprecating our current code in FXS.

L3_ROUTING_P1_PHASE3

Phase 4: Build the payment service

In this step, we will build the Outgoing Payment Command Service (which will contain the payment pipeline in the future) and its connection with the Payment Scheme Static Data Service (rules engine) and the Bank Account Validation Service. In this phase we will test that the sync between our new services works properly before activating the entire flow:

L3_ROUTING_P1_PHASE4

Phase 5: Routing Rules via Events

In this phase, we will build the remaining requirements needed for activating the project:

  • We will change the payment generation in FXS using events with payment routing information
  • We will build the changes needed in FXS for re-routing a payment

Phase 6: Remove deprecated code

In the last phase, we will remove the deprecated code in BOS and FXSuite.

SIF Generation (Out of the scope of this RFC)

Once we receive the requirements for Mass Payments, we will update this RFC with this flow and generate the SIF Gateway Service for generating manual SIF files.

Alternatives

Proposal 2: Outgoing Payment Command Service with a Redis Cache

This solution proposes creating a Redis cache in the Outgoing Payment Command Service holding the routing rules information. When an operations team member creates/updates a routing rule, an event will be published to our Event Bus and the cache will be invalidated (updated):

L3_ROUTING_P2

Pro

  • Easy to implement/maintain

Con

  • If we need to in the future, we would not be able to group data/information from different services
  • While routing a payment, we will at some point need to renew the cache, which will increase the time needed to route that payment

Proposal 3: Apply Financial Static Data Service

In our current flow, we check whether the beneficiary BIC/sort code is reachable via FPS or SCT as the last step of the routing process, because we want to avoid making unnecessary requests to Apply Financial (for cost reasons). With this solution, we need to split the routing decision into 2 different steps: checking the rules engine and then validating the payment instrument with Apply Financial.

We have the possibility of making these checks in a single step, by holding all the information needed for the routing selection in the Outgoing Payment Command Service database.

L3_ROUTING_P3

To do this, we could validate the beneficiary data as early as possible, on beneficiary creation. Once a beneficiary's bank data is filled in BOS, an event will be published to our event bus (1), and a process subscribed in a new service called Apply Financial Static Data Service will save the beneficiary account data into a database (2) and will check with Apply Financial which payment instruments are allowed for this data (3). As the Apply Financial information could change, this service should include a process that refreshes the information (creating a new Apply request) after some time (we currently have a cache in place which expires after a month). Each time an update is stored in the Apply Financial Static Data Service database after an Apply Financial validation, an event will be published to our event bus (4), and the Outgoing Payment Command Service will be subscribed to it in order to update its database. With this solution, we will have all the information needed for selecting the payment routing in the same table, we could create aggregations in this database with the information received from Apply, and we won't need to call Apply Financial during payment routing.
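The refresh process mentioned above could start from a selection step like the one sketched here. The function name and the mapping of keys to last validation times are hypothetical, and the 30-day interval is an assumption that mirrors the one-month expiry of the current cache.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Assumption: mirrors the current cache, which expires after about a month.
REFRESH_AFTER = timedelta(days=30)


def due_for_refresh(
    last_validated: Dict[str, datetime],
    now: datetime,
    refresh_after: timedelta = REFRESH_AFTER,
) -> List[str]:
    """Return the beneficiary bank keys whose validation is old enough to redo."""
    return [
        key
        for key, validated_at in last_validated.items()
        if now - validated_at >= refresh_after
    ]
```

Each returned key would trigger a new Apply request, followed by the update event (4) if the answer is stored.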

Pro

  • All the information needed for routing a payment in the same database (only one step)
  • We don't need to validate the beneficiary data before routing a payment
  • We do not have a bottleneck when releasing payments because of Apply Financial validation (scalability)

Con

  • The new flow for renewing beneficiary information with Apply Financial will check it even when the beneficiary is not used for releasing payments (more cost)

Add new rules in the future

At this point, we don't need to include a bank holiday validation as we are not releasing payments via SCT. In the future, however, for managing data that will be used by other services, such as COT and bank holidays, we suggest the following solution:

L3_ROUTING_NEW_RULES_SERVICES

We will create a new service which contains bank holidays and will be managed by the ops team. When an operations team member creates/updates bank holiday information, the system will publish an event to our Event Bus, and the Outgoing Payment Command Service will update its database with this information and combine it with the existing routing rules data to select the correct payment instrument and bank intermediary.

Operation

We will generate metrics with Prometheus for validating that the new routing and the old process generate the same output without impacting our payment execution.
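The comparison between the old and the new routing could be counted as in the sketch below. To stay self-contained it uses an in-memory `collections.Counter`; in the real service these counts would be exported as Prometheus metrics instead. The function name and the outcome labels are assumptions.

```python
from collections import Counter
from typing import Tuple

Route = Tuple[str, str]  # (payment instrument, bank scheme intermediary)

# In-memory stand-in for a labelled Prometheus counter.
comparison: Counter = Counter()


def compare_routes(old_route: Route, new_route: Route) -> str:
    """Record whether the new routing agrees with the current FXS routing."""
    outcome = "match" if old_route == new_route else "mismatch"
    comparison[outcome] += 1
    return outcome
```

Running the new routing in this shadow mode leaves payment execution on the old result while the mismatch count shows how far the two implementations diverge.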

In the next phases, the operations team will be managing the rules that select the payment instrument and bank scheme intermediary for releasing payments in our platform, so a mistake would impact all our payment execution and could negatively affect the company.

Security Impact

Payment routing configuration is sensitive, as it can have a big impact on the execution of our payments and also on our cash management. Access to the administration console will require different groups of users with different rights:

  • Consultation only (Users)
  • Consultation and Edit (Admin users)
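The two groups above could map to permissions as in this minimal sketch. The role and action names are hypothetical, and a real implementation would rely on the existing BOS authorization mechanism rather than a hardcoded table.

```python
from enum import Enum
from typing import Dict, Set


class Role(Enum):
    USER = "user"    # consultation only
    ADMIN = "admin"  # consultation and edit


# Hypothetical action names for the routing rules console.
PERMISSIONS: Dict[Role, Set[str]] = {
    Role.USER: {"view"},
    Role.ADMIN: {"view", "edit"},
}


def can(role: Role, action: str) -> bool:
    """Return True if the role is allowed to perform the action."""
    return action in PERMISSIONS[role]
```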

It is highly recommended to put a strong process in place for changing any configuration (approval by PO, OPS, Support and Tech). This configuration needs to be accessible to the Ops team/PO.

Performance Impact

The new service will support the increasing global reach of payments and Mass Payments business growth to 100,000 payments per month.

The new routing service allows horizontal scalability.

The new routing service has its own database, so we can fine-tune the data structure and the database technology to fulfil the 100K requirement, and potential increases beyond it, easily.

Developer Impact

N/A

Deployment

For deploying infrastructure, we will use Terraform to define all new resources that need to be created. Infrastructure will be deployed in terraform-natonly.

For deploying code, we will use the existing Jenkins CI.

Caveats

With this solution we solve the problem of payment routing and source account selection, but we will still have a bottleneck in the Balance process and daemon executions. We therefore also need to solve this part in order to manage the mass payment flow in our platform and make payment execution scalable, observable and robust. This new balance service could be included as part of our Payment Execution pipeline (Outgoing Payment Command Service).