API Async Mass Payments
This document describes a proposal for asynchronously processing mass-payments submitted through the API.
Definitions
A mass-payment is a batch of one or more payment instructions, including currencies and beneficiary information, that the system will execute without requiring the submitter to create beneficiaries, trades and other resources beforehand.
Problem Description
The current API mass-payment workflow is executed synchronously within a single request, so it is susceptible to timeouts when the mass-payment batch is large (e.g. large payrolls). In addition, in the event of a timeout or network failure, the API user has no easy way to find out the status of the submitted mass-payment.
API product owners want to provide reliable, asynchronous processing of mass-payments to our customers [1]. The solution must work under the constraints of the FrontierPay environment (Kafka not deployed).
The operations of submitting a mass-payment and getting the status of a mass-payment should take at most 1000ms each.
Background
When an API user submits a mass-payment request, the API orchestrates the workflow by making several requests to BOS.
If the API request times out or the connection between the user and the API fails, the user does not receive the payment information.
Moreover, the API webapp orchestrates the workflow during the user request by making several requests to BOS, with no retry-later mechanism. Processing can therefore fail on a timeout or connection failure between the API and BOS. The user would then be forced to resubmit the mass-payment, possibly duplicating payments.
Ideally, the API call should return as soon as possible and the processing should continue in the background.
Solution
There are a couple of constraints for this feature:
- it must work in the FrontierPay environment, where BOS does not have access to Kafka
- the PAB team is working on the payments query service, and we want to avoid duplicated effort and competing solutions
We propose a straightforward solution that transfers the orchestration of the mass-payments from the API to a BOS background job (Celery task).
BOS will expose a new endpoint that will receive the mass-payments file, create the mass-payment information (UUID and initial status) and start a background job to complete it. When an async mass-payment is requested, the API will call this endpoint and give the user back the mass-payment info (UUID, status). The API user can check the status of the mass-payment using its UUID.
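The submit-and-poll flow described above can be sketched as follows. This is a minimal illustration, not the BOS implementation: the in-memory `MASS_PAYMENTS` dictionary stands in for the BOS database, `JOB_QUEUE` stands in for the Celery queue, and the function names, field names and the `"PENDING"` initial status are all assumptions made for the example.

```python
import queue
import uuid
from dataclasses import dataclass


@dataclass
class MassPayment:
    mp_uuid: str      # identifier returned to the API user
    status: str       # initial status; the value used here is hypothetical
    payload: bytes    # raw mass-payments file as received


# Hypothetical in-memory stand-ins for the BOS database and the job queue.
MASS_PAYMENTS = {}
JOB_QUEUE = queue.Queue()


def submit_mass_payment(file_content):
    """Store the batch, enqueue the background job and return immediately."""
    record = MassPayment(
        mp_uuid=str(uuid.uuid4()),
        status="PENDING",
        payload=file_content,
    )
    MASS_PAYMENTS[record.mp_uuid] = record
    # With Celery, this would be the point where the task is queued.
    JOB_QUEUE.put(record.mp_uuid)
    return {"uuid": record.mp_uuid, "status": record.status}


def get_mass_payment_status(mp_uuid):
    """Look up the current status by UUID; no processing happens here."""
    record = MASS_PAYMENTS[mp_uuid]
    return {"uuid": record.mp_uuid, "status": record.status}
```

Neither function performs any payment processing, so both stay cheap regardless of batch size, which is what allows the submit and status operations to return quickly.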
The background job will implement the same workflow that is currently used by the API sync mass-payments, reusing the logic in the BOS views and controllers. To give users a finer-grained status of the mass-payment, we define extra status values and update the mass-payment status at certain points during the workflow, with minimal changes to the BOS code and database schema.
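As an illustration of updating the status at certain points during the workflow, the sketch below runs a list of steps and records a status before each one. The status names, the `run_mass_payment` function and the dictionary record are hypothetical; the proposal does not name the extra statuses, and the real steps would be the existing BOS operations.

```python
from enum import Enum


class MassPaymentStatus(Enum):
    # Hypothetical values, chosen only for this example.
    RECEIVED = "received"
    VALIDATING = "validating"
    CREATING_BENEFICIARIES = "creating_beneficiaries"
    BOOKING_TRADES = "booking_trades"
    CREATING_PAYMENTS = "creating_payments"
    COMPLETED = "completed"
    FAILED = "failed"


def run_mass_payment(record, steps):
    """Run (status, step) pairs in order, recording the status before each step.

    `record` is a plain dict standing in for the mass-payment database row.
    Each step is a callable that takes the record and raises on failure.
    """
    try:
        for status, step in steps:
            record["status"] = status
            step(record)
    except Exception as exc:
        # Keep the failure visible to anyone polling the status.
        record["status"] = MassPaymentStatus.FAILED
        record["error"] = str(exc)
        return record
    record["status"] = MassPaymentStatus.COMPLETED
    return record
```

Because the status is written before each step starts, a user polling by UUID sees which stage is in progress, and a failure leaves the record in an explicit `FAILED` state rather than an ambiguous intermediate one.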
We plan to also keep the synchronous functionality for now.
To summarize, this solution:
- does not have extra dependencies, which makes it suitable for the FrontierPay environment
- does not need infrastructure or security changes, and reuses the BOS background job system with the monitoring and operational practices already in place
- does not make changes to the BOS mass-payment business logic
- reuses the current mass-payments code, preserving the current ownership, so the work on Ebury 2.0 payments will not be affected by competing solutions and duplication of effort
- requires few changes to the database schema: a new UUID field for the mass-payments, with the extra mass-payment statuses defined in the application layer
- reuses the BOS database for keeping track of the mass-payment UUID and status
- is easy to test, reusing the BOS testing infrastructure and functions that are already tested
- is fairly quick to implement
Future Enhancements
We consider this a temporary solution and we expect that it will be superseded by Ebury 2.0 payments in the long run.
Alternatives
Caveats
Adding more functionality to BOS is not recommended under the Ebury 2.0 strategy. The async mass-payments workflow reuses the current BOS operations rather than creating new ones, although a new background job will be created to orchestrate them. Given that the PAB team is working on Ebury 2.0 payments at this time, we believe that the solution avoids duplication of effort and competing implementations.
Even if the required changes look minimal, we will be modifying complex functionality. We counted 9 incident postmortems related to mass-payments during the first quarter of 2021.
Operation
The async workflow will run as a background job of BOS and will have the same operational requirements as the existing BOS background jobs.
Security Impact
None.
Performance Impact
The current synchronous API mass-payment workflow makes several requests to the BOS API, waiting for the entire process to complete and keeping API and BOS workers busy during this time. The proposed async workflow makes only one request to BOS to upload the data and start the background process, thus freeing API and BOS worker time for other requests. The BOS background job system will take on extra load.
Developer Impact
None.
Data Consumer Impact
An extra UUID field for the mass-payment is needed in the database.
We need to extend the value domain of the mass-payment status.
Deployment
None.
Dependencies
None.
References
[1] JIRA Epic https://fxsolutions.atlassian.net/browse/API-436