Business Process Automation Platforms Overview
Prerequisites
N/A
Reference Documents
N/A
Problem Description
The Business Process Automation (BPA) team is responsible for automating business processes, and we are struggling with the core platform we use to create our automations (Make.com).
Current Platform's Limitations
Make.com is a no-code platform that allows us to automate business processes. For trivial processes, it is a good platform; but for more complex ones, it is not suitable, as it pushes us to find workarounds to overcome the platform's limitations - when possible, as some cases are impossible to implement with Make.com (e.g. HiBob-Google Synchronization).
We spend a lot of time on workarounds, while still having to promptly deliver quality solutions.
The main limitations are:
Development - Complex solutions for simple problems
- Make.com doesn't have condition-controlled loops, and the count-controlled loop implementation that it offers (the iterator) does not allow embedding control flow statements like `break` or `continue`;
- it's not possible natively to create reusable components, therefore we had to create our own module that uses a scenario as a component, but this involves making HTTP calls back-and-forth between scenarios, resulting in high coupling or over-fetching (which is very inefficient);
- all executions are completely synchronous: nothing can run asynchronously (not even API calls);
- some modules (e.g. Google Sheets) are iterators by default, which forces inappropriate data-handling logic and further workarounds;
- changes and refactorings are challenging, mostly on par with a total rewrite.
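To make the first limitation concrete, here is a minimal Python sketch of the kind of condition-controlled loop, with `break` and `continue`, that the iterator cannot express directly. The `fetch_page` callable and the `active` field are hypothetical and only serve the illustration.

```python
from typing import Callable


def collect_active(fetch_page: Callable[[int], list[dict]], limit: int) -> list[dict]:
    """Collect up to `limit` active records from a paginated source."""
    results: list[dict] = []
    page = 0
    # Condition-controlled loop: the number of iterations is not known up front.
    while len(results) < limit:
        records = fetch_page(page)
        if not records:
            break  # no more pages: leave the loop early
        for record in records:
            if not record.get("active"):
                continue  # skip inactive records without ending the loop
            results.append(record)
            if len(results) == limit:
                break  # enough records collected
        page += 1
    return results
```

In Make.com, each of these early exits has to be emulated with filters, routers, or extra scenarios.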
Inefficient workflow
- there is no standard workflow;
- we cannot follow standard software flows, there are no best practices, and no testing;
- there's no version control system, only saved timestamps;
- there is no such thing as an "environment", so we have to create different scenarios for each environment;
- there's no promotion-to-production: it can only be done manually, which is very time-consuming and error-prone;
- there's no approval system or branch protection, with a high risk of affecting production;
- no CI/CD.
Unreliable platform
- some requests are not processed due to random timeouts;
- at times, scenarios are inexplicably triggered twice;
- scenarios are shut down automatically after repeated failures, and there is no setting to change this;
- when API responses are large, they are automatically converted to `Long String` without warning;
- many executions fail without logs (e.g. HiBob/Google sync, Fenx);
- unreliable add-on for forms - several times it failed silently, and we had to create our own implementation using Google Apps Script;
- it is not possible to add operations to the existing plan, and we cannot set usage limits or split the billing of usage (per team, per scenario, etc.).
Bad support
- the support team is slow, inefficient (it can take as long as a month just to get a reply), and unreliable; one critical example above all: they once changed a production module without asking us, and we prevented one or more incidents (the module was used in several scenarios) only by promptly noticing the change and reverting it;
- the community is relatively small (StackOverflow/forums), so it is hard to find solutions;
- mostly we can't rely on what their support provides, and we have to create our own workarounds (e.g. the CallComponent module), which turn out to be better solutions.
Background
The BPA team was formerly part of the Business Applications team. The team was later split into two teams to allow the Business Applications Support team to focus on the support of the business applications and the BPA team to focus on the automation of business processes. The primary platform used then was Make.com, as it was also meant to be potentially used by other team members that wouldn't have development skills.
With the development of more complex business processes, we reached the platform's limitations and had to find workarounds to overcome them. As a result, we were spending too much time on workarounds, the workflow was inefficient, and the solutions we created were unreliable, not scalable, and complicated to maintain. We began exploring potential solutions that could be built on top of the existing platform, but we realized that we were building workarounds on top of workarounds and that this would not be sustainable in the long run.
We then started evaluating other platforms. The first choice was Google Apps Script (GAS): we already use it for other purposes, we are familiar with it, and it is completely free. Its main advantage is that the infrastructure is fully managed by Google, which makes development rapid and efficient. It also has many built-in features, in particular the tight and seamless integration with the Google products that are at the core of most of our business processes.
Solution
We discussed the possible alternatives and approaches through many iterations, both internally and with other experts from Ebury.
We've concluded that the best approach is to use multiple platforms that cover the different needs of the specific processes to be automated, thus keeping maintenance and development costs low (ideally, zero) and the Time to Market (TTM) as short as possible.
The platforms are:
- Make.com;
- Google Apps Script (GAS);
- Google Cloud Functions;
- Google Cloud Run.
Since each platform has advantages and disadvantages, we must use the best tool for the job.
Decision Tree
Given all the issues mentioned above, we opted for a decision tree that lets us choose the best platform for each process that needs to be automated.
The first option will be Make.com to achieve "quick wins" on basic processes; this will be evaluated using the Definition of Simple (DoS) criteria.
Then, if the process is complex, we'll resort to GAS.
If we reach some of the limitations of GAS that are not possible to overcome (or, at least, not in a reasonable way), we'll resort to Cloud Functions.
In case of particular needs, we'll resort to Cloud Run, which will need to pass an RFC process to be approved.
%%{init: {'theme': 'neutral', 'themeVariables': {'darkMode': true} } }%%
flowchart
UC((Use case\n to be\n automated))
GAS((GAS))
Make((Make))
CF((Cloud\n Functions))
CR((Cloud Run))
Criteria1{DoS?}
Criteria2{Out of GAS quota or\n NPM binaries required?}
Criteria3{Customized\n dockerfile?}
UC --> Criteria1
Criteria1 --yes--> Make
Criteria1 --no--> Criteria2
Criteria2 --yes--> Criteria3
Criteria2 --no--> GAS
Criteria3 --yes--> CR
Criteria3 --no--> CF
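The same decision tree can be sketched as a small Python function. This is only an illustration of the flowchart above: the `UseCase` fields and the `choose_platform` name are hypothetical, and each flag is assumed to have been assessed beforehand by the team.

```python
from dataclasses import dataclass
from enum import Enum


class Platform(Enum):
    MAKE = "Make.com"
    GAS = "Google Apps Script"
    CLOUD_FUNCTIONS = "Cloud Functions"
    CLOUD_RUN = "Cloud Run"


@dataclass(frozen=True)
class UseCase:
    meets_dos: bool                        # Criteria1: satisfies the Definition of Simple
    exceeds_gas_quota: bool = False        # Criteria2: out of GAS quota...
    needs_npm_binaries: bool = False       # ...or NPM binaries required
    needs_custom_dockerfile: bool = False  # Criteria3: customized Dockerfile


def choose_platform(use_case: UseCase) -> Platform:
    """Walk the decision tree and return the core platform for a use case."""
    if use_case.meets_dos:
        return Platform.MAKE
    if not (use_case.exceeds_gas_quota or use_case.needs_npm_binaries):
        return Platform.GAS
    if use_case.needs_custom_dockerfile:
        return Platform.CLOUD_RUN
    return Platform.CLOUD_FUNCTIONS
```

Note that, as in the flowchart, the Dockerfile question is only asked once GAS has been ruled out.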
Definition of Simple (DoS)
- It can't cause an incident;
- the expected usage is less than 10% of the service quota;
- it doesn't have while loops;
- it doesn't need concurrency;
- it will use modules that are already available;
- it doesn't require data manipulation;
- it is a standalone process: it doesn't need componentization, or more than a few actions (we'll set 10 actions as reference);
- it is not inconsistent with existing processes/initiatives (e.g. we don't want to use Make.com for processes that are already being implemented in GAS);
- it doesn't need nested modules that are iterators by default.
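As a sketch, the DoS criteria could be captured as a checklist that also reports which criteria are not met. The `ProcessProfile` fields and function names are hypothetical, and reading "more than a few actions" as "more than 10 actions" is an assumption based on the reference value given above.

```python
from dataclasses import dataclass

MAX_QUOTA_SHARE = 0.10  # "less than 10% of the service quota"
MAX_ACTIONS = 10        # reference value for "a few actions"


@dataclass(frozen=True)
class ProcessProfile:
    can_cause_incident: bool
    expected_usage: float   # same unit as service_quota (e.g. operations)
    service_quota: float
    needs_while_loops: bool
    needs_concurrency: bool
    uses_only_available_modules: bool
    needs_data_manipulation: bool
    needs_componentization: bool
    action_count: int
    consistent_with_existing_processes: bool
    needs_nested_default_iterators: bool


def unmet_dos_criteria(p: ProcessProfile) -> list[str]:
    """Return the DoS criteria that the process does not satisfy."""
    checks = {
        "cannot cause an incident": not p.can_cause_incident,
        "usage below 10% of quota": p.expected_usage < MAX_QUOTA_SHARE * p.service_quota,
        "no while loops": not p.needs_while_loops,
        "no concurrency": not p.needs_concurrency,
        "only available modules": p.uses_only_available_modules,
        "no data manipulation": not p.needs_data_manipulation,
        "standalone, at most 10 actions": (
            not p.needs_componentization and p.action_count <= MAX_ACTIONS
        ),
        "consistent with existing processes": p.consistent_with_existing_processes,
        "no nested default iterators": not p.needs_nested_default_iterators,
    }
    return [name for name, ok in checks.items() if not ok]


def is_simple(p: ProcessProfile) -> bool:
    """A process is 'simple' only if every DoS criterion is met."""
    return not unmet_dos_criteria(p)
```

Returning the list of unmet criteria, rather than a bare boolean, makes it easier to record why a process was routed away from Make.com.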
A note on Triggers
In some specific situations, we'd use a combination of platforms for some processes. These will likely be "triggers" that are easier/cheaper to implement in one platform, while the process itself will be implemented in another. Therefore the decision tree will indicate the choice for the core platform, while the trigger will be implemented in the platform that is more suitable for it.
An example is "mail-hooks" for Gmail (a trigger to detect when a new email arrives), which is very easy to implement in Make.com, but not in GAS. In this case, we would use Make.com to trigger the GAS function.
Another example is "cron jobs": in Cloud Functions, you only have three free time triggers, while in GAS, you have 20 per script (practically unlimited). In this case, we would use GAS to trigger the Cloud Function.
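A cross-platform trigger of this kind usually boils down to one platform sending an HTTP request to the other. The following Python sketch only builds such a request, assuming the target process is exposed as an HTTP endpoint protected by a bearer token; the URL, payload fields, and token are hypothetical, and in practice the caller would be a GAS script or a Make.com module rather than Python.

```python
import json
import urllib.request


def build_trigger_request(url: str, payload: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that triggers a remote process."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical usage: a time-driven trigger asks the core process to run.
request = build_trigger_request(
    "https://example.invalid/nightly-sync",  # placeholder endpoint
    {"job": "nightly-sync"},
    token="placeholder-token",
)
# urllib.request.urlopen(request) would send it; omitted here on purpose.
```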
Service Ownership
The BPA team will be responsible for the development and maintenance of the processes that are automated using the platforms mentioned above.
Alternatives
Countless other options and platforms could be used to automate business processes. However, we've chosen the ones mentioned above because they are the ones we are already familiar with and already using for other purposes.
Also, since the core of our automations (if not most of them) is based on Google products, it makes sense to use Google products to automate them; Make.com is the only exception.
Finally, for the usage we are planning, we should be able to keep the costs low, if not zero.
Caveats
There are some limitations that we need to be aware of and that we need to take into account when choosing the platform for a specific process.
One general principle is that we should strive to keep our usage within the free tier, and in the event that we need to pay for a service, we should go through an RFC process to get approval.
Make.com
The platform is helpful when it comes to automating simple processes, but other than that, the limitations are significant in terms of scalability and reliability. Therefore, we should use this tool only under the strict conditions set in the DoS criteria to avoid incidents. Also, we should keep the number of operations to a minimum to avoid outages or additional costs.
Google Apps Script
The platform is handy and powerful, allowing the development of complex solutions quickly. However, it has some limitations that we need to be aware of:
- there are hard quotas that cannot be increased, so we should provide forecasts on usage;
- we cannot use binaries and some specific NPM packages, so we should investigate the feasibility of a solution beforehand. Otherwise, we should resort to Cloud Functions.
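A usage forecast can be as simple as multiplying the expected runs by the calls each run makes, with some headroom for growth, and comparing the result with the relevant quota. The sketch below assumes a daily, per-call quota; the function names and all the numbers in the usage example are hypothetical, not actual GAS limits.

```python
import math


def forecast_daily_calls(runs_per_day: int, calls_per_run: int,
                         growth_factor: float = 1.0) -> int:
    """Estimate daily quota consumption, rounded up to whole calls."""
    return math.ceil(runs_per_day * calls_per_run * growth_factor)


def quota_share(forecast: int, daily_quota: int) -> float:
    """Return the fraction of the daily quota that the forecast would use."""
    if daily_quota <= 0:
        raise ValueError("daily_quota must be positive")
    return forecast / daily_quota


# Hypothetical example: a script running every 30 minutes, 25 calls per run,
# with 50% headroom, checked against an assumed quota of 20,000 calls per day.
forecast = forecast_daily_calls(runs_per_day=48, calls_per_run=25, growth_factor=1.5)
share = quota_share(forecast, daily_quota=20_000)
```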
Google Cloud Functions
This should be our choice only when we need to implement a process that is not possible to implement in GAS; we should be wary of the free quota and attempt to keep the usage within the free tier. Finally, the Dockerfile is not customizable in Cloud Functions; for this reason, we should resort to Cloud Run when we need to use a customized Dockerfile.
Google Cloud Run
This should be our choice only when we need to implement a process that is not possible to implement in GAS or Cloud Functions; we should be wary of the free quota and attempt to keep the usage within the free tier. At the same time, this is a powerful tool that we should use only in particular cases, and each use should be approved through the RFC process.
Operation
The automation team will operate the platforms and will not need the intervention of the platform teams to manage changes in the infrastructure. In turn, monitoring and alerting will be carried out by the same team.
Security Impact
The described approach should increase the reliability of the processes, procedures, and quality of the services, as well as the overall security.
We'd use the Google Secret Manager to store the credentials and other sensitive data. Still, we'd primarily leverage the authorization flows and the tools built into the GCP project, allowing us to develop secure solutions while reducing the effort to ensure such security.
As GCP products rely on Google authentication, we'd be able to leverage the security features that are built into the Google ecosystem (such as 2FA, SSO, audit logs, security alerts, etc.) without having to implement and maintain them ourselves.
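As an illustration of reading a credential from Secret Manager, here is a minimal Python sketch. It assumes the caller passes in a `SecretManagerServiceClient` from the `google-cloud-secret-manager` library, which is not imported here so that the sketch stays self-contained; the helper names and the project and secret IDs are hypothetical.

```python
from typing import Any


def secret_version_name(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Build the resource name that Secret Manager uses for a secret version."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"


def read_secret(client: Any, project_id: str, secret_id: str,
                version: str = "latest") -> str:
    """Fetch a secret's payload and decode it as UTF-8 text.

    `client` is assumed to be a
    google.cloud.secretmanager.SecretManagerServiceClient instance.
    """
    name = secret_version_name(project_id, secret_id, version)
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("utf-8")
```

Passing the client in as a parameter keeps the credentials handling in one place and makes the helper easy to exercise with a stub.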
Performance Impact
The described approach should allow us to keep the time to market to a minimum while allowing us to scale and maintain reliable and secure solutions.
Developer Impact
Developers on the team will be able to focus on the core business logic while at the same time being able to leverage the tools and the infrastructure that is already in place.
The team will need to learn to use the tools and platforms mentioned above, but the learning curve should be quite short, as we are already using these tools.
Data Contracts
N/A
Data Sources
N/A
Deployment
There won't be a proper deployment, as we are already using the platforms mentioned above, just in a better-structured way.
Dependencies
There are no dependencies, but it's worth mentioning that this RFC is related to our team's internal initiative to restructure our workflow, architecture, and database platforms; related RFCs are in the works.