Service Ownership

This document provides guidelines regarding the concept of Service Ownership in Ebury.

Problem Description

Operational processes around a service are essential in Ebury: they allow us to achieve new business goals at the corresponding quality level, and that level must not be affected by developer rotation between teams or by the creation of squads.

Furthermore, it is important to have a clearly responsible party at the service level, with a clear vision of both the project itself and how it evolves within the bigger, global vision, in the short term as well as the long term. This ensures that the service's criticality and SLAs are observed and considered, and that accountability is clearly understood, when implementing or deploying new services, components, or features.

Although Ebury had several tools in place to reach a minimum standard, their interpretation depended heavily on team culture and was therefore inconsistent across the company.

All in all, there was a need for better supervision, guidance and control over Ebury Tech service processes.

Given the above, service ownership addresses problems such as:

  • Which team is responsible for business continuity within a specific scope, and how.
  • Which team is responsible for the processes to mitigate issues, and how and when.
  • Defining domain boundaries from an architecture point of view, to start discussing which other services are affected.
  • Aligning responsibilities across teams with the corresponding services within a domain.
  • Facilitating architecture discussions around business/data flows.
  • Having a contact point for any procedure around a service.
  • Assigning responsibility for system upgrades, migrations, and the planning process.

Background

As part of the design and development process, any service provided by Ebury Technology must satisfy a set of minimum standards: International Organization for Standardization (ISO) requirements, Service-Level Agreements (SLAs), service monitoring, a roll-out plan, internal or external audits, service assessments, and a Business Contingency/Continuity Plan (BCP).

All these concepts are managed by each team in its own way, without a guideline or a specific format.

Solution

Our solution relies on team-level responsibilities: every service, new or existing, is assigned to a team. As a mandatory requirement, RFCs must state which services are affected. For a new service, the RFC must name the owning team.

In addition, in all cases the service belongs to a domain.

The project is first assigned to a team. It does not matter whether a project or functionality is created by an established team from any area or by a squad, since the responsibility falls on the assigned team.
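
The rules above (every service has an owning team and a domain, and every RFC states the services it affects) could be checked mechanically. The following is only a minimal sketch under stated assumptions: the names Service, Rfc, validate_rfc and the example service and team names are hypothetical and do not describe an existing Ebury tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    """Hypothetical registry entry for one service."""
    name: str
    domain: str       # every service belongs to a domain
    owner_team: str   # the team accountable for the service


@dataclass
class Rfc:
    """Hypothetical, simplified view of an RFC."""
    title: str
    affected_services: list[str] = field(default_factory=list)
    new_services: list[Service] = field(default_factory=list)


def validate_rfc(rfc: Rfc, registry: dict[str, Service]) -> list[str]:
    """Return a list of problems; an empty list means the RFC passes."""
    problems = []
    # An RFC must say which services it touches or introduces.
    if not rfc.affected_services and not rfc.new_services:
        problems.append("RFC must list the services it affects")
    # Existing services must already be known to the registry.
    for name in rfc.affected_services:
        if name not in registry:
            problems.append(f"unknown service: {name}")
    # New services must arrive with an owner team and a domain.
    for service in rfc.new_services:
        if not service.owner_team:
            problems.append(f"new service {service.name} has no owner team")
        if not service.domain:
            problems.append(f"new service {service.name} has no domain")
    return problems


registry = {"payments-api": Service("payments-api", "payments", "team-payments")}
rfc = Rfc("Add refunds", affected_services=["payments-api", "ledger"])
print(validate_rfc(rfc, registry))  # ['unknown service: ledger']
```

Keeping ownership on the service record, rather than on the people who wrote the code, reflects the point that responsibility stays with the assigned team regardless of who builds a feature.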

Definitions

Service in Ebury is a software application represented by a codebase in a repository, with one or several functionalities, running in a scope.

BCP is the process involved in creating a system of prevention and recovery from potential threats to Ebury.

Monitoring Plan is a written plan that describes what will be monitored and how.

Rollout Plan is a written plan to deploy a new service or functionality to production, with the corresponding communication to the teams involved.

Assessments Register is a document that lists all Ebury Tech assets.

Tech Debt Plan is a document or set of Jira Epics listing the components that must be improved to keep the service up to date.

Scoped Service Owner

Service owners typically have

  • a high-level, global view of the project combined with in-depth understanding
  • a vision on how the project integrates in current and future company needs
  • a good knowledge of the solutions implemented within the project
  • a good knowledge of related projects, interactions and interfaces
  • a particularly strong business knowledge in the given scope
  • strong understanding and concerns of the given scope (business-level, service-level, flow-level, etc.)
  • a good knowledge of the documentation needed for a service to have an acceptable quality level:
    • Pipeline in place as a conductor to manage quality checks
    • Static code analysis, using tools like isort or pylint, with a score as a threshold
    • Security checks, using Aquasec
    • Unit, integration and E2E tests, appropriately balanced, to provide 100% coverage
    • Code Owners defined, so that PRs are approved before the code is merged
    • Sufficient and appropriate test coverage.
    • Verification of migrations (e.g. no-downtime migrations, split into separate, consecutive Pull Requests)
    • Relevant logging for the new code.
    • Relevant alerts defined and included in the monitoring/rollout plan
    • Training for the support team as the first line of support.
    • SLAs defined
    • Good documentation for new changes.
    • Adherence to the conventions and style guide (e.g. language style, folder structure, component naming and importing)
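
As an illustration of using a static analysis score as a threshold in the pipeline, the sketch below reads the summary line that pylint prints at the end of a text report ("Your code has been rated at N/10") and compares the score with a minimum. The function names and the default threshold of 8.0 are assumptions made for this example, not an Ebury convention.

```python
import re

# Matches the summary line pylint prints at the end of a run, e.g.
# "Your code has been rated at 8.75/10".
_SCORE_RE = re.compile(r"rated at (-?\d+(?:\.\d+)?)/10")


def parse_pylint_score(report: str):
    """Extract the score from pylint's text report, or None if absent."""
    match = _SCORE_RE.search(report)
    return float(match.group(1)) if match else None


def passes_quality_gate(report: str, threshold: float = 8.0) -> bool:
    """True only when a score was found and it meets the threshold.

    The default threshold is a hypothetical value; each team would
    agree on its own.
    """
    score = parse_pylint_score(report)
    return score is not None and score >= threshold
```

A missing score is treated as a failure so that a broken analysis step cannot silently pass the gate. Recent pylint versions also provide a --fail-under option that makes the tool itself fail below a given score, which may be simpler where it is available.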

Responsibilities

A team acting as service owner has the following responsibilities:

  • The team will complete the documents affected by new features in the service.
  • The team is the primary contact for an issue/outage.
  • The team should be involved in decisions for the long-term resolution of the issue (typically: postmortems).
  • The team should be involved in decisions for the resolution of new features / projects.
  • The team has the responsibility to provide the documentation for any audit (internal/external) associated with the service.
  • The team must be ready to provide solutions to any regulatory request associated with the service.
  • The team is in charge of highlighting the technical debt inherent to a service and escalating it so that it can be planned.

It is also important to highlight the functions that are the service owner's responsibility and that could conflict with code owners' responsibilities.

These service owner responsibilities focus on a high-level point of view, taking into account the business flows and the scope of a specific service. The service owner is not involved in the development or coding itself, but knows how the service was built:

  • The team actively participates in shaping the vision for the future version of the application from a high-level, business-flow perspective, not in terms of code or development
  • The team is involved in the technical decisions related to its scope:
    • technical changes
    • future architectural changes
    • ongoing/upcoming projects
    • etc.
  • The team will review and approve or reject contributions, upholding architectural integrity:
    • detecting major issues (design issues, inconsistencies across the project, etc.)