Engineering best practices for organizing Development Tasks and resulting Code Reviews

This is a proposal for defining the organizational units of development tasks and code review guidelines for Ebury.

There are three higher-level objectives that determine all suggestions for this matter:

  • traceability
    • transparency (auditability)
  • release concerns
    • rollbacks
  • easily and safely reviewable PRs

tl;dr from a practical aspect, every proposed change should be written & reviewed with the following high-level points in mind:

  • Contain no unrelated changes, for audit & traceability.
  • Be easy to review, as small as possible, and conceptually exactly one change.
  • Be safe to release, and as easy as possible to revert when things go wrong.
  • Be well designed, understandable, and well tested.
  • Be secure and performant.

Detailed explanation of how to use tools — e.g. git, BitBucket, GitHub, Jira, etc — is out of scope, although hints may be included when useful, especially if related to critical Ebury practices & processes.

Problem Description

Even when resolving small Jira tasks, Ebury developers often come across a couple of "little extra" (isort fix, trivial bugfix, refactor, "gardening", etc.) issues, that must or at least should be resolved once touching the corresponding chunk of code.

However, this doesn't only increase the scope of the task, but makes it difficult to follow what exactly was included finally in the code change.

Task resolution (i.e. code modifications par task) criteria is not uniform and vary a lot across developers, some focus on style changes, others in performance, others in structure, etc.

This is a major issue: - for traceability - for determining where bug may have been introduced and what commits to roll back - for Code Reviews that end up impossible to follow and slightly chaotic

Disclaimer: Throughout the whole blueprint we discuss the stage AFTER a good refinement of tasks. We assume well-refinecd (typically 1-3 story points) tasks for the discussions below. The objective of the blueprint is NOT to highlight on refinement mistakes. Instead to highlight on the variety of expected or unexpected, logically distinct changes that are often included within the scope of a "healthy", well-refined Developer Task. The granularity of these, logically separate changes are beyond the scope of Task Refinement, but only make sense on a Task Resolution level and stage.

Solution

We propose a guideline developers should refer to cover the relevant points we care about.

These guidelines primarily encourage a good engineering practice from the initial stage of the task resolution onwards during the development phase. As a result Developers should deliver tasks in units that correspond to Ebury requirements.

There is a particular emphasis for the Developers to identify separate logical chunks belonging to the task, and group these changes separately (within the scope of the same task). Once the main task is de-coupled to smaller logical units, these should be separately - traced (Jira ticket) - developed (branch) - reviewed (PR)

Parties involved:

  • Developers: who submit the Pull Request (PR) for review.
  • Reviewers: who review the changes.
  • Code Owners: who may do a more in depth review and whom approval is mandatory in order to proceed.

Proposed Development and Code Reveiw Guidelines

Listed here are the points we consider important to cover in a code review.

Discourage unrelated changes

A PR must only contain relevant changes. Ask the developers to push unrelated things (e.g. a refactor, tests improvements, lint errors, style cleanups, etc) in a separate task.

This helps keep code changes small and focussed, makes reverts less risky (reverting one thing won’t revert the other by mistake), and improves traceability changes.

IMPORTANT: due to financial regulations, if there is no requirement for this change, you believe the code varies substantially from the requirements or substantial unrelated additional code has been added to the PR, you must decline the PR or mark it as requiring changes.

Encourage de-coupled tasks and small branches as much as possible

Small tasks are beneficial for many reasons as they are:

  • Easier to describe because they should address exactly one change (traceabilty).
  • Code changes are likely to be reviewed quickly and more thoroughly.
  • Less likely to introduce bugs.
  • Safer to merge and easier to revert.

Not counting generated files (e.g. protobuf definitions, Django migrations, etc), developers should aim to keep branches relating to a task into a sensible soft limit of at most 500 lines of code including tests. By trying to fit that rule in a pragmatic manner, the quality of the reviews increase a lot and have less chances of missing something important.

It's common to identify small chunks of work (refactor, test improvements, lint fixes, etc) as work progresses. Wherever possible, these chunks should be extracted to have their own dedicated split task. Performed within the scope of the original task just separate.

When looking for ways to split work, consider:

  • Would you ever want the change to be reverted if the feature caused production issues?
  • Does the change make sense on its own, i.e. can you describe its purpose?
  • Would the change be something another developer may find useful and/or may depend on.
  • etc
Practical considerations when de-coupling tasks

Splitting work is likely to mean creating & adjusting sprint tickets for tasks required during the release process. This is expected. However, you should try to keep the "story points" estimation the same overall.

We would like to emphasize at this point, that a new ticket is necessary. Mulitple branches on the same ticket won't do. The latter is: - not providing additional traceability - not nearly to the same extent as a Jira ticket - prone to mistakes - 1st branch merged, "secretely" (NOT reflected in SECO) goes in the release while the ticket is still in a "Code Reveiw" with the 2nd branch

Ways to split tasks:

  • From the JIRA Scrum Backlog view, right click on the task and hit "split task". Then you can have a clone task you can use for your pipeline of changes.
  • You can create a new task, and link it ('Split from') to the original task
  • Split tasks normally should have the original story points split among themselves

We need to emphasize that Jira sub-tasks are not convenient. SECO has no understanding of sub-tasks, thus these tickets are not displayed there. This is against the principal of traceability.

Keeping split tasks in sync:

  • While developing a big change, it's OK to go for a full implementation but it's important to determine isolated parts ready for review and submit them. For this git rebase is your friend: if a dependency of a "sub"-task changed, you should sync (rebase) to the up-to-date vesion of that dependency
  • Another good idea is to use commits to express related sets of changes rather than random savepoints, then it's easy to move separate commits to other branches.
  • Always rebasing from upstream rather than merging is a good idea to avoid "Merge branch 'dev' into ..." commits.
  • Futher git commands that will help: git add -p; git stash -p; git rebase -i; git rebase --onto

For big changes (e.g. refactoring) generated with automated tools, the task/branch/PR should only contain the automated tool changes and nothing else. Developers must also provide the steps to recreate the PR diff so that reviewers can double check that the output matches the current diff. All of this is to avoid burying unwanted changes within big change-sets.

Specific advices relating to Code Changes (and Code Reviews)

We would like to specifically mention a few issues particulartly relating to code changes (impemented within the scoe of a task) and how they may, should or shouldn't impact the PRs.

We would also clarify on a few "good" or "bad" changes here.

Coding Style

Follow recommendations for the given language but don't focus on superficial style -- if it's not enforced by the automated linter then it's not a rule.

Developers and reviewers should instead focus on fixing the linter (maybe later) so that it conforms to the expected rules but if the PR is passing linter then it's fine as it is.

Linter/automatic-code-formatter rules should enforce globally line length, indentation, line wrapping style, operators, multi-line arguments, whitespace/newlines, string formatting, string quoting style, import style, trailing commas and alike so we won't discuss any of it here.

IMPORTANT: If the linter is picking a file retroactively (i.e. new rules activated since the last time it changed) then developers must submit linter changes first in a separate branch as first step to help the reviewers, keep unrelated changes out of the main branch, and to avoid CI failures if the change gets reverted.

Design

Does this code make sense in the context of the system? Is it backed by an RFC? If so, is it conforming to the decisions signed-off there?

It's the job of the developers to point reviewers to all the useful resources that help understand the design and how to test it, if applicable.

Requirements

Is the branch following its description and intentions? If there's a ticket related to it you should review it to understand the full context of what's proposed. Be as thorough as you can understanding the objective to provide useful input in places where you think the change may have diverged from its focus.

For some changes (e.g. UI) a screenshot or small video may be useful to give a fuller picture. Developers should provide them as they see fit.

Naming

Is the naming of things descriptive enough?

Function names, variable names, and filenames should be descriptive and contextual (e.g. packages_to_install can be just packages once control passes to def install(packages));

Generic names are OK in small blocks of code, but "ids = set()" quickly becomes meaningless in large blocks of code.

The bigger the block of code, the more descriptive the name should be. (But that code should probably be broken up at that point anyway.)

Eschew abbreviations in large blocks of code but do use them with care or where encouraged by language best practices. In particular, be consistent. Also don't go overboard with extensively long names.

Testing

Tests are as important as the production code, so they must follow the same rules and be maintainable.

Important things:

  • The code should be easily testable.
  • Discourage use of mocking. Fakes and stubs with a specific interface are fine. Patching and mocks with an arbitrary interface are likely the result of badly designed code.
  • Arguments are friends, use them to reduce dependency on global state.
  • Interfaces are friends, you can replace them with dummy implementations at test time.
  • Double check the code is not flaky.
  • Ensure that tests are covering all the proposed changes.
  • Ensure tests focus on a specific behaviour so they don’t test too much in one go and don’t fail for unexpected reasons.
  • Ensure tests are solid, i.e. will start breaking if the logic changes.
  • Ensure tests are not only exercising the happy paths but most importantly the failure modes. Errors are as important as happy paths.
  • Test should be simple to understand. Discourage use of global fixtures, inheritance, etc.

Documentation

Should the change include documentation changes (e.g. the README, project docs) as well? Are the comments and docstrings surrounding the affected code updated accordingly? Should the change include or delete docs or comments?

If docs are not synced accordingly, ask for it.

Take algorithmic complexity & time/space performance considerations into your review

Be pragmatic, but consider the following points. Depending on how frequently code is used, who & what it affects, etc, non-performant code may be good enough (until it isn’t, and then it should be optimized).

  • What are the memory bounds for the uses of this functionality? Is the memory usage increasing without limits, could a big request consume all the workers available resources?
  • Can the queries be written in an optimized way? Are they stable enough? Are there tests ensuring this?
  • Is the data modeling, e.g. Django models/managers, helping with queries or are we forcing it?
  • Time and space complexity/consistency, with tests that explain the characteristics where possible: RAM overhead, Poor query performance, N+1 queries, O(N) algorithms.

Security

Automated CI checks can only spot common issues. Watch out for the following during code review:

  • Missing authentication/authorization
  • Missing data validation
  • Incorrect use of cryptography
  • Poor error handling
  • Logging & metrics to support auditing and analysis
  • Logging of sensitive and/or identifying data
  • Hard-coded passphrases, keys, etc.
  • Poor passphrase/key management (storage, lifetime, rotation, etc)
  • etc

If in doubt, especially around cryptography, seek expert advice!

See also:

Approval with minor comments

It's OK to give an approval when minor things are still pending of fixing (e.g. typos). Consider prefixing your comment with "nit", "typo", "optional", etc to make your expectations clear.

Another review is necessary if anything non-trivial changes after approval.

Workflow of a Pull Request for developers

  1. Review it yourself. Ensure the PR's description is useful and accurate. Consider adding comments to help reviewers.
  2. Make sure the PR's title contains the corresponding Jira ticket reference, following the format "[XXXX-0000] PR title".
  3. A developer selects 1-n reviewers, using their best judgement on who could provide useful input.
  4. The review must include code-owners. Check the project's ./CODEOWNERS file for details.
  5. Developers need at least one code-owner approval for the PR to be considered for merge. Check the project's ./CONTRIBUTING.md file for additional requirements. For non-purely technical changes, extra bureaucracy may be necessary even after approval before merging.
  6. Be polite and try to address all reviewer comments even after you got some approvals.

Alternatives

  • Keep it as is and get disparate review criteria in PRs.
  • Use some other criteria for code reviews.

Caveats

None, as long as we agree on these as sensible rules to follow to ensure code health.

Operation

All parties involved need to have access to Bitbucket and Github.

Security Impact

We should get better output from our reviews and potentially increase the quality of code also in terms of security.

Performance Impact

This directly impacts developer performance for the better, see notes in the next section.

Developer Impact

  • Quality of code will generally improve.
  • Developer Performance will improve as the quality of the reviews increases and the amount of re-work decreases.
  • Newcomers will know the expectations when submitting or reviewing changes.

Data Consumer Impact

N/A

Deployment

Once agreed we must communicate the new flow to all parties and we can start going this way.

Dependencies

None

References