Django 1.11 upgrade
This document explains the plan and motivation for upgrading Django to version 1.11, the last Django release that supports Python 2.
Problem Description
Our Django version is really old (released September 2nd, 2014). This not only pins our dependencies to old, insecure versions, but also forces us to hit and work around bugs while preventing us from using newer framework features that would help our platform scale.
Going to Django 2.x or even 3.x is more convoluted because it would require us to upgrade to Python 3, which is not an easy feat with the current state of the codebase. We'll lay the groundwork in this upgrade so we can plan for that in the future.
Note that Django 1.11 is itself now unsupported, but support was only dropped in April 2020, so it's still a well-maintained version even if not perfect.
Background
Our current Django version is starting to show its cracks. Here's a non-exhaustive list of limitations hitting us today:
- Tradelog optimizations are not available because of ORM limitations
- Manually handled partial indexes
- Manually backported code to mimic Django upgraded migrations logic
- Non-deterministic crash in django.db.models.sql.Query.combine()
- Workaround because of missing aggregation functions like Concat
- Bulk creation does not set PKs on created objects
- Choice from a callable is not available
- prefetch_related_objects is not available
Moreover, there is a lot of good stuff we could start using after the upgrade:
- Full text search for Postgres: Useful in our quest to get rid of ElasticSearch. We have backported some of this in core/lookups.py but none of the fancy features.
- Bulk inserts return IDs: We have a few places where we do bulk inserts and then re-load the objects manually.
- Subquery() expressions: Useful in a lot of places but particularly for re-introducing tradelog and any brokerdeal related query optimizations.
- Performing actions after a transaction commit: We can drop the django-transaction-hooks external dependency.
- Enhanced date querying: for any reporting requiring date handling and output.
- Password validators: To enforce the security and robustness of our passwords globally.
- Query Expressions, Conditional Expressions, and Database Functions: Allowing us to access database functions, conditionals, references and arithmetic.
- Support for Postgres-specific features: Allowing us to use fine-grained features like specialized types for data.
- Template-based widget rendering: Allows fine-grained customization of forms and widgets.
- Model level indexes: Allows defining fine-tuned indexes on our models for queries that span several fields.
Solution
We'll upgrade from Django 1.7 to 1.11 directly, without intermediate versions if possible. To do this we'll set the Django version to 1.11 and start resolving library conflicts from there. The reason is that our test coverage is not as good as we'd like, so to keep manual testing to a minimum we'd rather not repeat it for every minor version in between.
To help us cover more views during the upgrade we may set up instant-coverage to verify that all views at least work at a basic level.
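Once the Django pin moves to 1.11, exact version pins elsewhere in our requirements are the most likely source of library conflicts. As a minimal sketch, assuming our dependencies live in a pip-style requirements file, the following lists every exactly pinned package so each can be reviewed. The helper name `find_exact_pins` and the sample contents are hypothetical, and the pattern only handles simple `name==version` lines.

```python
import re

# Matches simple "name==version" pins; extras, markers and URLs are ignored.
PIN_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^\s;#]+)")


def find_exact_pins(requirements_text):
    """Return (name, version) pairs for every exactly pinned requirement."""
    pins = []
    for line in requirements_text.splitlines():
        match = PIN_RE.match(line)
        if match:
            pins.append((match.group(1), match.group(2)))
    return pins


# Hypothetical requirements content, for illustration only.
SAMPLE = """\
Django==1.7
django-transaction-hooks==0.2  # hypothetical pin
requests>=2.0
# a comment
"""

for name, version in find_exact_pins(SAMPLE):
    print("%s is pinned to %s" % (name, version))
```

The sketch avoids Python 3 only syntax so it can run on our current interpreter. It only finds candidates for review; deciding which version of each library supports Django 1.11 remains a manual step.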
Tentative upgrade plan
- Squash migrations in all apps:
  - Run the squashmigrations command for all apps.
  - Push and deploy the code so that the squashed migrations apply while the old ones still exist. This is the recommended migration path for squashing.
  - Delete the old migrations, leaving just the squashed versions around.
  - Redeploy.
  All this work will happen on the mainline branch. New migrations may be created once we upgrade and we may have to squash again, but the field will look way cleaner than it does today.
- Upgrade to Django 1.11 and adjust all dependencies in a feature branch. For this we'll rely on pip's version checking, amending hard-coded dependencies manually until everything resolves.
- Assess backwards-incompatible changes and see what we can backport into our 1.7 codebase before the upgrade:
  - The release notes for each version list the deprecated and removed features, many of which can be addressed ahead of time with forward-compatible approaches. The middleware classes are one example.
  - For forced major-version library upgrades we'll have to check the APIs and documentation manually. Unfortunately there are no shortcuts for these.
- We somewhat trust our test suite to exercise a good chunk of our codebase. We may add more tests where they look needed; coverage reports help with this.
- Augment our test coverage as needed; instant-coverage may be useful here.
- Apply fixes repeatedly, merging backwards-compatible changes to mainline and re-squashing migrations as necessary.
- Deploy on staging and start manual testing.
- Merge to mainline, deploy, cross fingers.
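The forward-compatible middleware approach mentioned in the plan can be sketched as follows. This is a minimal illustration, not our actual middleware: the class name `RequestIdMiddleware`, the header, and the stand-in request and view are all hypothetical. The idea is a class that still works when Django instantiates it without arguments (the old MIDDLEWARE_CLASSES setting) and also when Django passes `get_response` and calls the instance (the MIDDLEWARE setting introduced in Django 1.10).

```python
class RequestIdMiddleware(object):
    """Hypothetical middleware usable under both middleware settings.

    Old style (MIDDLEWARE_CLASSES): Django instantiates the class with no
    arguments and calls process_request / process_response itself.
    New style (MIDDLEWARE, Django 1.10+): Django passes get_response and
    calls the instance with the request.
    """

    def __init__(self, get_response=None):
        self.get_response = get_response

    def __call__(self, request):
        # New-style entry point: run the old-style hooks around the view.
        response = self.process_request(request)
        if response is None:
            response = self.get_response(request)
        return self.process_response(request, response)

    def process_request(self, request):
        request.request_id = "req-1"  # placeholder value for the sketch
        return None  # returning None lets request processing continue

    def process_response(self, request, response):
        response["X-Request-Id"] = request.request_id
        return response


# Stand-ins so the sketch runs without Django: a bare object plays the
# request and a dict plays the response (headers are set by item assignment).
class FakeRequest(object):
    pass


def fake_view(request):
    return {}


response = RequestIdMiddleware(fake_view)(FakeRequest())
print(response["X-Request-Id"])
```

From Django 1.10 onwards, django.utils.deprecation.MiddlewareMixin provides this same bridging behaviour, so a hand-written `__call__` like the one above would only be needed while we are still on 1.7.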
Tentative post-upgrade tasks to cash in on the upgrade
- Re-introduce tradelog/brokerdeal optimizations with a subquery.
- Drop all custom migration logic (will ease Django 2.x upgrade).
- Drop django-transaction-hooks (will ease Django 2.x upgrade).
- Introduce pipeline checks to enforce Python 3 compatibility in our new code (groundwork for the Python 3 migration).
- Simplify bulk creation now that IDs are returned as part of the operation.
- Drop ElasticSearch usage in favor of the full-text search capabilities now available in the ORM.
- Introduce a GenericRelation queryset decorator that allows pre-fetching non-homogeneous results.
- Use of query functions like Concat where applicable.
- Enable password validation rules to enforce increased security in our user credentials.
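One possible shape for the Python 3 pipeline check listed above is to parse new code with a Python 3 interpreter and fail on syntax errors. The sketch below assumes the check itself runs under Python 3; the function name `python3_syntax_errors` and the sample sources are hypothetical. It only detects syntax-level problems, not runtime incompatibilities such as dict.iteritems() or str/bytes handling.

```python
import ast


def python3_syntax_errors(sources):
    """Map each file name to its SyntaxError message when parsed by Python 3.

    `sources` is a dict of {file name: source text}.
    """
    errors = {}
    for name, text in sources.items():
        try:
            ast.parse(text, filename=name)
        except SyntaxError as exc:
            errors[name] = "line %s: %s" % (exc.lineno, exc.msg)
    return errors


# Hypothetical file contents, for illustration only.
SAMPLES = {
    "legacy.py": "print 'hello'\n",
    "modern.py": "from __future__ import print_function\nprint('hello')\n",
}

failures = python3_syntax_errors(SAMPLES)
for name in sorted(failures):
    print("%s is not Python 3 compatible (%s)" % (name, failures[name]))
```

In a real pipeline the sources would come from the files changed in a branch, and a non-empty result would fail the build.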
Caveats
It's a sensitive process that requires a lot of effort and care. Fortunately, this is a point-release upgrade where most of the important internals have settled. We expect some friction, but not as much as with prior upgrades from versions older than 1.5.
Operation
We'll need to deploy a feature branch to staging for manual testing for a while before merging. The capabilities for this are already in place, so it should not be a problem. Everything else regarding release should remain unchanged.
Security Impact
General security will improve as our libraries get more up to date. Password validation rules have also improved.
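As an illustration of the password validation rules, Django 1.9 and later read validators from the AUTH_PASSWORD_VALIDATORS setting. The fragment below is a sketch of what we might add to our settings module. The four validators are the ones Django ships with, while the minimum length of 12 is an assumption for the example, not a decided policy.

```python
# Sketch of a settings fragment; the min_length value is an assumption.
AUTH_PASSWORD_VALIDATORS = [
    {
        # Rejects passwords too similar to the user's own attributes.
        "NAME": "django.contrib.auth.password_validation.UserAttributeSimilarityValidator",
    },
    {
        "NAME": "django.contrib.auth.password_validation.MinimumLengthValidator",
        "OPTIONS": {"min_length": 12},
    },
    {
        # Rejects passwords found in Django's bundled list of common passwords.
        "NAME": "django.contrib.auth.password_validation.CommonPasswordValidator",
    },
    {
        # Rejects passwords made up entirely of digits.
        "NAME": "django.contrib.auth.password_validation.NumericPasswordValidator",
    },
]
```

With this setting in place, Django's built-in password forms and the createsuperuser and changepassword commands apply the validators. Custom code that sets passwords would need to call validate_password() from django.contrib.auth.password_validation itself.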
Performance Impact
General performance of the framework should improve as the ORM starts to expose database engine specific features to optimize queries. At the same time the new features allow us to be more efficient at performing common tasks.
Developer Impact
Developer experience will improve since:
- Official documentation is available online on the official site.
- Added features in the ORM solve problems that were impossible/hard to solve.
- Most questions, references and reports target newer Django version APIs, so more online help is available.
- Internal ORM APIs are pretty close to Django 2.x, meaning the migration or backporting of features should be possible.
- More libraries and tooling from the Django ecosystem are available, since Django 1.11 support continued until recently.
Deployment
No special instructions are required for deployment. We'll use the standard pipeline to push this upgrade forward once we have ensured its quality and stability.
Dependencies
We'll need to coordinate a bit with other teams and make them aware of incompatible changes that may be introduced while we are working on this. Having as few new migrations as possible would also help.