Collapsing Django's Migrations
Long-running Django projects can accumulate a lot of migrations. After just a few years, an actively developed project can create thousands of them! This can put a serious dent in your test running, because (a) Django runs the migrations at test time to set up your database, and (b) you can't test your migrations unless you're happy to have 30-minute CI runs!
Migrations can also often be a painful source of technical debt, since they sometimes import libraries that you don't use anymore, but can't remove because someone, some day will try to run manage.py migrate from scratch only to have it blow up looking for a dependency you don't actually use anymore.
So, looking down the barrel of a performance, tech debt, and stability headache, it's a good idea to pay some attention to your migrations from time to time.
Option 1: squashmigrations
This is the official advice. You run this command and Django will effectively squash all of your existing migrations into one Great Big Migration, so where before you had:
0001_initial.py
0002_something.py
...
0132_something_else.py
you now have:
0001_squashed_0132_something_else.py
This is pretty slick, because it doesn't actually need any database changes. You're just merging the administrative overhead of 132 files into one and clawing back some of the performance you lost with having so many files.
There's not much more happening here though. Any old migrations you might have that depended on old_module_you_dont_use_anymore are still in that Great Big File, including the import, and the compute overhead of processing that migration doesn't really go away (though there are optimisations that Django says can sometimes cause problems). There's also the risk of a CircularDependencyError, which is no fun to fix.
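If you want to see how many migrations are still holding on to an old library, something like the following can help. It's only a rough sketch: the function name is made up for illustration, it assumes your apps sit one level below the project root (app/migrations/*.py), and it only parses the files rather than importing them, so the old library doesn't need to be installed.

```python
import ast
from pathlib import Path


def migrations_importing(project_root, module_name):
    """List migration files that import module_name or one of its submodules.

    Hypothetical helper: assumes the layout <project_root>/<app>/migrations/*.py.
    """
    hits = []
    for path in sorted(Path(project_root).glob("*/migrations/*.py")):
        # Parse the source only; nothing in the migration is executed.
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.level == 0:
                # Absolute "from x import y"; relative imports are skipped.
                names = [node.module or ""]
            else:
                continue
            if any(n == module_name or n.startswith(module_name + ".") for n in names):
                hits.append(path)
                break  # one hit per file is enough
    return hits
```

Calling migrations_importing(".", "old_module_you_dont_use_anymore") from the project root returns the offending files, whether or not they've been squashed.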
Personally, I find this process high-risk, high-effort, low-reward, so I chose a more drastic, simpler path.
Option 2: Collapse migrations
There's nothing magic or automated about this process. It's very manual, but it's also not terribly complicated.
1. Prepare
Make sure your production environment is up-to-date, and freeze any concurrent development that may involve migrations. Theoretically, you can still continue to deploy changes to production while this is happening, but I wouldn't recommend it.
If you have testing and/or staging environments, do the same there too.
On your local machine, switch your environment to master and pull any updates so you definitely have the very same code that's in production. Start up your environment, and if you've got a snapshot of production, you should use that now.
2. Local file changes
Delete all migrations, but not the __init__.py in each migrations folder:
rm */migrations/0*
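If you'd rather see what's about to disappear before it does, here's a small Python equivalent of that glob with a dry-run switch. Treat it as a sketch: the function names are my own, and it assumes the same layout the rm command does, with each app's migrations folder one level below the directory you run it from.

```python
from pathlib import Path


def find_numbered_migrations(project_root):
    """Return files matching */migrations/0*, which leaves __init__.py alone."""
    root = Path(project_root)
    # Same pattern as the shell glob: one level of app directories.
    return sorted(p for p in root.glob("*/migrations/0*") if p.is_file())


def delete_numbered_migrations(project_root, dry_run=True):
    """Delete the numbered migration files; with dry_run=True only list them."""
    targets = find_numbered_migrations(project_root)
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

Run delete_numbered_migrations(".") first to check the list, then again with dry_run=False once it looks right.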
Next, run manage.py makemigrations on your laptop. This will create a bunch of initial migrations, one for each app (though in some cases where there are foreign keys between apps, there may be two or three).
3. The scary part
The sticking point of all of this is that Django maintains a history of migrations in its django_migrations table, and step 2 above knocked our file structure out of sync with that table. You can't deploy anything until that sync is restored.
On your local environment, hop into your database and delete the entire recorded migration history:
DELETE FROM django_migrations;
Then hop out of your database and run:
$ python manage.py migrate --fake
This should re-populate your django_migrations table with the "new history". The thing to remember is that you're not actually changing anything here. All of these migrations have already been applied, so you're just rewriting history to throw out the intermediate steps.
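If wiping that table makes you nervous, you could keep a copy of the old history before running the DELETE. The sketch below does this against SQLite using Python's built-in sqlite3 module. That's an assumption for illustration only: the backup table name django_migrations_backup is made up, and on PostgreSQL or MySQL you'd do the equivalent with that database's own client.

```python
import sqlite3


def backup_and_clear_history(conn):
    """Copy django_migrations into a backup table, then empty the original.

    Returns the number of rows saved. The backup table name is hypothetical.
    """
    with conn:  # commits the DELETE when the block exits without an error
        conn.execute("DROP TABLE IF EXISTS django_migrations_backup")
        conn.execute(
            "CREATE TABLE django_migrations_backup AS "
            "SELECT * FROM django_migrations"
        )
        (saved,) = conn.execute(
            "SELECT COUNT(*) FROM django_migrations_backup"
        ).fetchone()
        conn.execute("DELETE FROM django_migrations")
    return saved
```

With a local SQLite database you'd call it as backup_and_clear_history(sqlite3.connect("db.sqlite3")), then run the migrate --fake step as above. Once you're happy with the new history, the backup table can be dropped.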
Now test that this all works. Shut down your environment, wipe your local database and spin it back up. Run your test suite and bask in the heroic speed improvement your efforts have won you. Try creating a new migration, running it, and rolling it back. When you're happy with the result, do the same on production.
And that's it! You can remove those old libraries you don't need anymore now, and add that migration testing you've been meaning to include in your CI. Future developers won't know to thank you for saving them the time it initially took to stand everything up, and everyone will get stuff done faster.