Solution to execute DB migration into multi-database using db-router of Django Framework.

To Nha Notes | Feb. 2, 2021, 5:21 p.m.

1. How to implement to use multi-DB

Suppose, our project is implemented by Django framework. Then, we want to migrate a few large tables from current DB to new one with alias named shard1.

Create an app for routing migrations specific for new DB.
python manage.py startapp shard1

Remove non-use files in created app:

rm -rf views.py models.py admin.py
Django settings
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": os.environ.get("RDS_DB_NAME"),
        ...
    },
    "shard1": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "shard1", 
        ...
    }        
}
DATABASE_ROUTERS = ["<project_name>.db_routing.Shard1Router"]
Implement routing file <project_name>.db_routing.py

SHARD1_DB_ALIAS = "shard1"
MASTER_MODEL_NAMES = {
    "common_adminuser",
    "analytics_segmenttag",
    "analytics_segmenttaggroup",
    "analytics_transitionpoint",
}
SHARD1_ROUTE_MODEL_NAMES = {
    "rtchatmessage",
    "rtchatmessagelog",
    "rtchatstatus",
    "endusersegmenttag",
    "enduseractivitytag",
    "endusertransitionpointtag",
}


class Shard1Router:
    """
    A router to control all database operations on models migrated to Shard1.
    """

    route_model_names = SHARD1_ROUTE_MODEL_NAMES
    migrate_model_names = list(MASTER_MODEL_NAMES) + list(SHARD1_ROUTE_MODEL_NAMES)

    def db_for_read(self, model, **hints):
        print(model._meta.model_name)
        if model._meta.model_name in self.route_model_names:
            return SHARD1_DB_ALIAS
        return None

    def db_for_write(self, model, **hints):
        if model._meta.model_name in self.route_model_names:
            return SHARD1_DB_ALIAS
        return None

    def allow_relation(self, obj1, obj2, **hints):
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label == SHARD1_DB_ALIAS:
            return db == SHARD1_DB_ALIAS
        return True

The MASTER_MODEL_NAMES are tables which we want to keep in both old and new DB. And The SHARD1_ROUTE_MODEL_NAMES are list of tables specific for new DB. In some case, the query need tables of both list. This is reason that we need to keep MASTER_MODEL_NAMES in both DBs.

Because the limitation of Django router, we also need to bring below tables into new DB once.

django_migrations
django_content_type
django_site
auth_group
auth_permission
auth_group_permissions

In case, the model changes is related to tables in SHARD1_ROUTE_MODEL_NAMES or MASTER_MODEL_NAMES, we need to follow below instruction to make migrations.

To make migrations for old DB, we do make migrations normally as below:

python manage.py makemigrations

Say, output is

Migrations for 'myapp':
  myapp/migrations/0002_add_column_name.py

To make migration specifics for new DB.

python manage.py makemigrations shard1 --name=add_column_name --empty

Say, Its output is

Migrations for 'shard1':
  shard1/migrations/0002_add_column_name.py

Pickup raw SQL of above generated file 0002_add_column_name.py via below command.

python manage.py sqlmigrate myapp 0002

Say, Its output is

BEGIN;
--
-- Add field name to mytable
--
ALTER TABLE `myapp_mytable` ADD COLUMN `name` varchar(100) DEFAULT NULL;
COMMIT;

Customize created migration file shard1/migrations/0002_add_column_name.py by above raw SQL.

from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
        ('shard1', '0001_initial'),
    ]

    operations = [
        migrations.RunSQL(
            """
            ALTER TABLE `myapp_mytable` ADD COLUMN `name` varchar(100) DEFAULT NULL;
            """
        ),
    ]

Note: If the changes is related tables in both old and new DB, we should separate it to separate migration file before following above migration instruction.

Execute migration commands on deployment:
python manage.py migrate
python manage.py migrate shard1 --database=shard1