How does AWS DMS use memory for migration?

To Nha Notes | Oct. 19, 2022, 10:30 a.m.

An AWS DMS replication instance uses memory to run the replication engine. During the full load phase, the engine runs SELECT statements against the source engine; during the change data capture (CDC) phase, it reads from the source engine's transaction log. The records are migrated to the target and then, as part of the validation process, compared against the corresponding records on the target database. This is the generic migration flow in AWS DMS.
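
Because all three phases draw on the same instance memory, it helps to watch the replication instance's memory metrics while a task runs. Below is a minimal monitoring sketch in Python with boto3; the region and the instance identifier my-dms-instance are placeholders, and it uses the AWS/DMS CloudWatch metrics FreeableMemory and SwapUsage.

import boto3
from datetime import datetime, timedelta, timezone

# Replication-instance metrics are published under the AWS/DMS namespace.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

def recent_average(metric_name: str, instance_id: str) -> float:
    """Average of a DMS replication-instance metric over the last hour (bytes)."""
    now = datetime.now(timezone.utc)
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/DMS",
        MetricName=metric_name,
        Dimensions=[{"Name": "ReplicationInstanceIdentifier", "Value": instance_id}],
        StartTime=now - timedelta(hours=1),
        EndTime=now,
        Period=300,
        Statistics=["Average"],
    )
    points = stats["Datapoints"]
    return sum(p["Average"] for p in points) / len(points) if points else 0.0

# "my-dms-instance" is a placeholder identifier.
for metric in ("FreeableMemory", "SwapUsage"):
    print(metric, recent_average(metric, "my-dms-instance"))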

Tasks with limited LOB settings

When you migrate data with an AWS DMS task that uses limited LOB settings, memory is allocated in advance based on the LobMaxSize of each LOB column. If you set this value too high, your task might fail with an out-of-memory (OOM) error, depending on the number of records that you're migrating and the CommitRate.

So, if you configure your task with high values, make sure that the AWS DMS replication instance has enough memory. The following example shows typical limited LOB settings; a sizing sketch follows it.

{
  "TargetMetadata": {
    "SupportLobs": true,
    "FullLobMode": false,
    "LobChunkSize": 0,
    "LimitedSizeLobMode": true,
    "LobMaxSize": 63,
    "InlineLobMaxSize": 0
  }
}
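
As a rough sizing check (an approximation, not an exact AWS formula), you can estimate the LOB memory pre-allocated per table as LobMaxSize × CommitRate × number of LOB columns, multiplied by the number of tables that load in parallel (MaxFullLoadSubTasks). A sketch with hypothetical example values:

# Rough estimate of memory pre-allocated for LOBs in limited LOB mode.
# All values are hypothetical; LobMaxSize is in KB, and CommitRate is the
# number of rows per commit batch during full load (the DMS default is 10000).
lob_max_size_kb = 63
commit_rate = 10000
lob_columns = 2        # LOB columns in the table being loaded
concurrent_tables = 8  # MaxFullLoadSubTasks: tables loaded in parallel

per_table_mb = lob_max_size_kb * commit_rate * lob_columns / 1024
total_mb = per_table_mb * concurrent_tables
print(f"~{per_table_mb:.0f} MB per table, ~{total_mb:.0f} MB across {concurrent_tables} tables")

Even the modest values above add up to gigabytes of pre-allocated memory, which is why high LobMaxSize settings can trigger OOM errors.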
Tasks with ValidationEnabled

"ValidationSettings": { "EnableValidation": true, "ThreadCount": 5, "PartitionSize": 10000, "ValidationOnly": false, "SkipLobColumns": false, },

Tasks with parallel threads in full load and CDC phases

{ "TargetMetadata": { "ParallelLoadThreads": 0, "ParallelLoadBufferSize": 0, "ParallelLoadQueuesPerThread": 0, "ParallelApplyThreads": 0, "ParallelApplyBufferSize": 0, "ParallelApplyQueuesPerThread": 0 },

Tasks with batch apply settings

{ "TargetMetadata": { "BatchApplyEnabled": false, }, }, "ChangeProcessingTuning": { "BatchApplyPreserveTransaction": true, "BatchApplyTimeoutMin": 1, "BatchApplyTimeoutMax": 30, "BatchApplyMemoryLimit": 500, "BatchSplitSize": 0, },

Other memory-related task settings
  • During CDC, MinTransactionSize determines the minimum number of changes to include in each transaction. The total memory that transactions can consume on the replication instance is controlled by MemoryLimitTotal (in MB). Use this setting when you run multiple CDC tasks that need a lot of memory, and apportion it based on each task's transactional workload (see the sketch after the following example).
  • Set MemoryKeepTime (in seconds) to limit the memory that is consumed by long-running transactions on the source before they're written to disk. Or, if large batches of INSERT or UPDATE statements run on the source, then increase this time to retain the changes in the net changes table for batch processing.
  • Set StatementCacheSize to control the number of prepared statements that are stored on the replication instance.
  • If your AWS DMS replication instance has a large amount of free memory, then tune the settings as shown in the following example. This way, AWS DMS handles the workload in memory itself, rather than flushing frequently to AWS DMS storage.

"ChangeProcessingTuning": {
    "MinTransactionSize": 1000,
    "CommitTimeout": 1,
    "MemoryLimitTotal": 1024,
    "MemoryKeepTime":
  60,
    "StatementCacheSize": 50
  },
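
When several CDC tasks share one replication instance, one way to apportion MemoryLimitTotal, as the first bullet above suggests, is to weight each task's share of instance memory by its transactional workload. A minimal sketch; the task names, weights, and the headroom reserved for the engine are all hypothetical:

# Split available instance memory across CDC tasks in proportion to
# each task's share of the transactional workload.
instance_memory_mb = 16 * 1024        # e.g. a 16 GiB replication instance
reserved_for_engine_mb = 4 * 1024     # headroom for the engine and full load

workload_weights = {"orders-task": 3, "audit-task": 1}  # hypothetical tasks

available = instance_memory_mb - reserved_for_engine_mb
total_weight = sum(workload_weights.values())
for task, weight in workload_weights.items():
    limit = available * weight // total_weight
    print(f"{task}: MemoryLimitTotal = {limit} MB")
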
References

https://aws.amazon.com/premiumsupport/knowledge-center/dms-memory-optimization/

https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Tasks.CustomizingTasks.TaskSettings.ChangeProcessingTuning.html