Background
We were running two AWS DMS tasks to replicate data from an RDS PostgreSQL source to a target database. Both tasks were set up almost identically: same replication instance, same source endpoint, same Full Load + CDC migration type.
One task was humming along perfectly. The other was completely stuck in "Before Load" state. No data moving. No load happening. Just sitting there doing nothing.
The First Error
The DMS console showed this error on the failing task:
"Could not assign a postgres plugin to use for replication"
Vague. Not very helpful. We did what most people do - restarted the task, checked if the replication instance was healthy, tested the endpoint connections. Everything passed. The task still refused to move.
Digging Into CloudWatch Logs
The real breakthrough came when we enabled detailed logging and looked at CloudWatch. Filtering for ERROR lines, we found this buried in the SOURCE_CAPTURE logs:
9999-09-99T10:49:25 [SOURCE_CAPTURE] ERROR: pglogical is not in shared_preload_libraries
9999-09-99T10:49:25 [SOURCE_CAPTURE] ERROR: relation "pglogical.replication_set" does not exist
Now we had something to work with.
Understanding the Root Cause
AWS DMS needs a logical decoding plugin on the PostgreSQL source to capture changes for CDC. There are two options available:
| Plugin | Type | RDS Support |
|---|---|---|
| test_decoding | Built-in to PostgreSQL | Works out of the box |
| pglogical | Third-party extension | Requires manual install + parameter change + reboot |
The failing task's source endpoint had pluginName=pglogical set in its extra connection attributes — most likely carried over from a copied endpoint config at some point. Since pglogical was never actually installed or loaded on the RDS instance, the task couldn't initialize its replication slot and got stuck before even starting the load phase.
The working task was using test_decoding — which is why it had no issues.
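If you want to confirm up front that test_decoding actually works on the source before touching the endpoint config, you can create and immediately drop a throwaway logical slot. This is a sanity-check sketch (the slot name dms_plugin_check is an arbitrary example, and it assumes logical replication is already enabled on the instance, i.e. rds.logical_replication = 1):

```sql
-- Creating the slot fails immediately if the plugin cannot be loaded,
-- which is exactly the failure mode the stuck task was hitting.
SELECT pg_create_logical_replication_slot('dms_plugin_check', 'test_decoding');

-- Drop the throwaway slot right away so it does not retain WAL.
SELECT pg_drop_replication_slot('dms_plugin_check');
```

Running the same check with 'pglogical' as the second argument would have surfaced the misconfiguration long before the DMS task got stuck.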
The Fix — Step by Step
Step 1 — Update the DMS Source Endpoint
Go to DMS → Endpoints → Source Endpoint → Modify
In the Extra connection attributes field, set:
pluginName=test_decoding;heartbeatEnable=true;heartbeatFrequency=5;
The heartbeat is important for Full Load + CDC tasks — it keeps the replication slot alive during the full load phase so CDC can kick in seamlessly once load completes.
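If you want to see the heartbeat doing its job during the full load phase, you can watch how much WAL each slot is holding back. A monitoring sketch, assuming PostgreSQL 10 or later:

```sql
-- How far each slot's restart point lags behind the current WAL position.
-- With the heartbeat enabled this figure should advance steadily instead
-- of growing unbounded for the duration of the full load.
SELECT slot_name,
       active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
```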
Step 2 — Drop Stale Replication Slots
A failed task typically leaves an orphaned replication slot behind on the source database, and that stale slot will block the new task from initializing. Connect to your RDS instance and check:
SELECT slot_name, plugin, active, restart_lsn FROM pg_replication_slots;
Drop any inactive DMS slots:
SELECT pg_drop_replication_slot('slot_name_here');
Do not skip this step — stale slots are a silent blocker that causes tasks to hang even after everything else is fixed.
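If several stale slots have accumulated, they can be dropped in one pass. Treat this as a sketch and review the output of the previous query first: the filter below removes ALL inactive slots on the instance, not just the ones DMS created.

```sql
-- Drop every currently inactive replication slot in one statement.
-- Any slot a running consumer still needs will show active = true
-- and is left untouched by this filter.
SELECT slot_name, pg_drop_replication_slot(slot_name)
FROM pg_replication_slots
WHERE NOT active;
```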
Step 3 — Remove pglogical from RDS
Since we are no longer using pglogical, clean it up properly.
Drop the extension if it was created:
DROP EXTENSION IF EXISTS pglogical CASCADE;
Then go to RDS → Parameter Groups → Your Parameter Group and remove pglogical from shared_preload_libraries. Save the changes and reboot the RDS instance.
Verify it is gone after the reboot:
SHOW shared_preload_libraries;
Step 4 — Restart the DMS Task Clean
- Stop the task — wait until status shows Stopped
- Start the task — select "Start processing from the beginning"
- Watch CloudWatch logs for the first 2 minutes to confirm no errors
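Once the task is running again, a quick database-side confirmation is to check that the task has created an active slot bound to the correct plugin:

```sql
-- After the clean restart, the task's slot should exist, be active,
-- and report test_decoding as its plugin.
SELECT slot_name, plugin, active
FROM pg_replication_slots
WHERE plugin = 'test_decoding';
```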
Result
Task moved from Before Load → Full Load → CDC without any issues. Data replication was running within minutes of applying the fix.