An update on Replicator

This is my first post on Replicator, so I'll start by describing the terminology we use, borrowing some analogies from other replication systems.

Replicator is an asynchronous master-to-multiple-slaves replication system. It works by propagating binary changes from a single read-write node (the master) to one or more read-only nodes (the slaves) through an intermediary process (the forwarder). Data changes are stored in binary transaction files, one file per transaction. Each file also contains a list of the replicated tables its data belongs to, and every slave has its own set of tables to replicate. Finally, transaction files are addressed through a special data structure called the replication queue.

After connecting to the forwarder for the first time, each slave performs an initial sync (a full dump) by requesting a complete, up-to-date snapshot of the replicated tables. The forwarder doesn't necessarily pass this request on to the master. Instead, it checks the queue for past full dumps and reuses one if appropriate.
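The dump-reuse idea can be illustrated with a minimal Python sketch. It is not Replicator's actual code: the queue is modelled as a plain list of dicts, the names `find_reusable_dump`, `initial_sync` and `request_dump_from_master` are hypothetical, and "appropriate" is assumed to mean simply "the most recent full dump still present in the queue".

```python
def find_reusable_dump(queue):
    """Return the index of the most recent full dump in the queue, or None."""
    for i in range(len(queue) - 1, -1, -1):
        if queue[i]["is_full_dump"]:
            return i
    return None


def initial_sync(queue, request_dump_from_master):
    """Return the transactions a newly connected slave should replay.

    Assumption: a slave restores the dump and then applies every
    transaction queued after it.
    """
    idx = find_reusable_dump(queue)
    if idx is None:
        # No past dump to reuse, so ask the master for a fresh one.
        queue.append(request_dump_from_master())
        idx = len(queue) - 1
    return queue[idx:]
```

With this model, a second slave connecting shortly after the first one is served from the queue and the master is not asked to produce another snapshot.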

To reduce bandwidth and disk space consumption on each slave, the forwarder compares the set of tables the slave replicates with the set of tables in the current transaction, and sends the transaction to the slave only when the two sets intersect. Thus, each slave receives data only for the tables it replicates. Until now, there was an important exception to this rule: full dump transactions were always sent to every slave. The original justification was that a full dump was required whenever a new replicated table was added, and every slave had to be aware of the addition.
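The filtering rule described above boils down to a set-intersection test. Here is a minimal Python sketch of it, including the old full-dump exception. The `Transaction` class and the `should_send` function are hypothetical names used for illustration only, not Replicator's real data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    txid: int
    tables: frozenset  # names of the replicated tables this transaction touches
    is_full_dump: bool = False


def should_send(txn, slave_tables):
    """Decide whether the forwarder sends txn to a slave replicating slave_tables."""
    if txn.is_full_dump:
        # The historical exception: full dumps went to every slave.
        return True
    # Send only when the two table sets have at least one table in common.
    return not txn.tables.isdisjoint(slave_tables)
```

For example, a transaction touching tables `a` and `b` is sent to a slave replicating `b` and `c`, but not to a slave replicating only `c`.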

In 1.8 we introduced a new feature called 'per-table dumps', which allows a slave to request a snapshot of a single table instead of a full dump. Now, when a new table is added to replication, only a single per-table dump is requested, and there's no need for a full dump. This made it possible for a slave to 'skip' a full dump, and the changes committed last week implement exactly this: if the slave is in sync (i.e. it is neither waiting for a dump nor recovering from an error), the forwarder simply skips sending full dump transactions to it, thereby avoiding the most bandwidth-consuming transactions. It's a clear win!
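As a rough illustration of the new behaviour, here is a small Python sketch of the forwarder's decision once in-sync slaves may skip full dumps. The state names, the `Txn` class and `should_send_txn` are assumptions made for this example and are not taken from the Replicator source.

```python
from dataclasses import dataclass
from enum import Enum


class SlaveState(Enum):
    IN_SYNC = "in_sync"
    WAITING_FOR_DUMP = "waiting_for_dump"
    RECOVERING = "recovering"


@dataclass(frozen=True)
class Txn:
    tables: frozenset  # replicated tables touched by this transaction
    is_full_dump: bool = False


def should_send_txn(txn, slave_tables, state):
    """Return True if the forwarder should send txn to this slave."""
    if txn.is_full_dump:
        # An in-sync slave has nothing to gain from a full dump, so skip it.
        return state is not SlaveState.IN_SYNC
    # Ordinary transactions still follow the table-intersection rule.
    return not txn.tables.isdisjoint(slave_tables)
```

A slave that is waiting for a dump or recovering from an error still receives the full dump, so only the redundant transfers are avoided.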

There is also a related positive change. When a slave restores a full dump transaction, it replaces the data of each replicated table with the data from the dump, leaving the table inconsistent for some period of time (usually short, but dependent on table size and other factors). By reducing the average number of dumps per slave, we also reduced the number of these 'inconsistency gaps'. Double win!

The next version, 1.9, is still in development. We have put it (along with our other open-source projects) on GitHub, so you are welcome to check it out and join the replicator mailing list.

Stay tuned for further updates!
