<!-- doc/src/sgml/logical-replication.sgml -->
<chapter id="logical-replication">
<title>Logical Replication</title>

Logical replication is a method of replicating data objects and their
changes, based upon their replication identity (usually a primary key). We
use the term logical in contrast to physical replication, which uses exact
block addresses and byte-by-byte replication. PostgreSQL supports both
mechanisms concurrently; see <xref linkend="high-availability">. Logical
replication allows fine-grained control over both data replication and

Logical replication uses a <firstterm>publish</firstterm>
and <firstterm>subscribe</firstterm> model with one or
more <firstterm>subscribers</firstterm> subscribing to one or more
<firstterm>publications</firstterm> on a <firstterm>publisher</firstterm>
node. Subscribers pull data from the publications they subscribe to and may
subsequently re-publish data to allow cascading replication or more complex

Logical replication sends changes on the publisher to the subscriber as
they occur in real time. The subscriber applies the data in the same order
as the publisher so that transactional consistency is guaranteed for
publications within a single subscription. This method of data replication
is sometimes referred to as transactional replication.
The typical use cases for logical replication are:

Sending incremental changes in a single database or a subset of a
database to subscribers as they occur.

Firing triggers for individual changes as they arrive on the

Consolidating multiple databases into a single one (for example for

Replicating between different major versions of PostgreSQL.

Giving access to replicated data to different groups of users.

Sharing a subset of the database between multiple databases.

The subscriber database behaves in the same way as any other PostgreSQL
instance and can be used as a publisher for other databases by defining its
own publications. When the subscriber is treated as read-only by the
application, there will be no conflicts from a single subscription. On the
other hand, if other writes are made to the same set of tables, either by an
application or by other subscribers, conflicts can arise.
<sect1 id="logical-replication-publication">
<title>Publication</title>

A <firstterm>publication</firstterm> can be defined on any physical
replication master. The node where a publication is defined is referred to
as the <firstterm>publisher</firstterm>. A publication is a set of changes
generated from a table or a group of tables, and might also be described as
a change set or replication set. Each publication exists in only one database.

Publications are different from schemas and do not affect how the table is
accessed. Each table can be added to multiple publications if needed.
Publications may currently only contain tables. Objects must be added
explicitly, except when a publication is created for <literal>ALL

Publications can choose to limit the changes they produce to
any combination of <command>INSERT</command>, <command>UPDATE</command>, and
<command>DELETE</command>, similar to how triggers are fired by
particular event types. If a table without a <literal>REPLICA
IDENTITY</literal> is added to a publication that
replicates <command>UPDATE</command> or <command>DELETE</command>
operations, then subsequent <command>UPDATE</command>
or <command>DELETE</command> operations will fail on the publisher.
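
As a rough sketch of the rule above, a table that lacks a primary key can be
given a replica identity before it is published, or it can be published for
<command>INSERT</command> only. The names <literal>audit_log</literal> and
<literal>inserts_only</literal> are hypothetical, and the
<literal>WITH (publish = ...)</literal> spelling assumes the option syntax of
the PostgreSQL 10 release; see <xref linkend="sql-createpublication"> for the
syntax of your version.
<programlisting>
-- Hypothetical table with no primary key: use the whole row as the replica
-- identity so that UPDATE and DELETE can be published.
ALTER TABLE audit_log REPLICA IDENTITY FULL;

-- Alternatively, publish only INSERTs, which need no replica identity.
-- (Option syntax assumed from PostgreSQL 10; it may differ in other versions.)
CREATE PUBLICATION inserts_only FOR TABLE audit_log WITH (publish = 'insert');
</programlisting>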
Every publication can have multiple subscribers.

A publication is created using the <xref linkend="sql-createpublication">
command and may later be altered or dropped using the corresponding commands.

The individual tables can be added and removed dynamically using
<xref linkend="sql-alterpublication">. Both the <literal>ADD
TABLE</literal> and <literal>DROP TABLE</literal> operations are
transactional, so the table will start or stop replicating at the correct
snapshot once the transaction has committed.
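
For illustration, the following sketch changes the membership of a publication
inside one transaction. <literal>mypub</literal> and
<literal>departments</literal> are the names used in the Quick Setup section;
<literal>projects</literal> is a hypothetical additional table.
<programlisting>
BEGIN;
ALTER PUBLICATION mypub ADD TABLE projects;      -- hypothetical table
ALTER PUBLICATION mypub DROP TABLE departments;
COMMIT;  -- both changes take effect together once the transaction commits
</programlisting>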
<sect1 id="logical-replication-subscription">
<title>Subscription</title>

A <firstterm>subscription</firstterm> is the downstream side of logical
replication. The node where a subscription is defined is referred to as
the <firstterm>subscriber</firstterm>. A subscription defines the connection
to another database and the set of publications (one or more) to which it wants

The subscriber database behaves in the same way as any other PostgreSQL
instance and can be used as a publisher for other databases by defining its

A subscriber node may have multiple subscriptions if desired. It is
possible to define multiple subscriptions between a single
publisher-subscriber pair, in which case care must be taken to ensure
that the subscribed publication objects don't overlap.

Each subscription will receive changes via one replication slot (see
<xref linkend="streaming-replication-slots">).
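
One way to see the slots in use is to query the
<structname>pg_replication_slots</structname> view on the publisher. This is
only a sketch; it assumes that each subscription appears there as one logical
slot, and how the slot is named depends on how the subscription was created.
<programlisting>
-- Run on the publisher: list the logical replication slots and whether a
-- walsender is currently attached to each of them.
SELECT slot_name, plugin, slot_type, active
  FROM pg_replication_slots
 WHERE slot_type = 'logical';
</programlisting>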

Subscriptions are not dumped by <command>pg_dump</command> by default, but
this can be requested using the command-line
option <option>--include-subscriptions</option>.

The subscription is added using <xref linkend="sql-createsubscription"> and
can be stopped/resumed at any time using the
<xref linkend="sql-altersubscription"> command and removed using
<xref linkend="sql-dropsubscription">.
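
As a minimal sketch of that life cycle, assuming a subscription named
<literal>mysub</literal> as created in the Quick Setup section:
<programlisting>
ALTER SUBSCRIPTION mysub DISABLE;  -- stop applying changes
ALTER SUBSCRIPTION mysub ENABLE;   -- resume applying changes
DROP SUBSCRIPTION mysub;           -- remove the subscription altogether
</programlisting>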

When a subscription is dropped and recreated, the synchronization
information is lost. This means that the data has to be resynchronized

The schema definitions are not replicated, and the published tables must
exist on the subscriber. Only regular tables may be
the target of replication. For example, you can't replicate to a view.

The tables are matched between the publisher and the subscriber using the
fully qualified table name. Replication to differently-named tables on the
subscriber is not supported.

Columns of a table are also matched by name. A different order of columns
in the target table is allowed, but the column types have to match.
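
The matching rules can be illustrated with a pair of table definitions. The
column list shown here is hypothetical; only the table name
<literal>users</literal> comes from the Quick Setup section.
<programlisting>
-- On the publisher (hypothetical definition of the published table):
CREATE TABLE public.users (
    id    integer PRIMARY KEY,
    name  text,
    email text
);

-- On the subscriber: same schema-qualified name, same column names and
-- types, but a different column order, which is allowed.
CREATE TABLE public.users (
    email text,
    id    integer PRIMARY KEY,
    name  text
);
</programlisting>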
<sect1 id="logical-replication-conflicts">
<title>Conflicts</title>

Logical replication behaves similarly to normal DML operations in that
the data will be updated even if it was changed locally on the subscriber
node. If incoming data violates any constraints, the replication will
stop. This is referred to as a <firstterm>conflict</firstterm>. When
replicating <command>UPDATE</command> or <command>DELETE</command>
operations, missing data will not produce a conflict and such operations
will simply be skipped.

A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
the subscriber's server log.

The resolution can be done either by changing data on the subscriber so
that it does not conflict with the incoming change or by skipping the
transaction that conflicts with the existing data. The transaction can be
skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
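
Skipping a transaction might look roughly like the following sketch. The
origin name <literal>mysub</literal> and the position
<literal>0/16B3748</literal> are placeholders: take the real origin name from
the <structfield>external_id</structfield> column of the view, and determine
the position just past the conflicting transaction yourself, for example from
the server log. Disabling the subscription first is an assumption made here,
on the basis that an origin usually cannot be advanced while an apply worker
is using it. Note that the skipped transaction's changes are not applied at all.
<programlisting>
-- Run on the subscriber.
ALTER SUBSCRIPTION mysub DISABLE;

-- Inspect the replication origins and their current positions.
SELECT local_id, external_id, remote_lsn, local_lsn
  FROM pg_replication_origin_status;

-- Move the origin past the conflicting transaction (placeholder values).
SELECT pg_replication_origin_advance('mysub', '0/16B3748');

ALTER SUBSCRIPTION mysub ENABLE;
</programlisting>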
<sect1 id="logical-replication-architecture">
<title>Architecture</title>

Logical replication starts by copying a snapshot of the data on the
publisher database. Once that is done, changes on the publisher are sent
to the subscriber as they occur in real time. The subscriber applies data
in the order in which commits were made on the publisher so that
transactional consistency is guaranteed for the publications within any

Logical replication is built with an architecture similar to physical
streaming replication (see <xref linkend="streaming-replication">). It is
implemented by <quote>walsender</quote> and <quote>apply</quote>
processes. The walsender process starts logical decoding (described
in <xref linkend="logicaldecoding">) of the WAL and loads the standard
logical decoding plugin (pgoutput). The plugin transforms the changes read
from WAL to the logical replication protocol
(see <xref linkend="protocol-logical-replication">) and filters the data
according to the publication specification. The data is then continuously
transferred using the streaming replication protocol to the apply worker,
which maps the data to local tables and applies the individual changes as
they are received, in correct transactional order.

The apply process on the subscriber database always runs with
<varname>session_replication_role</varname> set
to <literal>replica</literal>, which produces the usual effects on triggers
<sect1 id="logical-replication-monitoring">
<title>Monitoring</title>

Because logical replication is based on an architecture similar to that of
<link linkend="streaming-replication">physical streaming replication</link>,
the monitoring on a publication node is similar to the monitoring of a
physical replication master
(see <xref linkend="streaming-replication-monitoring">).

The monitoring information about subscriptions is visible in
<link linkend="pg-stat-subscription"><literal>pg_stat_subscription</literal></link>.
This view contains one row for every subscription worker. A subscription
can have zero or more active subscription workers depending on its state.

Normally, there is a single apply process running for an enabled
subscription. A disabled subscription or a crashed subscription will have
zero rows in this view.
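
A quick check on the subscriber could look like the sketch below. The column
list is an assumption based on the released PostgreSQL 10 definition of the
view and may differ in other versions.
<programlisting>
-- Run on the subscriber: one row per subscription worker.
-- No row for a subscription means that no worker is running for it.
SELECT subname, pid, received_lsn, latest_end_lsn, latest_end_time
  FROM pg_stat_subscription;
</programlisting>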
<sect1 id="logical-replication-security">
<title>Security</title>

The role used for the replication connection must have
the <literal>REPLICATION</literal> attribute. Access for the role must be
configured in <filename>pg_hba.conf</filename>.

To create a publication, the user must have the <literal>CREATE</literal>
privilege in the database.

To add tables to a publication, the user must have ownership rights on the
table. To create a publication that publishes all tables automatically,
the user must be a superuser.

To create a subscription, the user must be a superuser.

The subscription apply process will run in the local database with the
privileges of a superuser.

Privileges are only checked once at the start of a replication connection.
They are not re-checked as each change record is read from the publisher,
nor are they re-checked for each change when applied.
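
The publisher-side requirements might be set up along these lines. The role
<literal>repuser</literal> and the database <literal>foo</literal> are the
names used in the Quick Setup section; the password and the role
<literal>pubadmin</literal> are hypothetical placeholders.
<programlisting>
-- Role used for the replication connection (placeholder password).
CREATE ROLE repuser WITH LOGIN REPLICATION PASSWORD 'change-me';

-- Hypothetical non-superuser role that is allowed to create publications
-- in database foo. It can only add tables that it owns.
CREATE ROLE pubadmin WITH LOGIN;
GRANT CREATE ON DATABASE foo TO pubadmin;
</programlisting>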
<sect1 id="logical-replication-config">
<title>Configuration Settings</title>

Logical replication requires several configuration options to be set.

On the publisher side, <varname>wal_level</varname> must be set to
<literal>logical</literal>, and <varname>max_replication_slots</varname>
must be set to at least the number of subscriptions expected to connect.
In addition, <varname>max_wal_senders</varname> should be set to at least
<varname>max_replication_slots</varname> plus the number of physical replicas
that are connected at the same time.

The subscriber also requires <varname>max_replication_slots</varname>
to be set. In this case it should be set to at least the number of
subscriptions that will be added to the subscriber.
<varname>max_logical_replication_workers</varname> must be set to at
least the number of subscriptions. Additionally,
<varname>max_worker_processes</varname> may need to be adjusted to
accommodate replication workers, to at least
(<varname>max_logical_replication_workers</varname>
+ <literal>1</literal>). Note that some extensions and parallel queries
also take worker slots from <varname>max_worker_processes</varname>.
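
The settings described above could be applied with <command>ALTER
SYSTEM</command> instead of editing <filename>postgresql.conf</filename> by
hand. The numbers below are illustrative assumptions (four subscriptions and
two physical replicas), not recommendations, and all of these parameters
require a server restart to take effect.
<programlisting>
-- On the publisher:
ALTER SYSTEM SET wal_level = 'logical';
ALTER SYSTEM SET max_replication_slots = 4;  -- at least the number of subscriptions
ALTER SYSTEM SET max_wal_senders = 6;        -- slots plus physical replicas (4 + 2)

-- On the subscriber:
ALTER SYSTEM SET max_replication_slots = 4;            -- at least the number of subscriptions
ALTER SYSTEM SET max_logical_replication_workers = 4;  -- at least the number of subscriptions
ALTER SYSTEM SET max_worker_processes = 8;             -- at least the workers above plus 1
</programlisting>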
<sect1 id="logical-replication-quick-setup">
<title>Quick Setup</title>

First set the configuration options in <filename>postgresql.conf</filename>:

The other required settings have default values that are sufficient for a

<filename>pg_hba.conf</filename> needs to be adjusted to allow replication
(the values here depend on your actual network configuration and the user you
want to use for connecting):
<programlisting>
host all repuser 0.0.0.0/0 md5
</programlisting>

Then on the publisher database:
<programlisting>
CREATE PUBLICATION mypub FOR TABLE users, departments;
</programlisting>

And on the subscriber database:
<programlisting>
CREATE SUBSCRIPTION mysub CONNECTION 'dbname=foo host=bar user=repuser' PUBLICATION mypub;
</programlisting>

The above will start the replication process of changes to the
<literal>users</literal> and <literal>departments</literal> tables.
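
To confirm that the objects were created, something like the following
queries can be used. This is a sketch that assumes the
<structname>pg_publication_tables</structname> view and the
<structname>pg_subscription</structname> catalog as found in the PostgreSQL 10
release; their availability and columns may differ in other versions.
<programlisting>
-- On the publisher: which tables does mypub publish?
SELECT pubname, schemaname, tablename
  FROM pg_publication_tables
 WHERE pubname = 'mypub';

-- On the subscriber: is the subscription defined and enabled?
SELECT subname, subenabled
  FROM pg_subscription
 WHERE subname = 'mysub';
</programlisting>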