</para>
<para>
- Logical replication sends changes on the publisher to the subscriber as
- they occur in real-time. The subscriber applies the data in the same order
- as the publisher so that transactional consistency is guaranteed for
+ Logical replication of a table typically starts with taking a snapshot
+ of the data on the publisher database and copying that to the subscriber.
+ Once that is done, the changes on the publisher are sent to the subscriber
+ as they occur in real time. The subscriber applies the data in the same
+ order as the publisher so that transactional consistency is guaranteed for
publications within a single subscription. This method of data replication
is sometimes referred to as transactional replication.
</para>
<para>
Each subscription will receive changes via one replication slot (see
- <xref linkend="streaming-replication-slots">).
+ <xref linkend="streaming-replication-slots">). Additional temporary
+ replication slots may be required for the initial data synchronization
+ of pre-existing table data.
</para>
<para>
- Subscriptions are not dumped by <command>pg_dump</command> by default, but
- this can be requested using the command-line
- option <option>--include-subscriptions</option>.
+ Subscriptions are dumped by <command>pg_dump</command> if the current user
+ is a superuser. Otherwise, a warning is written and subscriptions are
+ skipped, because non-superusers cannot read all subscription information
+ from the <structname>pg_subscription</structname> catalog.
</para>
<para>
<para>
Columns of a table are also matched by name. A different order of columns
- in the target table is allowed, but the column types have to match.
- </para>
+ in the target table is allowed, but the column types have to match. The
+ target table can have additional columns not provided by the published
+ table. Those will be filled with their default values.
+ </para>
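+
+ <para>
+ As a minimal sketch of the matching rules above, assume a hypothetical
+ table <literal>users</literal> (the column names and the extra
+ <literal>loaded_at</literal> column are illustrative only). The subscriber
+ table lists its columns in a different order and adds a column that the
+ publisher does not provide, which would be filled from its default:
+<programlisting>
+-- on the publisher
+CREATE TABLE users (id int PRIMARY KEY, name text);
+
+-- on the subscriber: same column names and types, different order,
+-- plus one extra column that has a default
+CREATE TABLE users (
+    name text,
+    id int PRIMARY KEY,
+    loaded_at timestamptz DEFAULT now()
+);
+</programlisting>
+ </para>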
+
+ <sect2 id="logical-replication-subscription-slot">
+ <title>Replication Slot Management</title>
+
+ <para>
+ As mentioned earlier, each (active) subscription receives changes from a
+ replication slot on the remote (publishing) side. Normally, the remote
+ replication slot is created automatically when the subscription is created
+ using <command>CREATE SUBSCRIPTION</command> and it is dropped
+ automatically when the subscription is dropped using <command>DROP
+ SUBSCRIPTION</command>. In some situations, however, it can be useful or
+ necessary to manipulate the subscription and the underlying replication
+ slot separately. Here are some scenarios:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ When creating a subscription, the replication slot already exists. In
+ that case, the subscription can be created using
+ the <literal>create_slot = false</literal> option to associate with the
+ existing slot.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ When creating a subscription, the remote host is not reachable or in an
+ unclear state. In that case, the subscription can be created using
+ the <literal>connect = false</literal> option. The remote host will then not
+ be contacted at all. This is what <application>pg_dump</application>
+ uses. The remote replication slot will then have to be created
+ manually before the subscription can be activated.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ When dropping a subscription, the replication slot should be kept.
+ This could be useful when the subscriber database is being moved to a
+ different host and will be activated from there. In that case,
+ disassociate the slot from the subscription using <command>ALTER
+ SUBSCRIPTION</command> before attempting to drop the subscription.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ When dropping a subscription, the remote host is not reachable. In
+ that case, disassociate the slot from the subscription
+ using <command>ALTER SUBSCRIPTION</command> before attempting to drop
+ the subscription. If the remote database instance no longer exists, no
+ further action is then necessary. If, however, the remote database
+ instance is just unreachable, the replication slot should then be
+ dropped manually; otherwise it would continue to reserve WAL and might
+ eventually cause the disk to fill up. Such cases should be carefully
+ investigated.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
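+
+ <para>
+ The scenarios above might look roughly as follows. This is only a sketch:
+ the names <literal>mysub</literal> and <literal>mypub</literal> and the
+ connection string are placeholders, and the exact option spelling
+ (in particular <literal>slot_name = NONE</literal> for disassociating the
+ slot) is an assumption that should be checked against the
+ <command>CREATE SUBSCRIPTION</command> and <command>ALTER
+ SUBSCRIPTION</command> reference pages of the server version in use.
+ The two <command>CREATE SUBSCRIPTION</command> statements are
+ alternatives, not a sequence.
+<programlisting>
+-- on the subscriber: the slot already exists on the publisher, so reuse it
+CREATE SUBSCRIPTION mysub
+    CONNECTION 'host=publisher.example.com dbname=foo user=repuser'
+    PUBLICATION mypub
+    WITH (create_slot = false);
+
+-- alternative, on the subscriber: the remote host is unreachable,
+-- so do not contact it at all
+CREATE SUBSCRIPTION mysub
+    CONNECTION 'host=publisher.example.com dbname=foo user=repuser'
+    PUBLICATION mypub
+    WITH (connect = false);
+
+-- on the publisher, later: create the slot manually before activating
+SELECT pg_create_logical_replication_slot('mysub', 'pgoutput');
+
+-- on the subscriber: activate the subscription and pick up its tables
+ALTER SUBSCRIPTION mysub ENABLE;
+ALTER SUBSCRIPTION mysub REFRESH PUBLICATION;
+
+-- on the subscriber: drop the subscription without touching the remote slot
+ALTER SUBSCRIPTION mysub DISABLE;
+ALTER SUBSCRIPTION mysub SET (slot_name = NONE);
+DROP SUBSCRIPTION mysub;
+
+-- on the publisher, once it is reachable again: drop the leftover slot
+SELECT pg_drop_replication_slot('mysub');
+</programlisting>
+ </para>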
+ </sect2>
</sect1>
<sect1 id="logical-replication-conflicts">
to <literal>replica</literal>, which produces the usual effects on triggers
and constraints.
</para>
+
+ <sect2 id="logical-replication-snapshot">
+ <title>Initial Snapshot</title>
+ <para>
+ The initial data in existing subscribed tables is snapshotted and
+ copied by a parallel instance of a special kind of apply process.
+ This process creates its own temporary replication slot and
+ copies the existing data. Once the existing data is copied, the worker
+ enters synchronization mode, which brings the table up to a
+ synchronized state with the main apply process by streaming
+ any changes that happened during the initial data copy using standard
+ logical replication. Once the synchronization is done, control
+ of the replication of the table is given back to the main apply
+ process, where replication continues as normal.
+ </para>
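+
+ <para>
+ As a rough illustration, the per-table synchronization state can be
+ inspected on the subscriber through the
+ <structname>pg_subscription_rel</structname> catalog. The meaning of the
+ single-letter state codes noted in the comment is an assumption that
+ should be verified against the catalog documentation:
+<programlisting>
+-- one row per subscribed table; srsubstate is expected to move through
+-- i (initialize), d (data copy), s (synchronized) to r (ready)
+SELECT srrelid::regclass AS table_name, srsubstate
+  FROM pg_subscription_rel
+ ORDER BY 1;
+</programlisting>
+ </para>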
+ </sect2>
</sect1>
- <sect1 id="logical-replication-monitoring">
+ <sect1 id="logical-replication-monitoring">
<title>Monitoring</title>
<para>
<para>
Normally, there is a single apply process running for an enabled
subscription. A disabled subscription or a crashed subscription will have
- zero rows in this view.
+ zero rows in this view. If the initial data synchronization of any
+ table is in progress, there will be additional workers for the tables
+ being synchronized.
</para>
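+ <para>
+ For example, a query along the following lines lists the workers
+ (this sketch assumes the view in question is
+ <structname>pg_stat_subscription</structname> and that its
+ <structfield>relid</structfield> column is null for the main apply
+ worker and set for table synchronization workers):
+<programlisting>
+SELECT subname, pid, relid::regclass AS syncing_table
+  FROM pg_stat_subscription;
+</programlisting>
+ </para>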
</sect1>
<title>Security</title>
<para>
- Logical replication connections occur in the same way as with physical streaming
- replication. It requires access to be explicitly given using
- <filename>pg_hba.conf</filename>. The role used for the replication
- connection must have the <literal>REPLICATION</literal> attribute. This
- gives a role access to both logical and physical replication.
+ The role used for the replication connection must have
+ the <literal>REPLICATION</literal> attribute. Access for the role must be
+ configured in <filename>pg_hba.conf</filename>.
</para>
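+ <para>
+ For instance, a dedicated role could be created on the publisher as
+ sketched below; the role name <literal>repuser</literal> and the
+ password are placeholders:
+<programlisting>
+CREATE ROLE repuser WITH REPLICATION LOGIN PASSWORD 'change-me';
+</programlisting>
+ </para>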
<para>
privilege in the database.
</para>
+ <para>
+ To add tables to a publication, the user must have ownership rights on the
+ table. To create a publication that publishes all tables automatically,
+ the user must be a superuser.
+ </para>
+
<para>
To create a subscription, the user must be a superuser.
</para>
<para>
On the publisher side, <varname>wal_level</varname> must be set to
<literal>logical</literal>, and <varname>max_replication_slots</varname>
- must be set to at least the number of subscriptions expected to connect.
- And <varname>max_wal_senders</varname> should be set to at least the same
- as <varname>max_replication_slots</varname> plus the number of physical replicas
- that are connected at the same time.
+ must be set to at least the number of subscriptions expected to connect,
+ plus some reserve for table synchronization. In addition,
+ <varname>max_wal_senders</varname> should be set to at least the value of
+ <varname>max_replication_slots</varname> plus the number of physical
+ replicas that are connected at the same time.
</para>
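+ <para>
+ A sketch of publisher settings for, say, four subscriptions and one
+ physical replica follows. The numbers are illustrative only, including
+ the reserve of four slots for table synchronization. All three
+ parameters take effect only after a server restart:
+<programlisting>
+ALTER SYSTEM SET wal_level = 'logical';
+-- 4 subscriptions + 4 slots of reserve for table synchronization
+ALTER SYSTEM SET max_replication_slots = 8;
+-- max_replication_slots + 1 physical replica
+ALTER SYSTEM SET max_wal_senders = 9;
+</programlisting>
+ </para>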
<para>
to be set. In this case it should be set to at least the number of
subscriptions that will be added to the subscriber.
<varname>max_logical_replication_workers</varname> must be set to at
- least the number of subscriptions. Additionally the
- <varname>max_worker_processes</varname> may need to be adjusted to
- accommodate for replication workers, at least
+ least the number of subscriptions, again plus some reserve for table
+ synchronization. Additionally, <varname>max_worker_processes</varname>
+ may need to be adjusted to accommodate replication workers, at least
(<varname>max_logical_replication_workers</varname>
+ <literal>1</literal>). Note that some extensions and parallel queries
also take worker slots from <varname>max_worker_processes</varname>.
(the values here depend on your actual network configuration and user you
want to use for connecting):
<programlisting>
-host replication repuser 0.0.0.0/0 md5
+host all repuser 0.0.0.0/0 md5
</programlisting>
</para>
</para>
<para>
- The above will start the replication process of changes to
- <literal>users</literal> and <literal>departments</literal> tables.
+ The above will start the replication process, which synchronizes the
+ initial table contents of the tables <literal>users</literal> and
+ <literal>departments</literal> and then starts replicating
+ incremental changes to those tables.
</para>
</sect1>
</chapter>