<!-- doc/src/sgml/replication-origins.sgml -->
<chapter id="replication-origins">
 <title>Replication Progress Tracking</title>
 <indexterm zone="replication-origins">
  <primary>Replication Progress Tracking</primary>
 </indexterm>
 <indexterm zone="replication-origins">
  <primary>Replication Origins</primary>
 </indexterm>

 <para>
  Replication origins are intended to make it easier to implement
  logical replication solutions on top
  of <xref linkend="logicaldecoding">. They provide a solution to two
  problems:
  <itemizedlist>
   <listitem><para>How to safely keep track of replication progress</para></listitem>
   <listitem><para>How to change replication behavior based on the
    origin of a row, e.g. to avoid loops in bi-directional replication
    setups</para></listitem>
  </itemizedlist>
 </para>

 <para>
  Replication origins consist of a name and an OID. The name, which
  is what should be used to refer to the origin across systems, is
  free-form text. It should be chosen in a way that makes conflicts
  between replication origins created by different replication
  solutions unlikely, e.g. by prefixing it with the replication
  solution's name. The OID is used only to avoid having to store the
  longer name in situations where space efficiency is important. It should
  never be shared between systems.
 </para>

 <para>
  Replication origins can be created using the function
  <link linkend="pg-replication-origin-create"><function>pg_replication_origin_create()</function></link>,
  dropped using
  <link linkend="pg-replication-origin-drop"><function>pg_replication_origin_drop()</function></link>,
  and seen in the
  <link linkend="catalog-pg-replication-origin"><structname>pg_replication_origin</structname></link>
  system catalog.
 </para>
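
 <para>
  As a minimal sketch of the life cycle described above, the following
  statements create an origin, list the known origins, and drop the origin
  again. The origin name <literal>myrepl_node_a</literal> is purely
  illustrative; it follows the advice of prefixing the name with that of a
  hypothetical replication solution, <literal>myrepl</literal>.
 </para>
<programlisting>
-- Create an origin; the function returns its internal identifier.
-- 'myrepl_node_a' is an assumed, illustrative name.
SELECT pg_replication_origin_create('myrepl_node_a');

-- List the origins currently defined.
SELECT roident, roname FROM pg_replication_origin;

-- Drop the origin once it is no longer needed.
SELECT pg_replication_origin_drop('myrepl_node_a');
</programlisting>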

 <para>
  When replicating from one system to another (regardless of whether
  the two are in the same cluster, or even the same database), one
  nontrivial part of building a replication solution is keeping track of
  replay progress in a safe manner. When the applying process, or the whole
  cluster, dies, it must be possible to find out up to where data has
  been successfully replicated. Naive solutions, such as updating a row in
  a table for every replayed transaction, have problems such as runtime
  overhead.
 </para>

 <para>
  Using the replication origin infrastructure, a session can be
  marked as replaying from a remote node (using the
  <link linkend="pg-replication-origin-session-setup"><function>pg_replication_origin_session_setup()</function></link>
  function). Additionally, the <acronym>LSN</acronym> and commit
  timestamp of every source transaction can be configured on a
  per-transaction basis using
  <link linkend="pg-replication-origin-xact-setup"><function>pg_replication_origin_xact_setup()</function></link>.
  If that is done, replication progress is persisted in a crash-safe
  manner. Replay progress for all replication origins can be seen in the
  <link linkend="catalog-pg-replication-origin-status">
   <structname>pg_replication_origin_status</structname>
  </link> view. An individual origin's progress, e.g. when resuming
  replication, can be acquired using
  <link linkend="pg-replication-origin-progress"><function>pg_replication_origin_progress()</function></link>
  for any origin, or
  <link linkend="pg-replication-origin-session-progress"><function>pg_replication_origin_session_progress()</function></link>
  for the origin configured in the current session.
 </para>
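
 <para>
  The replay sequence described above might look roughly like the following
  in an applying session. This is only a sketch: the origin name, the
  <acronym>LSN</acronym> and the timestamp are made-up values, and a real
  apply process would take them from the source system's change stream.
 </para>
<programlisting>
-- Assumed, illustrative origin name; create it once before first use.
SELECT pg_replication_origin_create('myrepl_node_a');

-- Mark this session as replaying changes from that origin.
SELECT pg_replication_origin_session_setup('myrepl_node_a');

BEGIN;
-- Record the source transaction's commit LSN and commit timestamp
-- (both values here are placeholders).
SELECT pg_replication_origin_xact_setup('0/16B3748', '2015-05-01 12:00:00+00');
-- ... apply the replicated changes of this source transaction here ...
COMMIT;

-- Progress of the origin configured in the current session.
SELECT pg_replication_origin_session_progress(true);

-- Progress of a named origin, and of all origins.
SELECT pg_replication_origin_progress('myrepl_node_a', true);
SELECT * FROM pg_replication_origin_status;

-- Detach the session from the origin when replay is finished.
SELECT pg_replication_origin_session_reset();
</programlisting>
 <para>
  The boolean argument to the two progress functions requests that the
  corresponding local transaction be flushed to disk before the position
  is reported.
 </para>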

 <para>
  In replication topologies more complex than replication from exactly one
  system to one other, another problem can be that it is hard to avoid
  replicating replayed rows again. That can lead both to cycles in the
  replication and to inefficiencies. Replication origins provide an optional
  mechanism to recognize and prevent that. When configured using the functions
  referenced in the previous paragraph, every change and transaction passed to
  output plugin callbacks (see <xref linkend="logicaldecoding-output-plugin">)
  generated by the session is tagged with the replication origin of the
  generating session. This allows the output plugin to treat them
  differently, e.g. ignoring all but locally originating rows. Additionally,
  the <link linkend="logicaldecoding-output-plugin-filter-by-origin">
  <function>filter_by_origin_cb</function></link> callback can be used
  to filter the logical decoding change stream based on the
  source. While less flexible, filtering via that callback is
  considerably more efficient.
 </para>
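
 <para>
  One way to observe origin-based filtering from SQL is the
  <literal>only-local</literal> option of the
  <filename>test_decoding</filename> example output plugin, which skips
  changes made by sessions that had a replication origin configured. The
  sketch below assumes that <filename>test_decoding</filename> is installed
  and that <varname>wal_level</varname> is set to <literal>logical</literal>;
  the slot name <literal>origin_demo_slot</literal> is illustrative.
 </para>
<programlisting>
-- Create a logical replication slot using the test_decoding plugin.
SELECT * FROM pg_create_logical_replication_slot('origin_demo_slot', 'test_decoding');

-- ... run some transactions, some of them in sessions with an origin set up ...

-- Decode only changes that originated locally, i.e. that were not
-- made by a session replaying from a replication origin.
SELECT * FROM pg_logical_slot_get_changes('origin_demo_slot', NULL, NULL,
                                          'only-local', '1');

-- Drop the slot so it does not retain WAL indefinitely.
SELECT pg_drop_replication_slot('origin_demo_slot');
</programlisting>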