Semisynchronous Replication
Contents
Description
Standard MariaDB replication is asynchronous, but MariaDB also provides a semisynchronous replication option.
With regular asynchronous replication, replicas request events from the primary's binary log whenever the replicas are ready. The primary does not wait for a replica to confirm that an event has been received.
With fully synchronous replication, all replicas are required to respond that they have received the events. See Galera Cluster.
Semisynchronous replication waits for just one replica to acknowledge that it has received and logged the events.
Semisynchronous replication therefore comes with some negative performance impact, but increased data integrity. Since the delay is based on the roundtrip time to the replica and back, this delay is minimized for servers in close proximity over fast networks.
Semisynchronous replication is built into the server. See MDEV-13073 for more information.
Enabling Semisynchronous Replication
Semisynchronous replication can be enabled by setting the relevant system variables on the primary and the replica.
If a server needs to be able to switch between acting as a primary and a replica, then you can enable both the primary and replica system variables on the server. For example, you might need to do this if MariaDB MaxScale is being used to enable auto-failover or switchover with MariaDB Monitor.
Enabling Semisynchronous Replication on the Primary
Semisynchronous replication can be enabled on the primary by setting the rpl_semi_sync_master_enabled system variable to ON
. It can be set dynamically with SET GLOBAL. For example:
SET GLOBAL rpl_semi_sync_master_enabled=ON;
It can also be set in a server option group in an option file prior to starting up the server. For example:
[mariadb] ... rpl_semi_sync_master_enabled=ON
Enabling Semisynchronous Replication on the Replica
Semisynchronous replication can be enabled on the replica by setting the rpl_semi_sync_slave_enabled system variable to ON
. It can be set dynamically with SET GLOBAL. For example:
SET GLOBAL rpl_semi_sync_slave_enabled=ON;
It can also be set in a server option group in an option file prior to starting up the server. For example:
[mariadb] ... rpl_semi_sync_slave_enabled=ON
When switching between semisynchronous replication and asynchronous replication on a replica with replica IO threads already running, the replica I/O thread will need to be restarted. For example:
STOP SLAVE IO_THREAD; START SLAVE IO_THREAD;
If this is not done, then the replica IO thread will continue to use the previous setting.
Configuring the Primary Timeout
In semisynchronous replication, only after the events have been written to the relay log and flushed does the replica acknowledge receipt of a transaction's events. If the replica does not acknowledge the transaction before a certain amount of time has passed, then a timeout occurs and the primary switches to asynchronous replication. This will be reflected in the primary's error log with messages like the following:
[Warning] Timeout waiting for reply of binlog (file: mariadb-1-bin.000002, pos: 538), semi-sync up to file , position 0. [Note] Semi-sync replication switched OFF.
When this occurs, the Rpl_semi_sync_master_status status variable will be switched to OFF
.
When at least one semisynchronous replica catches up, semisynchronous replication is resumed. This will be reflected in the primary's error log with messages like the following:
[Note] Semi-sync replication switched ON with replica (server_id: 184137206) at (mariadb-1-bin.000002, 215076)
When this occurs, the Rpl_semi_sync_master_status status variable will be switched to ON
.
The number of times that semisynchronous replication has been switched off can be checked by looking at the value of the Rpl_semi_sync_master_no_times status variable.
If you see a lot of timeouts like this in your environment, then you may want to change the timeout period. The timeout period can be changed by setting the rpl_semi_sync_master_timeout system variable. It can be set dynamically with SET GLOBAL. For example:
SET GLOBAL rpl_semi_sync_master_timeout=20000;
It can also be set in a server option group in an option file prior to starting up the server. For example:
[mariadb] ... rpl_semi_sync_master_timeout=20000
To determine a good value for the rpl_semi_sync_master_timeout system variable, you may want to look at the values of the Rpl_semi_sync_master_net_avg_wait_time and Rpl_semi_sync_master_tx_avg_wait_time status variables.
Configuring the Primary Wait Point
In semisynchronous replication, there are two potential points at which the primary can wait for the replica acknowledge the receipt of a transaction's events. These two wait points have different advantages and disadvantages.
The wait point is configured by the rpl_semi_sync_master_wait_point system variable. The supported values are:
AFTER_SYNC
AFTER_COMMIT
It can be set dynamically with SET GLOBAL. For example:
SET GLOBAL rpl_semi_sync_master_wait_point='AFTER_SYNC';
It can also be set in a server option group in an option file prior to starting up the server. For example:
[mariadb] ... rpl_semi_sync_master_wait_point=AFTER_SYNC
When this variable is set to AFTER_SYNC
, the primary performs the following steps:
- Prepares the transaction in the storage engine.
- Syncs the transaction to the binary log.
- Waits for acknowledgement from the replica.
- Commits the transaction to the storage engine.
- Returns an acknowledgement to the client.
The effects of the AFTER_SYNC
wait point are:
- All clients see the same data on the primary at the same time; after acknowledgement by the replica and after being committed to the storage engine on the primary.
- If the primary crashes, then failover should be lossless, because all transactions committed on the primary would have been replicated to the replica.
- However, if the primary crashes, then its binary log may also contain events for transactions that were prepared by the storage engine and written to the binary log, but that were never actually committed by the storage engine. As part of the server's automatic crash recovery process, the server may recover these prepared transactions when the server is restarted. This could cause the "old" crashed primary to become inconsistent with its former replicas when they have
been reconfigured to replace the old primary with a new one.
The old primary in such a scenario can be re-introduced only as a
semisync slave.
The server post-crash recovery of the server configured with
rpl_semi_sync_slave_enabled = ON
ensures through MDEV-21117 that the server will not have extra transactions. The reconfigured as semisync replica server's binlog gets truncated to discard transactions proven not to be committed, in any of their branches if they are multi-engine. Truncation does not occur though when there exists a non-transactional group of events beyond the truncation position in which case recovery reports an error. When the semisync replica recovery can't be carried out, the crashed primary may need to be rebuilt.
When this variable is set to AFTER_COMMIT
, the primary performs the following steps:
- Prepares the transaction in the storage engine.
- Syncs the transaction to the binary log.
- Commits the transaction to the storage engine.
- Waits for acknowledgement from the replica.
- Returns an acknowledgement to the client.
The effects of the AFTER_COMMIT
wait point are:
- Other clients may see the committed transaction before the committing client.
- If the primary crashes, then failover may involve some data loss, because the primary may have committed transactions that had not yet been acknowledged by the replicas.
Versions
Version | Status | Introduced |
---|---|---|
N/A | N/A | MariaDB 10.3.3 |
1.0 | Stable | MariaDB 10.1.13 |
1.0 | Gamma | MariaDB 10.0.13 |
1.0 | Unknown | MariaDB 10.0.11 |
1.0 | N/A | MariaDB 5.5 |
System Variables
rpl_semi_sync_master_enabled
- Description: Set to
ON
to enable semi-synchronous replication primary. Disabled by default. - Commandline:
--rpl-semi-sync-master-enabled[={0|1}]
- Scope: Global
- Dynamic: Yes
- Data Type:
boolean
- Default Value:
OFF
rpl_semi_sync_master_timeout
- Description: The timeout value, in milliseconds, for semi-synchronous replication in the primary. If this timeout is exceeded in waiting on a commit for acknowledgement from a replica, the primary will revert to asynchronous replication.
- When a timeout occurs, the Rpl_semi_sync_master_status status variable will also be switched to
OFF
. - See Configuring the Primary Timeout for more information.
- When a timeout occurs, the Rpl_semi_sync_master_status status variable will also be switched to
- Commandline:
--rpl-semi-sync-master-timeout[=#]
- Scope: Global
- Dynamic: Yes
- Data Type:
numeric
- Default Value:
10000
(10 seconds) - Range:
0
to18446744073709551615
rpl_semi_sync_master_trace_level
- Description: The tracing level for semi-sync replication. Four levels are defined:
1
: General level, including for example time function failures.16
: More detailed level, with more verbose information.32
: Net wait level, including more information about network waits.64
: Function level, including information about function entries and exits.
- Commandline:
--rpl-semi-sync-master-trace-level[=#]
- Scope: Global
- Dynamic: Yes
- Data Type:
numeric
- Default Value:
32
- Range:
0
to18446744073709551615
rpl_semi_sync_master_wait_no_slave
- Description: If set to
ON
, the default, the replica count (recorded by Rpl_semi_sync_master_clients) may drop to zero, and the primary will still wait for the timeout period. If set toOFF
, the primary will revert to asynchronous replication as soon as the replica count drops to zero. - Commandline:
--rpl-semi-sync-master-wait-no-slave[={0|1}]
- Scope: Global
- Dynamic: Yes
- Data Type:
boolean
- Default Value:
ON
rpl_semi_sync_master_wait_point
- Description: Whether the transaction should wait for semi-sync acknowledgement after having synced the binlog (
AFTER_SYNC
), or after having committed in storage engine (AFTER_COMMIT
, the default).- When this variable is set to
AFTER_SYNC
, the primary performs the following steps:- Prepares the transaction in the storage engine.
- Syncs the transaction to the binary log.
- Waits for acknowledgement from the replica.
- Commits the transaction to the storage engine.
- Returns an acknowledgement to the client.
- When this variable is set to
AFTER_COMMIT
, the primary performs the following steps:- Prepares the transaction in the storage engine.
- Syncs the transaction to the binary log.
- Commits the transaction to the storage engine.
- Waits for acknowledgement from the replica.
- Returns an acknowledgement to the client.
- In MariaDB 10.1.2 and before, this system variable does not exist. However, in those versions, the primary waits for the acknowledgement from replicas at a point that is equivalent to
AFTER_COMMIT
. - See Configuring the Primary Wait Point for more information.
- When this variable is set to
- Commandline:
--rpl-semi-sync-master-wait-point=value
- Scope: Global
- Dynamic: Yes
- Data Type:
enum
- Default Value:
AFTER_COMMIT
- Valid Values:
AFTER_SYNC
,AFTER_COMMIT
rpl_semi_sync_slave_delay_master
- Description: Only write primary info file when ack is needed.
- Commandline:
--rpl-semi-sync-slave-delay-master[={0|1}]
- Scope: Global
- Dynamic: Yes
- Data Type:
boolean
- Default Value:
OFF
rpl_semi_sync_slave_enabled
- Description: Set to
ON
to enable semi-synchronous replication replica. Disabled by default. - Commandline:
--rpl-semi-sync-slave-enabled[={0|1}]
- Scope: Global
- Dynamic: Yes
- Data Type:
boolean
- Default Value:
OFF
rpl_semi_sync_slave_kill_conn_timeout
- Description: Timeout for the mysql connection used to kill the replica io_thread's connection on primary. This timeout comes into play when stop slave is executed.
- Commandline:
--rpl-semi-sync-slave-kill-conn-timeout[={0|1}]
- Scope: Global
- Dynamic: Yes
- Data Type:
numeric
- Default Value:
5
- Range:
0
to4294967295
rpl_semi_sync_slave_trace_level
- Description: The tracing level for semi-sync replication. The levels are the same as for rpl_semi_sync_master_trace_level.
- Commandline:
--rpl-semi-sync-slave-trace_level[=#]
- Scope: Global
- Dynamic: Yes
- Data Type:
numeric
- Default Value:
32
- Range:
0
to18446744073709551615
Options
init-rpl-rol
- From MariaDB 10.6.19, MariaDB 10.11.9, MariaDB 11.1.6, MariaDB 11.2.5, MariaDB 11.4.3 and MariaDB 11.5.2, changes the condition for semi-sync recovery to truncate the binlog to instead use this option, when set to SLAVE. This avoids a possible error state where the replica’s state is ahead of the primary’s. See -init-rpl-role.
rpl-semi-sync_master
- Description: Controls how the server should treat the plugin when the server starts up.
- Valid values are:
OFF
- Disables the plugin without removing it from the mysql.plugins table.ON
- Enables the plugin. If the plugin cannot be initialized, then the server will still continue starting up, but the plugin will be disabled.FORCE
- Enables the plugin. If the plugin cannot be initialized, then the server will fail to start with an error.FORCE_PLUS_PERMANENT
- Enables the plugin. If the plugin cannot be initialized, then the server will fail to start with an error. In addition, the plugin cannot be uninstalled with UNINSTALL SONAME or UNINSTALL PLUGIN while the server is running.
- See Plugin Overview: Configuring Plugin Activation at Server Startup for more information.
- Valid values are:
- Commandline:
--rpl-semi-sync-master=value
- Data Type:
enumerated
- Default Value:
ON
- Valid Values:
OFF
,ON
,FORCE
,FORCE_PLUS_PERMANENT
- Removed: MariaDB 10.3.3
rpl-semi-sync_slave
- Description: Controls how the server should treat the plugin when the server starts up.
- Valid values are:
OFF
- Disables the plugin without removing it from the mysql.plugins table.ON
- Enables the plugin. If the plugin cannot be initialized, then the server will still continue starting up, but the plugin will be disabled.FORCE
- Enables the plugin. If the plugin cannot be initialized, then the server will fail to start with an error.FORCE_PLUS_PERMANENT
- Enables the plugin. If the plugin cannot be initialized, then the server will fail to start with an error. In addition, the plugin cannot be uninstalled with UNINSTALL SONAME or UNINSTALL PLUGIN while the server is running.
- See Plugin Overview: Configuring Plugin Activation at Server Startup for more information.
- Valid values are:
- Commandline:
--rpl-semi-sync-slave=value
- Data Type:
enumerated
- Default Value:
ON
- Valid Values:
OFF
,ON
,FORCE
,FORCE_PLUS_PERMANENT
- Removed: MariaDB 10.3.3
Status Variables
For a list of status variables added when the plugin is installed, see Semisynchronous Replication Plugin Status Variables.