MariaDB Operations

MariaDB Database operations

MariaDB is a relational database service that is used to store user configuration, platform & application configuration, alerts, and incident data of cfxDimenions platform and respective application services. MariaDB supports high availability natively and it can be deployed as Master/Slave or Master/Master configuration using the Galera clustering feature. Within the cfxDimensions platform, MariaDB has deployed as Master/Master (Galera cluster) node configuration. MariaDB service is containerized and configured in a specific way to be compatible with cfxDimensions platform and its application services.

For detailed general documentation, please refer to About MariaDB and About Galera Cluster

MariaDB database mount point on each cluster node:

  • Data mount point:

    • /var/mysql

  • DB service logs path:

    • /opt/macaw/shared/log/mariadb/<node-ip>/mariadb.log

MariaDB Galera Cluster graceful start & stop sequence:

Run the below macaw CLI command from Platform VM to start MariaDB cluster service on 3 nodes.

macaw infra start --service mariadb

The above command brings up each MariaDB Node in sequential order. It brings up Node01 first to bootstrap the cluster and starts the Node02 & Node03 subsequently to join the MariaDB Galera cluster.

When Node01 is started first to bootstrap the MariaDB galera cluster, it starts with the below parameter set inside the MariaDB configuration file. (/opt/macaw/config/mariadb/<cluster-node-ip>/mariadb.cnf)

wsrep_new_cluster=1 (last line of the configuration file)

Once the MariaDB Galera cluster is functionally up and running, the determination of the bootstrap node next time to start the cluster depends on the shutdown sequence of the cluster nodes. The cluster node which was stopped last should be used to bootstrap the MariaDB Galera cluster next time.

Run the below macaw CLI command from Platform VM to stop the MariaDB cluster service on 3 nodes gracefully.

macaw infra stop --service mariadb

The above command stops the Node03 first, Node02 next, and finally the Node01. In this sequence, since Node01 is stopped last, Node01 always becomes the bootstrap node to start and initializes the Galera cluster appropriately.

Three node MariaDB Galera cluster provides high availability with a tolerance of 1 node failure.

MariaDB Galera cluster multi-node recovery on power failure or a full crash:

If the MariaDB Galera cluster nodes are crashed because of power failure on all servers or because of some other server hardware failure, the cluster needs to be brought up carefully in a particular order to avoid any data loss.

First, we need to identify which node is eligible to bootstrap the MariaDB Galera cluster. For that, below are the two available methods.

  1. Identify the node which has highest seqno value (Note: A cluster node will only have positive highest seqno value when atleast one of the node was able to gracefully shutdown. This is the node that needs to be started first to bootstrap the MariaDB Galera cluster.)

  2. Identify the node which has recorded the last committed transaction

Recover MariaDB Galera cluster using a Node which has the highest seqno value:

Login into MariaDB cluster Node03 using SSH client to access the CLI. (username: macaw)

Following shows the content of grastate.dat in Node03. In this example, this node has negative seqno and no group ID (uuid). This is the case when a node crashes during Data Definition Language (DDL) processing:

$ cat /var/mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid: 00000000-0000-0000-0000-000000000000
seqno: -1
safe_to_bootstrap: 0

The following shows the content of grastate.dat in Node02. In this example, this node was crashed during a transaction processing which resulting in negative seqno but with a group ID (uuid):

$ cat /var/mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid: 886dd8da-3d07-11e8-a109-8a3c80cebab4
seqno: -1
safe_to_bootstrap: 0

Following is the content of grastate.dat on Node01 with the highest seqno value:

$ cat /var/mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid: 886dd8da-3d07-11e8-a109-8a3c80cebab4
seqno: 31929
safe_to_bootstrap: 1

Note: If all of the cluster nodes contain the value of -1 for seqno and 0 for safe_to_bootstrap , that is an indication that a full cluster crash has occurred. Go to the Next Section of this document (Recover MariaDB Galera cluster using a Node that has committed the last transaction)

The MariaDB node with the highest seqno value is an appropriate Node to bootstrap the MariaDB Galera cluster.

Follow the below steps to bootstrap and bring up the MariaDB cluster:

Step-1: Login into the MariaDB bootstrap node using an SSH client as a macaw user. (bootstrap node is identified using one of the above-mentioned procedures)

Step-2: Edit mariadb.cnf configuration file and make sure the below parameter is set at the end of the file and save it. (Configuration file location: /opt/macaw/config/mariadb/<cluster-node-ip>/mariadb.cnf)

wsrep_new_cluster=1

Step-3: Edit /var/mysql/grastate.dat file and make sure safe_to_bootstrap value is set to 1and save the file.

Step-4: Start the MariaDB container using the below command.

docker start <mariadb-container-id>

After starting the MariaDB container, watch the log messages @ the below log file

/opt/macaw/shared/log/mariadb/<cluster-node-ip>/mariadb.log

and look for the below log message which confirms the Node is completely up and in the synced state.

WSREP: Server status change joined -> synced

Additionally, run the below command to verify the MariaDB cluster bootstrap node is completely up in the synced state.

mysql -u <username> -p<password> -h <node-ip> -P 3306 -e "show status like 'wsrep_local_state_comment';"

+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+

Once the MariaDB bootstrap cluster node is up, continue the below steps to bring up the rest of the 2 Nodes.

Step-5: Login into the MariaDB rest of the nodes (no specific order) using SSH client as macaw user.

Step-6: Edit mariadb.cnf configuration file and make sure the below parameter doesn't exist, if yes, just remove it. (Configuration file location: /opt/macaw/config/mariadb/<cluster-node-ip>/mariadb.cnf)

wsrep_new_cluster=1

The above parameter is applicable only on the MariaDB cluster's bootstrap node which initializes the Galera cluster.

Step-7: Start the MariaDB container using the below command.

docker start <mariadb-container-id>

After starting the MariaDB container, watch the log messages @ the below log file

/opt/macaw/shared/log/mariadb/<cluster-node-ip>/mariadb.log

and look for the below log message which confirms the Node is completely up and in the synced state.

WSREP: Server status change joined -> synced

Additionally, run the below command to verify the MariaDB cluster node is completely up in the synced state.

mysql -u <username> -p<password> -h <node-ip> -P 3306 -e "show status like 'wsrep_local_state_comment';"

+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+

When second or third nodes are coming up after the crash and syncing up with the Cluster's bootstrap node, it may take few minutes or a little longer to be completely up and synced state.

Step-8: On the last MariaDB node, please follow the procedure listed in Step-7

Step-9: Once the MariaDB cluster nodes are completely up and functional, login into Node01 and edit the MariaDB configuration file mariadb.cnf and make sure to add the below parameter at the end of the file and save it. (Configuration file location: /opt/macaw/config/mariadb/<cluster-node-ip>/mariadb.cnf)

wsrep_new_cluster=1

On Node2 & Node03, edit the MariaDB configuration file and make sure the above parameter is not set. This is to make sure macaw CLI starts the Node01 as cluster bootstrap node first when it is used to bring up the MariaDB cluster nodes.

MariaDB Galera cluster node order (i.e Node01, Node02 & Node03) is determined based on the order of comma-separated IP address list provided during themacaw setupcommand which configures initial configuration of the cfxDimensions platform.

Recover MariaDB Galera cluster using a Node that has committed the last transaction:

Step-1: Login into MariaDB cluster Node01 using SSH client to access the CLI. (username: macaw)

Step-2: Run the below command to find the Mariadb container ID

docker ps -a | grep mariadb

Please make sure the MariaDB container is in a stopped state or run the below command to stop the MariaDB container

docker stop -t 120 <mariadb-container-id>

Step-3: Take a backup of the MariaDB configuration file

cp /opt/macaw/config/mariadb/<cluster-node-ip>/mariadb.cnf /opt/macaw/config/mariadb/<cluster-node-ip>/mariadb.cnf.bak

Step-4: Edit the MariaDB configuration file and add the below specified option. (Configuration file location: /opt/macaw/config/mariadb/<cluster-node-ip>/mariadb.cnf)

wsrep-recover=1

and comment out the below parameter (if it exists)

#wsrep_new_cluster=1

Step-5: Start the MariaDB container to start the MariaDB service and wait for 2 to 3 minutes to allow it to be completely up.

docker start <mariadb-container-id>

Step-6: Tail mariadb service log and look for similar to the below message. (/opt/macaw/shared/log/mariadb/<cluster-node-ip>/mariadb.log)

2021-06-07 9:50:36 0 [Note] WSREP: Recovered position: afa02221-c422-11eb-8a24-96c95f63c95b:397159

Note down the above highlighted value and follow the same steps from Step-4 through Step-6 for Node02 & Node03

The MariaDB node with the latest data will have the highest value and that is an appropriate Node to bootstrap the MariaDB Galera cluster.

Follow the below steps to bring up the MariaDB Galera cluster:

Step-1: Login into the MariaDB bootstrap node (which has the highest recovered position value) using an SSH client as a macaw user. (bootstrap node is identified using one of the above-mentioned procedures)

Step-2: Edit mariadb.cnf configuration file and make sure the below parameter is set at the end of the file and save it. (Configuration file location: /opt/macaw/config/mariadb/<cluster-node-ip>/mariadb.cnf)

wsrep_new_cluster=1

and remove the below line

wsrep-recover=1

Step-3: Edit /var/mysql/grastate.dat file and set safe_to_bootstrap value as 1and save the file.

Step-4: Start the MariaDB container using the below command.

docker start <mariadb-container-id>

After starting the MariaDB container, watch the log messages @ the below log file

/opt/macaw/shared/log/mariadb/<cluster-node-ip>/mariadb.log

and look for the below log message which confirms the Node is completely up and in the synced state.

WSREP: Server status change joined -> synced

Additionally, run the below command to verify the MariaDB cluster bootstrap node is completely up in the synced state.

mysql -u <username> -p<password> -h <node-ip> -P 3306 -e "show status like 'wsrep_local_state_comment';"

+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+

Once the MariaDB bootstrap cluster node is up, continue the below steps to bring up the rest of the 2 Nodes.

Step-5: Login into the MariaDB rest of the nodes (no specific order) using SSH client as macaw user.

Step-6: Edit mariadb.cnf configuration file and make sure the below parameter doesn't exist, if yes, just remove it. (Configuration file location: /opt/macaw/config/mariadb/<cluster-node-ip>/mariadb.cnf)

wsrep_new_cluster=1

and remove the below line as well

wsrep-recover=1

Step-7: Start the MariaDB container using the below command.

docker start <mariadb-container-id>

After starting the MariaDB container, watch the log messages @ the below log file

/opt/macaw/shared/log/mariadb/<cluster-node-ip>/mariadb.log

and look for the below log message which confirms the Node is completely up and in the synced state.

WSREP: Server status change joined -> synced

Additionally, run the below command to verify the MariaDB cluster node is completely up in the synced state.

mysql -u <username> -p<password> -h <node-ip> -P 3306 -e "show status like 'wsrep_local_state_comment';"

+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+

When second or third nodes are coming up after the crash and syncing up with the Cluster's bootstrap node, it may take few minutes or a little longer to be completely up and synced state.

Step-8: On the last MariaDB node, please follow the procedure listed in Step-6

Step-9: Once the MariaDB cluster nodes are completely up and functional, login into Node01 and edit the MariaDB configuration file mariadb.cnf and make sure to add the below parameter at the end of the file and save it. (Configuration file location: /opt/macaw/config/mariadb/<cluster-node-ip>/mariadb.cnf)

wsrep_new_cluster=1

On Node2 & Node03, edit the MariaDB configuration file and make sure the above parameter is not set. This is to make sure macaw CLI starts the Node01 as cluster bootstrap node first when it is used to bring up the MariaDB cluster nodes.

MariaDB Galera cluster node order (i.e Node01, Node02 & Node03) is determined based on the order of comma-separated IP address list provided during themacaw setupcommand which configures initial configuration of the cfxDimensions AIOps platform.

Last updated