Help me test this 3-node cluster

Nice. So I can set up my Galera cluster again and then use HAProxy in an active/backup manner? Can you confirm it's safe to use? I only need a failover DB, I don't care about balancing reads.

So hold on. Here's my 3-node Galera cluster sitting behind HAProxy:

listen galera
    bind 10.10.20.8:3306
    balance leastconn
    mode tcp
    option tcpka
    option mysql-check user haproxy
    server node1 10.10.20.4:3306 check weight 1
    server node2 10.10.20.5:3306 check weight 1
    server node3 10.10.20.6:3306 check weight 1

Is this fine? It will only connect to one of those at a time, and they obviously sync almost immediately to the other nodes doing their cluster thing.

If I had MaxScale, the only difference, I guess, would be that I'd dedicate one of those nodes permanently to writes and the other two to reads, rather than using round-robin or leastconn.

Correct. HAProxy for failover only should be totally fine.
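
If you want strictly active/backup behaviour rather than balancing, one way (a minimal sketch reusing your addresses, untested) is to mark two of the servers with the backup keyword, so HAProxy only ever sends connections to a single node and fails over when its health check fails:

listen galera
    bind 10.10.20.8:3306
    mode tcp
    option tcpka
    option mysql-check user haproxy
    server node1 10.10.20.4:3306 check
    server node2 10.10.20.5:3306 check backup
    server node3 10.10.20.6:3306 check backup

That way node1 takes all traffic until it goes down, and node2/node3 only step in on failover, which keeps the cluster effectively single-writer.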

Ugh, that's the worst :stuck_out_tongue:
So I canned the PoC for no reason, based on the information above :frowning:

I guess I'll kick it back off.

Turns out my setup was correct after all :slight_smile:

I'm still pondering this. How does HAProxy get around:

A multi-master setup with Galera cluster is not supported, because we require READ-COMMITTED as transaction isolation level. Galera doesn't support this with a master-master replication which will lead to deadlocks during uploads of multiple files into one directory for example.

Thanks for the good discussion. I'm looking to set up a failover server soon for my Nextcloud install (well, actually, a new server will become the master).

It's good to hear there's a relatively clear path forward for this.
I'm planning on using Ceph for the storage layer.

Out of interest, would this be easier if we used PostgreSQL instead of MariaDB?

@enoch85, what do you know about this? :slight_smile:

Sorry, I don't run a clustered environment (yet), but from what I've read it should be easier with PostgreSQL, though the Nextcloud documentation says otherwise. The reason for this, I think, is that the Nextcloud devs are more familiar with MariaDB and therefore got it working more smoothly than with PostgreSQL.

Given your knowledge and comments around PostgreSQL, I figured you'd have some good input, and I was not disappointed. Ta :wink:

NP @JasonBayton. Anytime!

Just wanted to join the interesting conversation.

As far as I understand the Galera docs, the transaction isolation is the same as on a single instance when you connect only to a single cluster node (maybe that's the meaning of Master-Slave in Galera).

The isolation levels that are supported cluster-wide are:

  • READ-UNCOMMITTED
  • READ-COMMITTED
  • REPEATABLE-READ

For transactions issued on different nodes, isolation is also strengthened by the "first committer wins" rule, which eliminates the "lost update anomaly" inherent to these levels, whereas for transactions issued on the same node this rule does not hold (as per original MySQL/InnoDB behavior).
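
As a side note, you can check and adjust the isolation level per session on any node; a quick sketch (the variable is named tx_isolation on older MariaDB versions and transaction_isolation on newer ones):

-- check the current session's isolation level (older variable name)
SELECT @@SESSION.tx_isolation;
-- change it for this session only
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;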

There's only one exception: when the cluster is configured with wsrep_sync_wait = 1, which enforces strict cluster-wide causality checks but can, of course, result in increased latency.

When you enable this parameter, the node triggers causality checks in response to certain types of queries. During the check, the node blocks new queries while the database server catches up with all updates made in the cluster to the point where the check was begun. Once it reaches this point, the node executes the original query.

Setting wsrep_sync_wait to 1 checks on READ statements, including SELECT, and BEGIN / START TRANSACTION.
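
If you'd rather enforce this for every connection instead of per session, it can also be set in the server configuration on each node; a sketch, assuming a typical MariaDB include file (the path is just an example):

# /etc/mysql/conf.d/galera.cnf (example path)
[mysqld]
wsrep_sync_wait = 1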

This way I was able to use the MaxScale load balancer in front of a Galera cluster with the ReadWriteSplit module on my dev instance. It works without any problems atm, but needs further testing with far more users.
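
For reference, a rough sketch of what the MaxScale side can look like (section names, user, and password are placeholders, and the node addresses are reused from earlier in the thread; not my exact config):

[node1]
type=server
address=10.10.20.4
port=3306
protocol=MariaDBBackend
# node2 and node3 are defined the same way

[Galera-Monitor]
type=monitor
module=galeramon
servers=node1,node2,node3
user=maxscale
password=secret

[RW-Split-Service]
type=service
router=readwritesplit
servers=node1,node2,node3
user=maxscale
password=secret

[RW-Split-Listener]
type=listener
service=RW-Split-Service
protocol=MariaDBClient
port=4006

The galeramon monitor picks one synced node as the write target and readwritesplit routes reads across the others.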

The better approach here would be for the Nextcloud code to use wsrep_sync_wait = 1 whenever a Galera cluster is used for the DB backend. When a query that needs READ-COMMITTED isolation is sent to the DB, the session can be adjusted, e.g.:

SET SESSION wsrep_sync_wait = 1;
SELECT * FROM example WHERE field = 'value';
SET SESSION wsrep_sync_wait = 0;


I have to chip in here and ask what the current stance on the situation is.