Nice. So I can set up my Galera cluster again and then use HAProxy in an active/backup manner? Can you confirm it is safe to use? I only need a failover DB, I don't care about balancing reads.
So hold on. In my 3-node Galera cluster sitting behind an HAProxy:
listen galera
    bind 10.10.20.8:3306
    balance leastconn
    mode tcp
    option tcpka
    option mysql-check user haproxy
    server node1 10.10.20.4:3306 check weight 1
    server node2 10.10.20.5:3306 check weight 1
    server node3 10.10.20.6:3306 check weight 1
This is fine? It will only connect to one of those at a time, and they obviously sync almost immediately to the other nodes doing their cluster thing.
If I had MaxScale, the only difference, I guess, would be that I'd dedicate one of those nodes permanently for writes and the other two for reads, rather than round-robin or leastconn.
Correct. HAProxy for failover only should be totally fine.
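For a failover-only setup, one common approach is HAProxy's `backup` keyword, which keeps all traffic on a single node and only fails over to the others when it goes down. A minimal sketch, reusing the addresses from the config above (untested, adjust to your environment):

```
listen galera
    bind 10.10.20.8:3306
    mode tcp
    option tcpka
    option mysql-check user haproxy
    server node1 10.10.20.4:3306 check
    server node2 10.10.20.5:3306 check backup
    server node3 10.10.20.6:3306 check backup
```

With this, node2 and node3 receive connections only if node1 fails its health check, so the cluster is effectively used in a master/slave fashion even though Galera itself is multi-master.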
Ugh, that's the worst.
So I canned the PoC for no reason, based on the information above.
I guess I'll kick it back off.
My setup was correct after all.
I'm still pondering this. How does HAProxy get around:
A multi-master setup with Galera cluster is not supported, because we require READ-COMMITTED as transaction isolation level. Galera doesn't support this with a master-master replication which will lead to deadlocks during uploads of multiple files into one directory for example.
Thanks for the good discussion. I am looking to set up a failover server soon for my Nextcloud install (well, actually, a new server will become the master).
Itās good to hear thereās a relatively clear path forward for this.
Iām planning on using ceph for the storage layer.
Out of interest, would this be easier if we used PostgreSQL instead of MariaDB?
Sorry, I don't run a clustered environment (yet), but from what I've read it should be easier with PostgreSQL, though the Nextcloud documentation says otherwise. The reason for this, I think, is that the Nextcloud devs are more familiar with MariaDB and therefore got it working more smoothly than with PostgreSQL.
Given your knowledge and comments around PostgreSQL, I figured you'd have some good input; I was not disappointed. Ta.
Just wanted to join the interesting conversation.
As far as I understand the Galera docs, the transaction isolation is the same as on a single instance when you connect only to a single cluster node (maybe that's the meaning of master-slave in Galera).
Overall, the isolation levels that are supported cluster-wide are:
- READ-UNCOMMITTED
- READ-COMMITTED
- REPEATABLE-READ
For transactions issued on different nodes, isolation is also strengthened by the "first committer wins" rule, which eliminates the "lost update anomaly" inherent to these levels, whereas for transactions issued on the same node this rule does not hold (as per original MySQL/InnoDB behavior).
There's only one exception: when the cluster is configured with wsrep_sync_wait = 1, which enforces strict cluster-wide causality checks but can of course result in increased latency.
When you enable this parameter, the node triggers causality checks in response to certain types of queries. During the check, the node blocks new queries while the database server catches up with all updates made in the cluster to the point where the check was begun. Once it reaches this point, the node executes the original query.
Setting wsrep_sync_wait to 1 enables these checks for READ statements, including SELECT and BEGIN / START TRANSACTION.
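To make this concrete, the variable can also be flipped on cluster-wide at runtime rather than per session. A minimal sketch on a Galera node (the variable name is from the Galera docs; run against your own cluster to verify):

```sql
-- Enable cluster-wide causality checks for READ statements on this node
SET GLOBAL wsrep_sync_wait = 1;

-- Confirm the current setting
SHOW VARIABLES LIKE 'wsrep_sync_wait';
```

Note that SET GLOBAL only affects new connections on that node; existing sessions keep their old value unless they set it themselves.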
This way I was able to use the MaxScale load balancer in front of a Galera Cluster with the ReadWriteSplit module on my dev instance. It works without any problems atm, but needs further testing with way more users.
The better approach here would be to set wsrep_sync_wait = 1
in the Nextcloud code when a Galera Cluster is used for the DB backend. When a query that needs READ-COMMITTED isolation is sent to the DB, the session can be adjusted, e.g.:
SET SESSION wsrep_sync_wait = 1;
SELECT * FROM example WHERE field = 'value';
SET SESSION wsrep_sync_wait = 0;
I have to chip in here and ask what the current stance on the situation is.