Help me test this 3-node cluster

Nice. So I can set up my Galera cluster again and then use HAProxy in an active/backup manner? Can you confirm it's safe to use? I only need a failover DB, I don't care about balancing reads.

So hold on. Here's my 3-node Galera cluster sitting behind HAProxy:

listen galera
    bind 10.10.20.8:3306
    balance leastconn
    mode tcp
    option tcpka
    option mysql-check user haproxy
    server node1 10.10.20.4:3306 check weight 1
    server node2 10.10.20.5:3306 check weight 1
    server node3 10.10.20.6:3306 check weight 1

Is this fine? It will only connect to one of those at a time, and they obviously sync almost immediately to the other nodes doing their cluster thing.

If I had MaxScale, the only difference, I guess, would be that I'd dedicate one of those nodes permanently to writes and the other two to reads, rather than using round-robin or leastconn.

Correct. HAProxy for failover only should be totally fine.
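
If you want strictly active/backup behaviour rather than balancing, one way (a minimal sketch reusing your addresses, untested) is to mark two of the servers with the backup keyword, so HAProxy only ever sends connections to a single node and fails over when its health check fails:

listen galera
    bind 10.10.20.8:3306
    mode tcp
    option tcpka
    option mysql-check user haproxy
    server node1 10.10.20.4:3306 check
    server node2 10.10.20.5:3306 check backup
    server node3 10.10.20.6:3306 check backup

That way node1 takes all traffic until it goes down, and node2/node3 only step in on failover, which keeps the cluster effectively single-writer.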

Ugh, that's the worst :stuck_out_tongue:
So I canned the PoC for no reason, based on the information above :frowning:

I guess I'll kick it back off.

Turns out my setup was correct after all :slight_smile:

I'm still pondering this. How does HAProxy get around:

A multi-master setup with Galera cluster is not supported, because we require READ-COMMITTED as transaction isolation level. Galera doesn't support this with a master-master replication which will lead to deadlocks during uploads of multiple files into one directory for example.

Thanks for the good discussion. I'm looking to set up a failover server soon for my Nextcloud install (well, actually, a new server will become the master).

It's good to hear there's a relatively clear path forward for this.
I'm planning on using Ceph for the storage layer.

Out of interest, would this be easier if we used PostgreSQL instead of MariaDB?

@enoch85, what do you know about this? :slight_smile:

Sorry, I don't run a clustered environment (yet), but from what I've read it should be easier with PostgreSQL, though the Nextcloud documentation says otherwise. The reason for this, I think, is that the Nextcloud devs are more familiar with MariaDB and therefore got it working more smoothly than with PostgreSQL.

Given your knowledge and comments around PostgreSQL, I figured you'd have some good input, and I was not disappointed. Ta :wink:

NP @JasonBayton. Anytime!

Just wanted to join the interesting conversation.

As far as I understand the Galera docs, the transaction isolation is the same as on a single instance when you connect only to a single cluster node (maybe that's the meaning of Master-Slave in Galera).

The isolation levels that are supported cluster-wide are:

  • READ-UNCOMMITTED
  • READ-COMMITTED
  • REPEATABLE-READ

For transactions issued on different nodes, isolation is also strengthened by the "first committer wins" rule, which eliminates the "lost update anomaly" inherent to these levels, whereas for transactions issued on the same node this rule does not hold (as per original MySQL/InnoDB behavior).
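
As a side note, you can check and adjust the isolation level per session on any node; a quick sketch (the variable is named tx_isolation on older MariaDB versions and transaction_isolation on newer ones):

-- check the current session's isolation level (older variable name)
SELECT @@SESSION.tx_isolation;
-- change it for this session only
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;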

There's only one exception: when the cluster is configured with wsrep_sync_wait = 1, which enforces strict cluster-wide causality checks but can, of course, result in increased latency.

When you enable this parameter, the node triggers causality checks in response to certain types of queries. During the check, the node blocks new queries while the database server catches up with all updates made in the cluster to the point where the check was begun. Once it reaches this point, the node executes the original query.

Setting wsrep_sync_wait to 1 checks on READ statements, including SELECT, and BEGIN / START TRANSACTION.
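
If you'd rather enforce this for every connection instead of per session, it can also be set in the server configuration on each node; a sketch, assuming a typical MariaDB include file (the path is just an example):

# /etc/mysql/conf.d/galera.cnf (example path)
[mysqld]
wsrep_sync_wait = 1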

This way I was able to use the MaxScale load balancer in front of a Galera cluster with the ReadWriteSplit module on my dev instance. It works without any problems atm, but needs further testing with far more users.
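
For reference, a rough sketch of what the MaxScale side can look like (section names, user, and password are placeholders, and the node addresses are reused from earlier in the thread; not my exact config):

[node1]
type=server
address=10.10.20.4
port=3306
protocol=MariaDBBackend
# node2 and node3 are defined the same way

[Galera-Monitor]
type=monitor
module=galeramon
servers=node1,node2,node3
user=maxscale
password=secret

[RW-Split-Service]
type=service
router=readwritesplit
servers=node1,node2,node3
user=maxscale
password=secret

[RW-Split-Listener]
type=listener
service=RW-Split-Service
protocol=MariaDBClient
port=4006

The galeramon monitor picks one synced node as the write target and readwritesplit routes reads across the others.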

The better approach here would be for the Nextcloud code to use wsrep_sync_wait = 1 whenever a Galera cluster is used for the DB backend. When a query that needs READ-COMMITTED isolation is sent to the DB, the session can be adjusted, e.g.:

SET SESSION wsrep_sync_wait = 1;
SELECT * FROM example WHERE field = 'value';
SET SESSION wsrep_sync_wait = 0;


I have to chip in here and ask what the current stance on the situation is.