Using OpenZFS with ZFS encryption and replication

Hi all.

I have successfully built a proof of concept that combines high availability, easy maintenance, strong file integrity (protection against data loss from corruption or damaged disks) and high security, using only open-source, proven tools.

Here is the list of tools involved:

  • OpenZFS as file system
  • Apache2 as web server for each node
  • PHP 8.2 FPM for each node
  • Redis for file locking and cross node caching
  • PostgreSQL for its excellent tuning options and streaming replication (native load balancing and replication)
  • HAProxy as reverse proxy

OpenZFS
This is excellent for mirrored disks. Furthermore, it adds features that are perfect for load-balanced/high-availability setups. Not only is OpenZFS replication orders of magnitude faster than tools like rsync, it also natively supports snapshots. On top of that comes OpenZFS encryption. It lets you encrypt your datasets (not the ZFS structure, though) without giving up any of the nice features of OpenZFS. This makes it possible to keep an extra read-only mirror in a public cloud somewhere as the ultimate backup location.
By using mirrored disks on independent machines, I made it possible for the local LXC containers on each machine to use the same dataset/pool. So when I create new Nextcloud nodes (PHP-FPM and Apache servers), they simply attach the LXC host's data pool, my central Redis is extended with the new node, and file locking and caching are active right away.
It is ridiculously simple to set up; however, it is not a setup for beginners. I do, however, recommend this approach for further enterprise-level setups, as so far it seems to perform very well, and there are fewer moving parts and more native features that handle the tricky parts.
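
For readers who want a feel for the encrypted replication to a read-only mirror described above, here is a minimal Python sketch that only builds (and prints) the command lines for a raw, incremental send. It does not run anything. The dataset names, snapshot names and SSH target are hypothetical placeholders and are not taken from the setup described here; `zfs snapshot`, `zfs send -w -i` and `zfs receive -u` are standard OpenZFS commands, but check the flags against your own version before using them.

```python
import shlex
from typing import List, Optional


def snapshot_cmd(dataset: str, snap: str) -> List[str]:
    """Command that creates the snapshot dataset@snap."""
    return ["zfs", "snapshot", f"{dataset}@{snap}"]


def raw_send_cmd(dataset: str, new_snap: str,
                 prev_snap: Optional[str] = None) -> List[str]:
    """Command for a raw send (-w): encrypted blocks are sent as-is.

    If prev_snap is given, the send is incremental (-i) from that snapshot.
    """
    cmd = ["zfs", "send", "-w"]
    if prev_snap is not None:
        cmd += ["-i", f"{dataset}@{prev_snap}"]
    cmd.append(f"{dataset}@{new_snap}")
    return cmd


def receive_cmd(target_dataset: str) -> List[str]:
    """Command that receives the stream without mounting it (-u)."""
    return ["zfs", "receive", "-u", target_dataset]


def replication_pipeline(dataset: str, new_snap: str, prev_snap: Optional[str],
                         ssh_target: str, remote_dataset: str) -> str:
    """Shell pipeline: local raw send piped into a remote receive over SSH."""
    send = shlex.join(raw_send_cmd(dataset, new_snap, prev_snap))
    recv = shlex.join(["ssh", ssh_target] + receive_cmd(remote_dataset))
    return f"{send} | {recv}"


if __name__ == "__main__":
    # All names below are hypothetical examples.
    print(shlex.join(snapshot_cmd("tank/nextcloud", "2024-01-02")))
    print(replication_pipeline("tank/nextcloud", "2024-01-02", "2024-01-01",
                               "backup@mirror.example.org", "backup/nextcloud"))
```

Because the send is raw, the data stays encrypted in transit and on the mirror, so the remote side does not need the encryption key. Marking the target read-only would be a separate step on the mirror (for example `zfs set readonly=on` on the received dataset).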

The more I explore OpenZFS and the more cool features I uncover, the more confident I am that OpenZFS is the best filesystem to use for Nextcloud.
With the streaming replication features and superb tuning options offered by PostgreSQL, the setup has so far performed better, more reliably and with fewer errors than it did with MariaDB. And yes, I did spend a few years constantly tweaking, tuning and improving the performance of MariaDB.

Awesome work, @Kerasit!

Any chance you’ve got instructions and/or code (Ansible/Puppet/…) published for this setup?

I am on vacation, so not yet. And as this is a POC, I have not set up any automation yet, so there are no scripts. Full and proper POC documentation will follow; however, it is tailored specifically to my own infrastructure. It is in no way ready for any “T-Shirt sizes”, and any setup like this needs specific tailoring to each organization's unique infrastructure. As I wrote, this is not for beginners.

To make a setup like this, you need the perfect conditions.

First of all, I am not using Docker. I am an LXC fanboy, but I see no reason why you should not be able to do this kind of setup with Docker containers.

What is important for this setup is that the OpenZFS hosts are bare metal servers. The main load balancer and reverse proxy is bare metal as well. For maximum effect, each PostgreSQL instance should be installed as an LXC container, sure, but on separate LXD hosts. For this POC they are running on the same hosts as OpenZFS; however, the storage used for the databases is on local pools on EXT4 filesystems. I would have made independent OpenZFS disks, RAIDs and volumes tailored for databases if I had had the resources, but this is a private project, and I do not have that amount of resources.

Anything “dynamic” is based on containers:
database nodes and “full-fledged” Nextcloud Apache2 and PHP-FPM servers.

The Redis server is also an LXC container; however, there is only one, and it could just as well be bare metal.
This works because my network layer, DNS setup, HAProxy setup and other parts are set up in a certain way that is possible in my own infrastructure. This is why I call this a proof of concept and not a solution ready for others. And I bet this will never be an officially supported solution, as it depends on infrastructure and middleware more than on self-contained, isolated “towers” like the AIO packages.
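
How a new container node ends up behind the load balancer depends entirely on the local infrastructure, but the general idea can be sketched as rendering an HAProxy backend from a list of nodes. This is an illustration only: the backend name, node names, IP addresses and the `/status.php` health-check path are assumptions, not the configuration used in this proof of concept.

```python
from typing import List, Tuple


def render_backend(name: str, nodes: List[Tuple[str, str]],
                   health_path: str = "/status.php") -> str:
    """Render a simple HAProxy backend section as text.

    nodes is a list of (server_name, "ip:port") pairs; each one becomes a
    `server` line with health checking enabled.
    """
    lines = [
        f"backend {name}",
        "    balance roundrobin",
        f"    option httpchk GET {health_path}",
    ]
    for server_name, address in nodes:
        lines.append(f"    server {server_name} {address} check")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical Nextcloud web nodes (Apache2 + PHP-FPM containers).
    nodes = [("nc-node1", "10.0.0.11:80"), ("nc-node2", "10.0.0.12:80")]
    print(render_backend("nextcloud_nodes", nodes), end="")
```

Adding a node then means appending one entry to the list and reloading HAProxy with the regenerated section; how that fits with DNS and the rest of the network layer is exactly the part that has to be tailored per infrastructure.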

Use it as inspiration for designing your own setup, matched to the components you are using yourself.