Help with Setup using Docker and ZFS Data Storage

I am installing Nextcloud on a new server running Ubuntu Server 20.04 and Docker. I am not running Root on ZFS, but my storage drives for Docker volumes and files / media are using ZFS filesystem.

I have seen so many variations with regards to installing Nextcloud and so I could use some input on what is the recommended approach for my case. The server will be used by 3-5 users. I would like to set it up so that I can manage snapshots / backup of each users storage location (dataset) individually as well as the SQL database and the Redis database.

From what I have gathered I need to deploy 3 containers: Nextcloud, MariaDB, and Redis. I have also seen mentions of a cron container…is this required, and if so, what is it for?

I have some other services that also will use the MariaDB. Is it possible to set this up so that each database is stored on a dedicated/separate dataset?

Appreciate any input and feedback.

there are scheduled jobs. the docker philosophy is one task per container: the app container runs the webserver, and a cron container runs the scheduled jobs (using the same image, same DB, same volumes).

Docker philosophy would be not to share the DB container. usually you would set up one DB container for every application (better separation, e.g. you can isolate the DB so that only containers belonging to a specific application can access that DB).

it is possible, but it makes no sense in my eyes - the files in the file system and the records in the DB belong together, and backing up or recovering only a subset of the whole system results in an inconsistent state. this may not be a problem now, but it may become one. no need to back up the redis DB.

yes, it is possible - you define the volumes of the specific containers to use different datasets, that's all.
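As a rough sketch of what that could look like (the pool name `tank`, dataset names, container names, and passwords here are placeholders, not from the thread):

```shell
# Hypothetical layout: one dataset per application DB. Adjust to your pool.
zfs create tank/docker/nextcloud-db
zfs create tank/docker/otherapp-db

# Each MariaDB container bind-mounts its own dataset as the data directory,
# so each database lives on its own dataset and can be snapshotted separately.
docker run -d --name nextcloud-db \
  -v /tank/docker/nextcloud-db:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=changeme \
  mariadb

docker run -d --name otherapp-db \
  -v /tank/docker/otherapp-db:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=changeme \
  mariadb
```

The same idea works in a compose file: point each service's volume at its own dataset mountpoint.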

i create cron jobs on the host to run the jobs via docker exec ... . there is no need for a cron container.
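A minimal host-side crontab entry for this could look like the following, assuming the Nextcloud container is named `nextcloud` and runs the official image (where the maintenance script is `/var/www/html/cron.php` and the web user is `www-data`):

```shell
# /etc/cron.d/nextcloud -- run Nextcloud background jobs every 5 minutes
# Container name "nextcloud" is an assumption; adjust to your setup.
*/5 * * * * root docker exec -u www-data nextcloud php /var/www/html/cron.php
```

Remember to also switch Nextcloud's background-jobs setting from AJAX to Cron in the admin settings, so it stops relying on page loads to trigger the jobs.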

yes. but kiss - keep it simple & stupid.

in other words, you can use my playbook to set up nextcloud on docker within minutes. you'll find scripts to remove everything in seconds, so you can test everything and experiment a bit. you can set up your "zfs snapshot backup" and try a restore.

you may enable portainer in the inventory to get a nice gui for running your containers. helpful if you are new to containers.


What scheduled tasks do these cron jobs perform? Are they required to run NC (many guides do not mention them at all)?

Thank you for the link. I’ll check it out.

it's a php script for maintenance. you'll get a warning if you don't run it regularly.

I'm not sure I follow what you are saying here. Several tutorials do not mention cron jobs at all. I guess my question is: when and how do I create these cron jobs? Is it only for special use cases of NC?

That makes sense. I guess there is not much additional overhead in running a DB container per service.

So from what you are saying, I would be best off just putting the entire NC data on one main dataset, without "sub" datasets per user or other data?

Ok, I see. So it is basically just a scheduled run of a predefined PHP script provided with NC. I thought maybe the actions themselves were something user-defined.

what do you want to achieve?

I would consider all data within one NC installation as one piece of data, and you back up and restore this single piece of data in one go. The advantage of docker is that you can run multiple instances, like testing and production.

And one hint from me - from what you say, I got the feeling you consider ZFS as your backup strategy… don't do this. it can be a part of a backup strategy, but please always perform some additional backup procedure and store the data outside of your original system (NAS, USB drive, cloud storage).

Don't forget: ZFS can most likely protect your files pretty well, but it is not aware of your database state, and sooner or later a snapshot will hold a corrupted DB file - you should run a real DB-aware backup task, e.g. a dump, to save a consistent DB state.

Soo… this was put on hold since summer, but I'm hoping to get back at it again now 🙂

With regards to ZFS, I am definitely not considering it as my backup strategy; however, if possible I would like to utilize its snapshot capabilities for my backup of Nextcloud data. This is new to me, though, and I am not sure of the best approach.

From what I gather, I will only need to back up the locations where the actual Nextcloud data is stored (/var/www/html/data, which will be bind-mounted to a ZFS dataset on my main storage HDDs) as well as the MariaDB database (which will be stored on a Docker ZFS volume under /var/lib/docker/volumes/).

I assume the backup of these two datasets must happen at the same time so that data and database are in sync? Can this be done while the applications/containers are running, or will they have to be stopped while performing the backup (in case writes are happening while the backup is in progress)?

I was thinking it might be easier to handle the data storage on a per-user basis, for cases where a user might, for instance, want to move their data from the Nextcloud instance over to something else.

Btw, does anyone know if I have to freeze the database when taking a ZFS snapshot, and if so, how that is achieved?

it could work with a running DB, but there is no guarantee - you should always stop the DB from changing files and confirm the files are in a consistent state before you take a snapshot (this applies to all snapshot techniques, e.g. VM snapshots).

I think you have to script the process, e.g.:

  • stop DB (enter maintenance mode might be enough)
  • perform ZFS snapshot
  • start DB (stop maintenance mode)
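The steps above could be scripted roughly like this. The container name `nextcloud`, the dataset names under `tank`, and the `occ` path are assumptions (they match the official Nextcloud image, but adjust them to your setup):

```shell
#!/bin/sh
# Sketch of a consistent-snapshot wrapper -- not a tested production script.
set -e
STAMP=$(date +%Y%m%d-%H%M%S)

# 1. stop writes: put Nextcloud into maintenance mode
docker exec -u www-data nextcloud php /var/www/html/occ maintenance:mode --on

# 2. snapshot the data and DB datasets together, with the same timestamp
zfs snapshot tank/nextcloud-data@backup-$STAMP
zfs snapshot tank/nextcloud-db@backup-$STAMP

# 3. resume normal operation
docker exec -u www-data nextcloud php /var/www/html/occ maintenance:mode --off
```

Because both snapshots are taken while the application is paused, the files and the database records they reference stay consistent with each other.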

Thanks for the feedback. Good to know.

So I should probably make some pre- and post-snapshot scripts then. When you say maintenance mode, do you mean in Nextcloud?

I see there are some mentions of creating a mysqldump when doing a backup of the DB, but would this be required when taking a ZFS snapshot of the whole DB dataset?

your DB files on disk must be consistent before you can count on them!
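For a DB-aware backup alongside the snapshots, a dump via `docker exec` is one option. This is a sketch; the container name `nextcloud-db`, database name `nextcloud`, and target path are placeholders:

```shell
# Dump a consistent copy of the Nextcloud DB to a file on the host.
# --single-transaction gives a consistent read for InnoDB tables
# without locking the database for the duration of the dump.
docker exec nextcloud-db \
  mysqldump --single-transaction -u root -p'changeme' nextcloud \
  > /tank/backup/nextcloud-db-$(date +%F).sql
```

Unlike a raw snapshot of the DB files, a dump like this is always restorable into a fresh MariaDB instance, even a different version.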