Backup in reasonable time

nc15
backup

#1

Hello,

I have the following setup:

  • Ubuntu 18.04 Server (older hardware, 2 GB RAM), small SSD for the system
  • 4 TB HDD for Nextcloud data (currently about 1.2 TB used)
  • 5 TB external HDD (2.5-inch, USB 3) for backup, LUKS encrypted

Before automating anything I tried to do a manual backup (as suggested here, in German):

# put Nextcloud into maintenance mode so files and DB stay consistent
cd /var/www/html/nextcloud
sudo -u www-data php occ maintenance:mode --on
# archive the installation folder, dump the database, then archive the data directory
sudo tar -cpzf /mnt/NC_Backup_scripts_$(date +"%Y%m%d").tar.gz -C /var/www/html/nextcloud .
mysqldump --single-transaction -h localhost --all-databases -u username -p > /mnt/NC_Backup_DB_$(date +"%Y%m%d").sql
sudo tar -cpzf /mnt/NC_Backup_DataDir_$(date +"%Y%m%d").tar.gz -C /media/storage/data .
# turn maintenance mode off again
cd /var/www/html/nextcloud
sudo -u www-data php occ maintenance:mode --off

Problem: Backing up the data directory that way would take at least 16 hours (!) for one TB.
I had the server in maintenance mode from midnight until 8 in the morning (about 500 GB backed up by then), but then had to take it out of maintenance so it could be used again.


So I am looking for ideas how to speed that up significantly, so that I can do a daily backup in some way.

  • Shall I use an unencrypted backup drive?
  • A faster backup HDD?
  • Do I need a faster CPU/server to do the compressing/encrypting in reasonable time?
  • Or do I need some big RAID/LVM system where I can make snapshots etc.?
  • Or do something like an rsync to a second internal HDD, and then back up that one (which can take the whole day…)?

Thanks for any ideas on getting a proper setup. I am ready to buy a new machine, new HDDs etc. if needed. I would just like to follow some best practice and not reinvent the wheel.

Thanks,

Thomas


#2

Hi,

Maybe for a faster backup you could parallelise rsync using fpsync, a tool provided by fpart.

And run tar after the rsync processes have finished.
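A hedged example of what that could look like (the paths are placeholders, and flags can differ between fpsync versions, so check fpsync(1)):

# split the tree into work lists of ~2000 files and run 4 rsync jobs in parallel
fpsync -n 4 -f 2000 /media/storage/data/ /mnt/backup/data/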


#3

That is around 17 MB per second, which is not so bad for an old 2-core CPU. Try skipping compression; that will speed up your backup a bit, but only if you are actually hitting a CPU limit.

Try rsync or fpsync instead. You do not even need to put the server into maintenance mode. I am using the following to simply back up to a remote FS.

Just run something like:

rsync -a --exclude="data/updater*" --exclude="*.ocTransferId*.part" --partial --info=progress2 --delete NextCloudPath BackupFolder/nextcloud/

I personally hit my network bandwidth limit with plain rsync, so fpsync would not bring me any benefit.
Investigate what your bottleneck is:

  1. CPU limit - try disabling compression; encryption also consumes CPU.
    I do it like this: my NC server has an old 2-core CPU, so I move the compression and encryption work to the backup machine:
    tar -cvpf - /NC/FOLDER | nc -q 0 receivingHost 8080
    On the backup server side, compression (and LUKS encryption of the target disk) happens there, because that CPU is more capable:
    nc -l 8080 | gzip > backup.tar.gz

  2. HDD limit - check your HDD's write speed; maybe it is old and cannot handle more MB/s of writes (see the quick test after this list).

  3. USB limit - try moving the HDD to another machine and backing up over the network (if you have at least 1 Gbps networking). But it seems you are using USB 3.0, so have a look at the first point.
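A quick way to check points 2 and 3 is to measure the raw write speed of the backup drive with dd, bypassing the page cache (the target path is just an example):

# write 1 GiB directly to the backup mount and report the throughput
dd if=/dev/zero of=/mnt/ddtest.bin bs=1M count=1024 oflag=direct
rm /mnt/ddtest.bin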


#4

Thanks for the info. Yes, it seems the CPU is the limit: the load has been around 2-3 since the backup started, which makes the server quite slow (the web interface is really sluggish, etc.).

I am still thinking about the overall concept of the backup.

E.g. simply copying files (RAID or rsync) is not a “real” backup, because there are no file versions (“restore the file from 2 weeks ago”) and accidentally deleted files may get deleted in the copy too (in RAID for sure, with rsync depending on the options).

But it is also unrealistic to restore a whole Nextcloud server with 1 TB of data to a two-weeks-old state because one file is missing.
(I am not sure whether the Nextcloud server itself keeps older file versions?)

So one idea is to just copy the server data for the purpose of a restore (if the server/HDD breaks) - which means this is not a real backup.

And then do the “real” file backups on the clients with software like Déjà Dup (or Time Machine on a Mac…), backing up the Nextcloud sync folder. Such software would allow restoring older versions etc. via a nice GUI.
The problem with that: several laptops have just 256 GB SSDs, so only a small part of what is on the server can be kept in sync. At least one client would be needed with enough space to hold ALL the server data.

I do fear the situation where the server (hardware, system SSD…) breaks, which means I have a nice backup of all the data, but it takes me days to reinstall everything and get a new server with the data up and running.
To be safer there, it may make sense to keep a second server (with rsynced data) running (and maybe use that second server to do the actual backup work to external HDDs).


It is all quite confusing. In the past I trusted Google Drive and a cloud-to-cloud backup (Spanning); now I have switched to Nextcloud. It works really well, but I have not yet figured out how to do backups well (including a clear path to restoring the server when hardware breaks etc.).


#5

I did not quite catch this use case. If a user deletes one or a few files, you do not need to restore the whole X GB backup; just find the files in the backup, copy them into the user's folder, and run a command to rescan them:

sudo -u www-data php occ files:scan --all rescans the whole data folder for all users (this will take a while).

sudo -u www-data php occ files:scan --path user_id/files/path/to/file rescans only the restored path.

Or you could use filesystem snapshots (btrfs, ZFS, etc.) to roll back, though that will not help in a disaster where your HDD dies. There is even an app to manage them: https://apps.nextcloud.com/apps/files_snapshots
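As a sketch: if the data directory happened to live on a btrfs subvolume, a read-only snapshot would be a single cheap command (the paths here are hypothetical):

# instant, copy-on-write, read-only snapshot of the data subvolume
sudo btrfs subvolume snapshot -r /media/storage/data /media/storage/.snapshots/data-$(date +%Y%m%d)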

With the Versions app - yes, Nextcloud keeps older file versions. This app is part of the official delivery.
You can even set it up to keep all versions for at least X days: https://docs.nextcloud.com/server/15/admin_manual/configuration_server/config_sample_php_parameters.html#file-versions
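As an untested sketch, that parameter can also be set via occ instead of editing config.php by hand; the value format is “minimum, maximum” in days, so “30, auto” keeps every version for at least 30 days:

sudo -u www-data php occ config:system:set versions_retention_obligation --value="30, auto"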

A backup of NC should have two parts: a backup of the DB (you did that with the mysqldump command) and a backup of the data folder. Optionally also the config, or the whole NC folder including apps.

If you are going for the “cheap” solution without RAID, just rsync periodically to your external HDD. That will keep your data safe. Otherwise, have a look at RAID.

That should be possible with rsync as well, or use restic for backups to cloud storage. But without a GUI :slight_smile:


#6

Current status:
Found out: the server has only USB 2. The maximum transfer rate is about 30 MB/s uncompressed, and about 20 MB/s with gzip (CPU limit).

Also, my tar backup command stopped working. It worked just once, for the first backup of about 1 TB. Now the amount of data has grown slightly (to about 1.3 TB) and the command stops at about 100-200 GB (no matter whether I use plain uncompressed tar or tar with gzip; I tried both).

So rsync seems to be the only way to go, making one synced version per day (hardlinking the unchanged files - I need to learn more about that).
I read that rsync needs lots of RAM, so I am sceptical whether that will work on this machine (1.3 TB of data, over 300k files).

I would need to add a USB 3 expansion card and more RAM - or just get a newer (used) machine.
It is amazing how well Nextcloud handles/serves that amount of data on such an old machine, but when it comes to backup (and restore…) I see real limits.


#7

I use rsnapshot. It only transfers (and stores) the differences since the last backup and lets you access the state from x hours/days/weeks back. There is a whole range of similar scripts; choose the one that suits you best.
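For illustration, a minimal rsnapshot setup could look like this (the paths are examples, and rsnapshot.conf requires tabs, not spaces, between fields):

# /etc/rsnapshot.conf (excerpt) - separate fields with tabs
snapshot_root	/mnt/backup/snapshots/
retain	daily	7
retain	weekly	4
backup	/media/storage/data/	nextcloud/

# crontab entries to create and rotate the snapshots
0 2 * * *	/usr/bin/rsnapshot daily
30 3 * * 0	/usr/bin/rsnapshot weekly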


#8

I now use an rsync script based on this (German) guide: https://wiki.ubuntuusers.de/Skripte/Backup_mit_RSYNC/

I think it is similar to rsnapshot. Basically it creates a new folder for every day, copies only new files, and hardlinks all unchanged files; a minimal sketch of the idea is shown below.
(This is, by the way, also what Apple's Time Machine does, just with a fancier interface.)
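The core of such a scheme with plain rsync and --link-dest might look like this (the paths and folder naming are assumptions, not the exact ubuntuusers script):

# each day gets its own folder; unchanged files are hardlinked to yesterday's copy
TODAY=$(date +%Y-%m-%d)
YESTERDAY=$(date -d yesterday +%Y-%m-%d)
rsync -a --delete \
  --link-dest=/mnt/backup/$YESTERDAY/ \
  /media/storage/data/ /mnt/backup/$TODAY/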

I had no problems with RAM - the machine has 2 GB of RAM and uses about 1 GB of swap (on an SSD) - rsync worked well (syncing over 300k files…). During the backup the load goes up quite a bit, but Nextcloud is still usable.

I am now thinking about getting a slightly newer (desktop) machine, for USB 3, more RAM and a faster CPU. LVM snapshots could also be helpful - roughly as sketched below.
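A rough sketch of the LVM snapshot idea (the volume group and LV names are made up): the snapshot freezes a view of the data volume so it can be backed up while Nextcloud keeps running:

# 10 GiB copy-on-write snapshot of the (hypothetical) data LV
sudo lvcreate --size 10G --snapshot --name nc_snap /dev/vg0/ncdata
sudo mount -o ro /dev/vg0/nc_snap /mnt/snap
rsync -a /mnt/snap/ /mnt/backup/data/   # back up the frozen state
sudo umount /mnt/snap
sudo lvremove -y /dev/vg0/nc_snap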


#9

Exactly.

Or directly use a copy-on-write filesystem (ZFS/btrfs). Even NextcloudPi on the Raspberry Pi uses btrfs.