Exclude files/folders from data backup

I have a Nextcloud server running in Docker. A cron job periodically puts it into maintenance mode, creates a MySQL dump of the database, and backs up the data folder (/var/www/html/data) to an external hard drive (as described here: https://docs.nextcloud.com/server/latest/admin_manual/maintenance/backup.html).

Since the backup is stored incrementally, the used disk space keeps growing, and my backup disks are now full.

My questions:
Can I exclude subfolders within my user account (such as the files_trashbin folder, or work folders that don't need to be backed up) from the backup without running into problems when I restore Nextcloud from this backup (SQL database)?
Will the database and the data folder be inconsistent if some files or folders are missing?


Are you using this command from the documentation?

rsync -Aavx nextcloud/ nextcloud-dirbkp_`date +"%Y%m%d"`/

I don't think that is an incremental backup; I think these are daily full backups.
You can check with “stat”:

stat nextcloud-dirbkp_20220218/path/to/file

In an incremental backup, older (unchanged) files must show a hard link count greater than 1 (the Links field; the example below shows Links: 1).
Please test it and post an example of the output.

File: file.txt
  Size: 4030      	Blocks: 8          IO Block: 4096   regular file
Device: 801h/2049d	Inode: 13633379    Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/   linuxize)   Gid: ( 1000/   linuxize)
Access: 2019-11-06 09:52:17.991979701 +0100
Modify: 2019-11-06 09:52:17.971979713 +0100
Change: 2019-11-06 09:52:17.971979713 +0100
 Birth: -

For incremental backups you need the rsync option --link-dest.

If you exclude files from the backup, you must run this command after a restore:

sudo -u www-data php occ files:scan --all

But I don’t know if it will work for the trash and/or versions.
Read this documentation.

I think you can also configure Nextcloud to limit e.g. trash and versions automatically. Then you don't need to exclude them from backup/restore.
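
As a rough illustration, the retention for trash and versions can be set through the config.php keys trashbin_retention_obligation and versions_retention_obligation, for example via occ. The value "auto, 30" is only an example, and the occ invocation assumes the same setup as the command above; in a Docker installation it would have to be run inside the container.

```shell
# Example values only: delete trashed files and old versions after
# 30 days at the latest, and earlier if space is needed.
sudo -u www-data php occ config:system:set trashbin_retention_obligation --value="auto, 30"
sudo -u www-data php occ config:system:set versions_retention_obligation --value="auto, 30"
```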

Let me clarify: I'm not running that exact rsync command to back up my data. Still, there is a lot of work/temporary data within the accounts that doesn't need to go into the backup.

My idea is to use an “archive” folder that is included in the backup while all other folders are not. But I am unsure whether this is feasible and whether I would still be able to restore the cloud after a crash.