Backup strategy question using Duplicati

Hi people,

I’ve been using Nextcloud for a few years now, and it’s starting to take up a lot of space. My issue is that backing it up every night is starting to take a stupid amount of time. I’m up to ~8 hours, not because there are that many changes but because of scanning the hundreds of thousands of files.
I’m using Unraid, which is fast enough for normal use but dog slow when scanning large amounts of data like this.

For example, appdata_XXXXXX/preview takes hours to process. I believe those are just thumbnails; is it safe to exclude them from the backups entirely? Presumably they’d be regenerated from the real files if missing?

Is there some way to do “smart” backups, either by using the database to figure out which files changed and need to be uploaded, or even by backing up Nextcloud continuously, with an asynchronous write to object storage on every local write? Are there any apps I may have missed for something like that, or other tips?

Thanks!

First you can exclude data/updater-*/,backups

I exclude data/appdata_* too, since the apps can recreate it.
I use restic, which also produces smaller backups by deduplicating data.
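
To get a feel for how much an exclusion would shrink the scan before changing the real backup job, a small script can count the files left over once some glob patterns are applied. This is only a minimal sketch: the function name `count_scanned_files` is hypothetical, and the matching rule (Python's `fnmatch` against directory paths relative to the data root) is an assumption for illustration, not how Duplicati or restic evaluate their own filters. It also says nothing about whether a given directory is safe to leave out.

```python
import fnmatch
import os


def count_scanned_files(root, exclude_patterns):
    """Count files under root that remain after pruning excluded directories.

    A directory whose path relative to root matches any pattern is pruned,
    so nothing below it is walked. Returns (kept_file_count, pruned_dirs).
    """
    kept = 0
    pruned = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        for d in list(dirnames):
            # Build the path relative to root, with forward slashes.
            rel = os.path.normpath(os.path.join(rel_dir, d)).replace(os.sep, "/")
            if any(fnmatch.fnmatchcase(rel, p) for p in exclude_patterns):
                dirnames.remove(d)  # os.walk will not descend into it
                pruned.append(rel)
        kept += len(filenames)
    return kept, sorted(pruned)
```

Calling it once with an empty pattern list and once with, say, `["updater-*", "appdata_*/preview"]` on the Nextcloud data directory shows how many files each exclusion removes from the nightly scan.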

I already delete the updater stuff so that’s not backed up.

I use Duplicati, which also does deduplication, but to do so it still needs to scan the files, and there are just so many of them. My issue isn’t the size of the backup, just the time it takes for the whole cycle to complete.

Good to know I can exclude appdata_*; that’s most of the files right there, so that’ll help! Thanks.

This is not a good idea. There is some persistent data in it that you should back up as well.

Have you considered incremental backups? I do this and it takes seconds to minutes on a directly connected disk.

They are incremental, but you still need to scan every file to know whether it exists in the backup or not.
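
The point about incremental runs can be sketched in a few lines of Python. This is a simplified illustration, assuming change detection by comparing each file's size and modification time against a manifest saved by the previous run. The name `scan_for_changes` and the manifest layout are hypothetical, and this is not Duplicati's actual implementation. What it shows is that even when nothing has changed, every file still costs one `stat` call, which is where the time goes on a slow filesystem.

```python
import os


def scan_for_changes(root, previous):
    """Compare files under root with the manifest from the last run.

    previous maps relative path -> (size, mtime_ns).
    Returns (changed_paths, current_manifest, stat_calls).
    """
    changed = []
    current = {}
    stat_calls = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)  # one metadata lookup per file, changed or not
            stat_calls += 1
            rel = os.path.relpath(full, root)
            current[rel] = (st.st_size, st.st_mtime_ns)
            if previous.get(rel) != current[rel]:
                changed.append(rel)  # new or modified since the last run
    return sorted(changed), current, stat_calls
```

On a second run with no modifications, `changed_paths` is empty but `stat_calls` still equals the total number of files, so the scan time scales with the file count rather than with the amount of changed data.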

Using what tool? Please be very specific on exactly how you handle backups.

As mentioned, Duplicati.

Have you searched the Duplicati documentation and their forum? https://forum.duplicati.com/
That would be the place to learn more about improving efficiency with their tool.

My suggestion is to ask this question there and link back to this post.