Which files below `data/appdata_*` and `data/updater-*` need to be included in the backup?

Currently, I back up the entire Nextcloud installation (i.e. /var/www/nextcloud). I wonder whether this is necessary, or whether a subset of the files would be sufficient and would save some backup space (and bandwidth).

Question:

  • Is there any documentation (and if so, where) that specifies which directories/files must be backed up and which can safely be skipped because they are source code, caches, logs or some other kind of data that can be re-created if necessary?
  • Which files must be backed up even though they can be re-created, because records in the SQL database point to them and would become dangling if the database were restored from a backup without those files?

Last night, the backup took a long time (much longer than usual). I noticed that the directory ./data/appdata_*/preview was the culprit, as it had grown to 170G. As far as I understand, the files in that directory are (re-)created by a nightly cron job via /usr/bin/php -f /var/www/nextcloud/occ preview:pre-generate --no-interaction --no-ansi. Hence, I assume that it would be OK to exclude that directory from the backup. It should be sufficient to back up only the actual files, which reside somewhere inside the user directories below ./data/.
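To see which sub-directories account for the space, something like the following could be used. It is only a minimal sketch: the function names `dir_size` and `largest_subdirs` are made up for illustration, and it simply sums file sizes without following symlinks.

```python
import os
from pathlib import Path


def dir_size(path: Path) -> int:
    """Return the total size in bytes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = Path(root) / name
            # Skip symlinks so that link targets are not counted.
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def largest_subdirs(parent: Path, top: int = 10) -> list[tuple[str, int]]:
    """Return (name, size) for the largest direct sub-directories of parent."""
    sizes = [
        (child.name, dir_size(child))
        for child in parent.iterdir()
        if child.is_dir()
    ]
    # Largest first; keep only the first `top` entries.
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]
```

Calling `largest_subdirs` on each directory matching `appdata_*` below the data directory would list `preview` first in a situation like the one described here.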

I also noted that ./data/updater-*/{backups,downloads} contains the backups and downloads, respectively, of former Nextcloud upgrades. It seems that these can also safely be excluded from the backup. However, I am not sure whether there are SQL records that point to those preview files.

However, there are other directories inside ./data/appdata_*/ I am not sure about, e.g. calendar, identityproof, mail, etc.
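The exclusions considered so far could be expressed as a small path filter. This is only a sketch of the idea, not a recommendation: whether skipping these directories is actually safe is exactly the open question here. The pattern list and the function name `is_excluded` are illustrative, and paths are taken relative to the data directory.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Assumption: these are the directories that can be re-created or are
# only needed during an upgrade. Other appdata folders (calendar, mail,
# identityproof, ...) are deliberately NOT listed.
EXCLUDE_PATTERNS = [
    "appdata_*/preview",
    "updater-*/backups",
    "updater-*/downloads",
]


def is_excluded(rel_path: str, patterns=EXCLUDE_PATTERNS) -> bool:
    """Return True if rel_path lies at or below an excluded directory."""
    parts = PurePosixPath(rel_path).parts
    for pattern in patterns:
        depth = len(PurePosixPath(pattern).parts)
        # Compare only the leading components against the pattern, so
        # everything below an excluded directory is excluded as well.
        if len(parts) >= depth and fnmatchcase("/".join(parts[:depth]), pattern):
            return True
    return False
```

For example, `is_excluded("appdata_oc123/preview/1/2/3.jpg")` is true, while a user file such as `alice/files/photo.jpg` or another appdata folder such as `appdata_oc123/mail/x` is kept.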


https://docs.nextcloud.com/server/stable/admin_manual/maintenance/backup.html

With all this, you just need the Nextcloud code itself and then you can quickly restore your setup exactly like it was. Not all information within is really unique as you already pointed out (e.g. previews).

Yes, I think they are mainly of interest during the upgrade. In my case the download folder gets deleted, and I delete the code backup manually from time to time (it makes no sense to let it occupy space on my system).

You can move the logs outside if you don’t want them to be saved. With log rotation, you can keep the logs small, and in case you were hacked, for example, you would still have a few logs to go back to and perhaps be able to analyze the incident.

https://docs.nextcloud.com/server/stable/developer_manual/basics/storage/appdata.html
It is not supposed to be user-specific data, but I am not sure whether it might contain some configuration data. You really need to check this for the apps that are important to you.


Well, I know that part of the documentation, but it only says:

To backup a Nextcloud installation there are four main things you need to retain:

  1. The config folder
  2. The data folder
  3. The theme folder
  4. The database

Hence, it refers to the entire data-folder which seems to be too much as it also includes the appdata and updater sub-directories. The question was rather “what do I really need”.
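The documented list translates into three directories plus a database dump. A minimal sketch, assuming the default layout in which `config`, `data` and `themes` all live directly below the installation root (the data directory can be configured to live elsewhere); the function name `backup_sources` is made up for illustration:

```python
from pathlib import Path


def backup_sources(install_root: str) -> list[Path]:
    """Return the directories the documentation says must be retained.

    Assumption: default layout with data/ inside the installation root.
    The database is the fourth item and needs a separate dump; it is
    not covered by this function.
    """
    root = Path(install_root)
    return [root / "config", root / "data", root / "themes"]
```

Note that this takes `data` as a whole, which is precisely the point in question: the documentation does not break it down any further.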

Yes and no. The documentation doesn’t say the data isn’t user-specific. The documentation says that a Nextcloud application can use AppData to store information which does not belong in the user’s file directory. Those are two completely different things.

For example, consider calendar entries. Calendar entries are user-specific, but naturally a user doesn’t want to see individual calendar entries as files inside the user’s home directory.

Well, that is unfortunate. I really hoped that the directory layout of Nextcloud would make it possible to distinguish more cleanly between backup-worthy data and temporary data at the top level. IMHO, it would be much better to have something like ./data and ./cache at the top level, with each app being responsible for storing its information either inside ./data/appdata/<app-name>/ or ./cache/<app-name>/.

But checking for each app which files need to be backed up inside a single ./data/appdata/<app-name>, which contains both persistent and cached information, isn’t the road I want to go down, in particular because additional apps can be installed and apps may change their internal storage layout.

True, the documentation does not rule out that personal data is stored in this folder.

Calendar entries are in the database.

From the docs, I’d assume that the appdata folder is only temporary in nature. So I’d check whether the apps I use respect that. If that works for most of them, it might be a good reason for the others to follow (if they don’t already). And if there is really a need for permanent app storage, you could then make a legitimate request for: