Content wiped out from database after storage unmounted

I have the following setup.

  • Nextcloud 19 (Docker image) on Raspbian Buster.
  • RAID1 via mdadm using two 4TB hard drives.
  • RAID storage added as a local external storage in Nextcloud.
  • Nextcloud desktop client syncing from my machine to Nextcloud.

I had content synced into the external storage folder and everything was working fine. I was messing with the Nextcloud server and rebooted it. When the system came back online, the RAID storage was not mounted or available.

I believe the desktop client was running during this time and deleted my local files because Nextcloud thought its copies had been removed. What the server interpreted as deletions was then synced back to the desktop client.

Is it expected that Nextcloud cannot tell the difference between an external storage that is not mounted and one whose contents have been removed? Does it require explicit deletion events, or is the mere fact that the files no longer exist where it expects them on the server enough?

Well, the server files were never lost, since the RAID device was not mounted in the first place when this happened. However, the Nextcloud database no longer had any record of those files having existed, and even with the RAID mounted again, the web interface showed an empty folder.

I had to run “files:scan” and then resync everything back down to the local desktop machine. Quite the scare.

How do I avoid this situation in the future? Is there some configuration available to prevent this from happening?

Thanks.

In .../index.php/settings/admin/externalstorages you can set whether your RAID should be rescanned on access or never. I think you can select "never" and simply add a cron job that rescans the external storages periodically, either with this command:

sudo -u www-data php occ files:scan --path="user_id/files/RaidFolder"

Or, if you have more external storages, simply use a script that rescans all of them (like the one further down in this thread).

You can also add a line that checks whether the storage is mounted before scanning; see the sketch below.
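For example, a minimal sketch of such a cron job with a mount check could look like this - /mnt/raid, the Nextcloud installation path and user_id/files/RaidFolder are only placeholders, adjust them to your setup:

#!/bin/bash
# Only rescan when the RAID is actually mounted, otherwise skip and complain.
if ! mountpoint -q /mnt/raid; then
	echo "RAID not mounted, skipping files:scan" >&2
	exit 1
fi
# occ has to be run from the Nextcloud installation directory (path is an example)
cd /var/www/nextcloud || exit 1
sudo -u www-data php occ files:scan --path="user_id/files/RaidFolder"

Then schedule it, e.g. with a crontab line like 0 3 * * * /usr/local/bin/rescan-raid.sh (path and time are again just examples).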

I never had a problem when external storage is unavailable - the client will not wipe data out. But it is different for local storage. If you point Nextcloud to the root mount point, then from Nextcloud's point of view all data is wiped when your RAID is not mounted.

E.g. your mount point is /mnt/raid and you added /mnt/raid as external storage to Nextcloud. In this case, when nothing is mounted, /mnt/raid still exists as an empty folder, and Nextcloud will, from my point of view, correctly conclude that no data exists anymore and delete it on rescan.

If you instead add a subfolder on the RAID (e.g. /mnt/raid/share) to Nextcloud as external storage, then it should work. If the RAID mount on /mnt/raid fails, the sub-folder /mnt/raid/share does not exist, so Nextcloud will give an error and will not wipe your data out. If the RAID is mounted, the sub-folder is accessible. This is my theory :thinking:
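Roughly, the difference looks like this (paths are just examples, and it assumes the share sub-folder only exists on the RAID itself, not on the root filesystem):

# RAID not mounted, /mnt/raid itself added as external storage:
ls /mnt/raid        # succeeds but shows an empty folder -> a rescan looks like everything was deleted
# RAID not mounted, only /mnt/raid/share added as external storage:
ls /mnt/raid/share  # "No such file or directory" -> Nextcloud reports an error instead of deleting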

gas85,

Really appreciate your response. I think your theory makes perfect sense, and I think that is what happened based on my observations. My mount point is indeed /mnt/raid1, and for some reason I also have subfolders under it (e.g. /mnt/raid1/pictures, /mnt/raid1/documents) - I don't recall creating them, but maybe I did? The subfolders are exactly the same ones I have on the actual RAID device. I will remove them while the RAID is not mounted to avoid this situation.

I had issues running your script but made an update that worked for me. I had to change the following from

if [ LogFilePath = "" ]; then

to

if [ "$LogFilePath" = "" ]; then
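Without the $ and the quotes, the test was comparing the literal string LogFilePath with an empty string, so it never matched. I guess it could also be written with -z, which is true when the variable is empty or unset:

if [ -z "$LogFilePath" ]; then
	echo "LogFilePath is not set" >&2   # example action only
fi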

I’ll need to rebuild my custom docker image to get this into www-data’s cron job but that should be the easy part now with your help.
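Something like this during the image build should append it to www-data's crontab (the script path and schedule are just placeholders; the subshell keeps any entries that are already there):

# Append a nightly run of the scan script to www-data's crontab, keeping existing entries
(crontab -u www-data -l; echo '30 3 * * * /usr/local/bin/nextcloud-file-sync.sh >> /var/log/next-cron.log 2>&1') | crontab -u www-data -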

Thanks!


I think you can easily add

docker exec --user www-data CONTAINER_ID 

in front of each line starting with $PHP $COMMAND, and delete a few lines. It could look like this, but I have never tested it:

#!/bin/bash

# By Georgiy Sitnikov.
#
# Will do external ONLY shares rescan for nextcloud and put execution information in NC log.
# If you would like to perform WHOLE nextcloud rescan, please add --all to command, e.g.:
# ./nextcloud-file-sync.sh --all
#
# AS-IS without any warranty

# Adjust to your NC installation
	# Your NC OCC Command path
COMMAND=occ
	# Your NC config directory, data directory, and log file path
ConfigDirectory=""
DataDirectory=""
LOGFILE=""

CONTAINER_ID="nextcloud"

CRONLOGFILE=/var/log/next-cron.log
	# If you want to perform cache cleanup, please change CACHE value to 1
CACHE=0
	# Your PHP location
PHP=php

# Optionally override the settings above from an external config file, if it exists
[ -r /etc/nextcloud-scripts-config.conf ] && . /etc/nextcloud-scripts-config.conf

# Leave these as they are
OPTIONS="files:scan"
LOCKFILE=/tmp/nextcloud_file_scan
KEY="$1"
SECONDS=0

if [ -f "$LOCKFILE" ]; then
	# A previous run is still in progress or left a stale lock file behind; remove the lock file if it is older than 10 days, then exit.
	find "$LOCKFILE" -mtime +10 -type f -delete
	exit 1
fi

# Put output to Logfile and Errors to Lockfile as per https://stackoverflow.com/questions/18460186/writing-outputs-to-log-file-and-console
#exec 3>&1 1>>${CRONLOGFILE} 2>>${CRONLOGFILE}

touch $LOCKFILE

reqId=$(< /dev/urandom tr -dc A-Za-z0-9 | head -c20)

echo \{\"reqId\":\"$reqId\",\"app\":\"$COMMAND $OPTIONS\",\"message\":\""+++ Starting Cron Filescan +++"\",\"level\":1,\"time\":\"`date "+%Y-%m-%dT%H:%M:%S%:z"`\"\} >> $LOGFILE

date >> $CRONLOGFILE

# scan all files of selected users
#$PHP $COMMAND $OPTIONS [user_id] >> $CRONLOGFILE
# e.g. php $COMMAND $OPTIONS user1 >> $CRONLOGFILE


# scan all EXTERNAL files of selected users
# how to get mounted externals? --> sudo -u www-data php occ files_external:list | awk -F'|' '{print $8 $3}'
#sudo -u www-data php occ files_external:list | awk -F'|' '{print $8"/files"$3}'| tail -n +4 | head -n -1 | awk '{gsub(/ /, "", $0); print}'
# "user_id/files/path"
#   or
# "user_id/files/mount_name"
#   or
# "user_id/files/mount_name/path"

if [ "$KEY" != "--all" ]; then
	# get ALL external mounting points and users
	docker exec --user www-data $CONTAINER_ID $PHP $COMMAND files_external:list | awk -F'|' '{print $8"/files"$3}'| tail -n +4 | head -n -1 | awk '{gsub(/ /, "", $0); print}' > $LOCKFILE
		
	# rescan all shares
	cat $LOCKFILE | while read line ; do docker exec --user www-data $CONTAINER_ID $PHP $COMMAND $OPTIONS --path="$line"; done
else
	# scan all files of all users (takes ages)
	docker exec --user www-data $CONTAINER_ID $PHP $COMMAND $OPTIONS --all
fi

duration=$SECONDS

echo \{\"reqId\":\"$reqId\",\"app\":\"$COMMAND $OPTIONS\",\"message\":\""+++ Cron Filescan Completed. Execution time: $(($duration / 60)) minutes and $(($duration % 60)) seconds +++"\",\"level\":1,\"time\":\"`date "+%Y-%m-%dT%H:%M:%S%:z"`\"\} >> $LOGFILE

# OPTIONAL
### Start Cache cleanup

if [ "$CACHE" -eq "1" ]; then
	echo \{\"reqId\":\"$reqId\",\"app\":\"$COMMAND $OPTIONS\",\"message\":\""+++ Starting Cron Files Cache cleanup +++"\",\"level\":1,\"time\":\"`date "+%Y-%m-%dT%H:%M:%S%:z"`\"\} >> $LOGFILE
	SECONDS=0
	date >> $CRONLOGFILE
	docker exec --user www-data $CONTAINER_ID $PHP $COMMAND files:cleanup >> $CRONLOGFILE
	duration=$SECONDS
	echo \{\"reqId\":\"$reqId\",\"app\":\"$COMMAND $OPTIONS\",\"message\":\""+++ Cron Files Cache cleanup Completed. Execution time: $(($duration / 60)) minutes and $(($duration % 60)) seconds +++"\",\"level\":1,\"time\":\"`date "+%Y-%m-%dT%H:%M:%S%:z"`\"\} >> $LOGFILE
fi
### FINISH Cache cleanup

rm $LOCKFILE

exit 0
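Assuming you save it on the Docker host as nextcloud-file-sync.sh (the name from the header comment), you would run it like this:

chmod +x nextcloud-file-sync.sh
./nextcloud-file-sync.sh          # rescan external storages only
./nextcloud-file-sync.sh --all    # full rescan of all files of all users (takes much longer)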

Thanks. I added a cron task to run the script (as www-data) in the container, and it seems to be working now. I tested it and it printed out the list of local external storages it detected.


I have stumbled upon this problem as well.

A faulty USB cable caused my Nextcloud instance to silently remove a lot of files from the database just because of I/O errors, even though the files remained present in the data folder.

I’ve opened an issue and described the problem here: