Btrfs failure on data device

Nextcloud version (eg, 20.0.5): 20.0.7.1
Operating system and version (eg, Ubuntu 20.04): nextcloudpi v1.34.7 (Linux nextcloudpi 5.10.11-v7l+ #1399 SMP Thu Jan 28 12:09:48 GMT 2021 armv7l GNU/Linux)
Apache or nginx version (eg, Apache 2.4.25): 2.4.38
PHP version (eg, 7.4): 7.3.19-1~deb10u1

The issue you are facing:
Last night, at the start of the backup job, my data drive ran into btrfs errors, which resulted in the drive being unmounted. Running a btrfs check on the data drive reports errors.

Is this the first time you’ve seen this error? (Y/N):
I already had btrfs errors about 7 months ago. After that I removed the USB sticks and replaced them with removable HDD drives.

Steps to replicate it:

The output of your Nextcloud log in Admin > Logging:

The Nextcloud log contains 13743 lines, so I'd rather upload it somewhere.

This is part of the syslog:

Feb 15 03:02:15 nextcloudpi systemd[1]: tmp-runc.SfSo5T.mount: Succeeded.
Feb 15 03:02:28 nextcloudpi kernel: [585807.325569] usb 2-2.1: Disable of device-initiated U1 failed.
Feb 15 03:02:28 nextcloudpi kernel: [585807.326339] usb 2-2.1: Disable of device-initiated U2 failed.
Feb 15 03:02:28 nextcloudpi kernel: [585807.424784] usb 2-2.1: reset SuperSpeed Gen 1 USB device number 3 using xhci_hcd
Feb 15 03:02:28 nextcloudpi kernel: [585807.614680] usb 2-2.1: reset SuperSpeed Gen 1 USB device number 3 using xhci_hcd
Feb 15 03:02:28 nextcloudpi kernel: [585807.649779] sd 0:0:0:0: [sdb] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00 cmd_age=30s
Feb 15 03:02:28 nextcloudpi kernel: [585807.649802] sd 0:0:0:0: [sdb] tag#0 CDB: opcode=0x88 88 00 00 00 00 00 00 64 a8 e0 00 00 02 00 00 00
Feb 15 03:02:28 nextcloudpi kernel: [585807.649825] blk_update_request: I/O error, dev sdb, sector 6596832 op 0x0:(READ) flags 0x84700 phys_seg 4 prio class 0
Feb 15 03:02:33 nextcloudpi kernel: [585812.633683] usb 2-2.1: USB disconnect, device number 3
Feb 15 03:02:33 nextcloudpi kernel: [585812.644450] sd 0:0:0:0: [sdb] Synchronizing SCSI cache
Feb 15 03:02:33 nextcloudpi kernel: [585812.664134] sd 0:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
Feb 15 03:02:33 nextcloudpi kernel: [585812.664155] blk_update_request: I/O error, dev sdb, sector 88064 op 0x0:(READ) flags 0x1000 phys_seg 12 prio class 0
Feb 15 03:02:33 nextcloudpi kernel: [585812.664179] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
Feb 15 03:02:33 nextcloudpi kernel: [585812.664197] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 2, flush 0, corrupt 0, gen 0
Feb 15 03:02:33 nextcloudpi kernel: [585812.664212] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 3, flush 0, corrupt 0, gen 0
Feb 15 03:02:33 nextcloudpi kernel: [585812.664257] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 4, flush 0, corrupt 0, gen 0
Feb 15 03:02:33 nextcloudpi kernel: [585812.664320] BTRFS info (device sdb1): no csum found for inode 986 start 13692928
Feb 15 03:02:33 nextcloudpi kernel: [585812.664376] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 5, flush 0, corrupt 0, gen 0
Feb 15 03:02:33 nextcloudpi kernel: [585812.664398] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 6, flush 0, corrupt 0, gen 0
Feb 15 03:02:33 nextcloudpi kernel: [585812.664428] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 7, flush 0, corrupt 0, gen 0
Feb 15 03:02:33 nextcloudpi kernel: [585812.664466] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 8, flush 0, corrupt 0, gen 0
Feb 15 03:02:33 nextcloudpi kernel: [585812.664500] BTRFS info (device sdb1): no csum found for inode 986 start 13697024
Feb 15 03:02:33 nextcloudpi kernel: [585812.664521] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 9, flush 0, corrupt 0, gen 0
Feb 15 03:02:33 nextcloudpi kernel: [585812.664542] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 10, flush 0, corrupt 0, gen 0
Feb 15 03:02:33 nextcloudpi kernel: [585812.664619] BTRFS info (device sdb1): no csum found for inode 986 start 13701120
Feb 15 03:02:33 nextcloudpi kernel: [585812.664725] BTRFS info (device sdb1): no csum found for inode 986 start 13705216
Feb 15 03:02:33 nextcloudpi kernel: [585812.664818] BTRFS info (device sdb1): no csum found for inode 986 start 13709312
Feb 15 03:02:33 nextcloudpi kernel: [585812.664909] BTRFS info (device sdb1): no csum found for inode 986 start 13713408
Feb 15 03:02:33 nextcloudpi kernel: [585812.664993] BTRFS info (device sdb1): no csum found for inode 986 start 13717504
Feb 15 03:02:33 nextcloudpi kernel: [585812.665073] BTRFS info (device sdb1): no csum found for inode 986 start 13721600
Feb 15 03:02:33 nextcloudpi kernel: [585812.665171] BTRFS info (device sdb1): no csum found for inode 986 start 13725696
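
In the excerpt above, the USB resets and the disconnect of device 2-2.1 appear before the first BTRFS error, so it can help to count those events separately. The following is only a rough sketch: the function name `summarize_storage_log` is made up for illustration, and the patterns assume the kernel message wording shown in this excerpt.

```shell
#!/bin/sh
# Sketch: summarize storage-related kernel messages read from stdin.
# summarize_storage_log is a hypothetical helper, not part of any tool.
summarize_storage_log() {
    awk '
        /usb [0-9.-]+: reset /           { resets++ }
        /usb [0-9.-]+: USB disconnect/   { disconnects++ }
        /blk_update_request: I\/O error/ { io_errors++ }
        /BTRFS error .* errs: / {
            # The field after "rd" is the cumulative read-error counter, e.g. "10,"
            for (i = 1; i < NF; i++) {
                if ($i == "rd") {
                    v = $(i + 1)
                    sub(/,/, "", v)
                    if (v + 0 > max_rd) max_rd = v + 0
                }
            }
        }
        END {
            printf "usb_resets=%d usb_disconnects=%d io_errors=%d btrfs_max_rd=%d\n", resets, disconnects, io_errors, max_rd
        }
    '
}
```

It could be used as `summarize_storage_log < /var/log/syslog`. A non-zero number of USB resets or disconnects shortly before the btrfs messages would point toward the cable, hub, enclosure, or power supply rather than the filesystem itself.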

The output of your config.php file in /path/to/nextcloud (make sure you remove any identifiable information!):

<?php
$CONFIG = array (
  'passwordsalt' => '...',
  'secret' => '....',
  'trusted_domains' =>
  array (
    0 => '31fqey8qafjrkyf6.myfritz.net',
    4 => 'nextcloudpi',
    3 => 'fe80::ad19:381c:811a:c095',
    1 => '192.168.10.33',
    2 => 'nextcloudpi.31fqey8qafjrkyf6.myfritz.net',
    11 => '2a01:c22:bc5b:7a00:ad19:381c:811a:c095',
  ),
  'datadirectory' => '/media/myCloudDrive1/ncdata',
  'dbtype' => 'mysql',
  'version' => '20.0.7.1',
  'overwrite.cli.url' => 'https://gnbgjqwujtenlico.myfritz.net/',
  'dbname' => 'nextcloud',
  'dbhost' => 'localhost',
  'dbport' => '',
  'dbtableprefix' => 'oc_',
  'mysql.utf8mb4' => true,
  'dbuser' => 'ncadmin',
  'dbpassword' => '....',
  'installed' => true,
  'instanceid' => 'oclaqn5ofjmm',
  'memcache.local' => '\\OC\\Memcache\\Redis',
  'memcache.locking' => '\\OC\\Memcache\\Redis',
  'redis' =>
  array (
    'host' => '/var/run/redis/redis.sock',
    'port' => 0,
    'timeout' => 0.0,
    'password' => '......',
  ),
  'tempdirectory' => '/media/myCloudDrive1/ncdata/tmp',
  'mail_smtpmode' => 'smtp',
  'mail_from_address' => 'nextcloudpi',
  'mail_domain' => 'meine-dateien.info',
  'preview_max_x' => '2048',
  'preview_max_y' => '2048',
  'jpeg_quality' => '60',
  'overwriteprotocol' => 'https',
  'loglevel' => '2',
  'log_type' => 'file',
  'maintenance' => true,
  'theme' => '',
  'logfile' => '/media/myCloudDrive1/ncdata/nextcloud.log',
  'mail_sendmailmode' => 'smtp',
  'mail_smtphost' => 'mail.gmx.net',
  'mail_smtpport' => '465',
  'mail_smtpauth' => 1,
  'mail_smtpname' => '......',
  'mail_smtppassword' => '......',
  'mail_smtpauthtype' => 'LOGIN',
  'mail_smtpsecure' => 'ssl',
  'app_install_overwrite' =>
  array (
    0 => 'files_texteditor',
    1 => 'gallery',
  ),
  'data-fingerprint' => '......',
  'twofactor_enforced' => 'false',
  'twofactor_enforced_groups' =>
  array (
    0 => 'Standard',
  ),
  'twofactor_enforced_excluded_groups' =>
  array (
  ),
);

You could try running

btrfs scrub start /mount/point
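
As a slightly fuller sketch of that suggestion, the scrub can be run in the foreground and followed by the scrub summary and the per-device error counters. The helper names `run` and `scrub_and_report` and the `DRY_RUN` switch are assumptions added for illustration; the btrfs subcommands themselves (`scrub start -B`, `scrub status`, `device stats`) are the standard ones.

```shell
#!/bin/sh
# Sketch: foreground scrub of a mounted btrfs filesystem plus a short report.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

scrub_and_report() {
    mnt=$1    # mount point of the btrfs filesystem, supplied by the caller

    # -B keeps the scrub in the foreground until it has finished.
    run sudo btrfs scrub start -B "$mnt"

    # Summary of the last scrub (errors found, corrected, uncorrectable).
    run sudo btrfs scrub status "$mnt"

    # Cumulative wr/rd/flush/corrupt/gen counters, as seen in the kernel log.
    run sudo btrfs device stats "$mnt"
}
```

For example, `DRY_RUN=1 scrub_and_report /media/myCloudDrive1` would only print the three commands. Note that on a single-device filesystem a scrub can detect checksum errors in data but generally cannot repair them, because there is no second copy to read from.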

I followed the suggestions from the article How to recover a btrfs partition and ran the following commands with a check immediately after each, but none of them solved my problem:

  1. sudo btrfs rescue zero-log /dev/sdb1
  2. sudo btrfs scrub start -Bd /dev/sdb1
  3. sudo btrfs check --repair /dev/sdb1
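
For comparison, a more cautious sequence is sketched below. The btrfs-check manual advises against `--repair` unless a developer or experienced user recommends it, so read-only steps are usually tried first. This is a sketch under assumptions, not a tested recovery procedure: the helper names and the `DRY_RUN` switch are made up, and the device and directories are placeholders passed in by the caller.

```shell
#!/bin/sh
# Sketch: non-destructive inspection steps to try before any repair.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Check the unmounted device without writing to it, then try a read-only
# mount so that files can be copied off with normal tools.
check_and_mount_ro() {
    dev=$1       # unmounted btrfs device, e.g. /dev/sdb1
    rescue=$2    # empty directory used as a temporary mount point
    run sudo btrfs check --readonly "$dev"
    run sudo mount -o ro "$dev" "$rescue"
}

# If mounting fails, btrfs restore can read files from the unmounted device.
# -D is a dry run that only lists what would be recovered into $target.
list_restorable() {
    dev=$1
    target=$2    # directory on a different, healthy disk
    run sudo btrfs restore -D -v "$dev" "$target"
}
```

Dropping `-D` from the last command would actually copy the files into the target directory.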

So I restored the data from a snapshot copy with some (minor) losses. So far I have everything working again.

What worries me more is that btrfs is supposed to be a modern filesystem, yet it doesn’t seem to be very robust. I didn’t do anything out of the ordinary. Until 2am the snapshots were still running every hour; only the backup at 3am went wrong with read errors. Nothing else was unusual either: no power interruption and no crash, and my external hard drives are even connected to a “USB Fast Charging Hub” where you can charge multiple devices.

This is now the second time in the last 10 months that I have had a btrfs problem. And that’s the worrying thing, especially since I only noticed the problem because I briefly check the results almost every day. After the first time, I immediately replaced the USB sticks with external hard drives.

No filesystem can cope with failing hardware.
Did you check the SMART status of your drives?

smartctl -H reports OK but smartctl -a reports “Unavailable - device lacks SMART capability.”

smartctl 6.6 2017-11-05 r4594 [armv7l-linux-5.10.11-v7l+] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

smartctl 6.6 2017-11-05 r4594 [armv7l-linux-5.10.11-v7l+] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor: TOSHIBA
Product: External USB 3.0
Revision: 5438
Compliance: SPC-4
User Capacity: 4,000,787,027,968 bytes [4.00 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
Serial number: 20190621015873F
Device type: disk
Local Time is: Fri Feb 19 17:09:36 2021 CET
SMART support is: Unavailable - device lacks SMART capability.

=== START OF READ SMART DATA SECTION ===
Current Drive Temperature: 0 C
Drive Trip Temperature: 0 C

Error Counter logging not supported

Device does not support Self Test logging

I’d recommend switching the drive (or enclosure), since it doesn’t support SMART monitoring.

Thanks for your reply. I did a test with another USB portable drive and found that smartctl similarly returns

pi@nextcloudpi:~ $ sudo smartctl --info /dev/sdc1
smartctl 6.6 2017-11-05 r4594 [armv7l-linux-5.10.17-v7l+] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor: TOSHIBA
Product: MK5055GSX
User Capacity: 500,107,862,016 bytes [500 GB]
Logical block size: 512 bytes
Device type: disk
Local Time is: Wed Apr 28 19:17:00 2021 CEST
SMART support is: Unavailable - device lacks SMART capability.

whereas CrystalDiskInfo does provide SMART information.

Yes, that’s a limitation of smartctl for some drives on Linux, afaik.
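
One thing that sometimes helps with USB enclosures is telling smartctl explicitly to use SCSI-to-ATA translation via `-d sat`. Whether this works depends on the bridge chip in the enclosure, so treat the following as a sketch to try rather than a guaranteed fix. The helper names and the `DRY_RUN` switch are assumptions for illustration; `-d sat`, `-i`, `-H` and `-A` are regular smartctl options.

```shell
#!/bin/sh
# Sketch: query SMART data through a USB-to-SATA bridge using SAT passthrough.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

smart_via_sat() {
    disk=$1    # whole-disk node such as /dev/sdb (assumed), not a partition

    # Identity information; should now report the real drive model.
    run sudo smartctl -d sat -i "$disk"

    # Overall health verdict and the vendor attribute table.
    run sudo smartctl -d sat -H -A "$disk"
}
```

If `smart_via_sat /dev/sdb` still reports that SMART is unavailable, the enclosure probably does not pass the commands through, and testing the disk in a different enclosure or on a direct SATA connection would be the next step.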