NextcloudPi crashes a couple of times a week: "Your data directory is invalid"

It's a brand new drive.
Could there be some sort of power saving on the drive?
I tried to unplug and replug the drive last time, but Nextcloud didn't find the files.

Is there some command I can use to try to remount the drive?

It also only happens after I upload new files

On Sun, 14 May 2023 at 09:50, Oliver_Van via Nextcloud community <noreply@nextcloud.com> wrote:

Not a good idea; rather, use this howto to mount a drive permanently: Mounting an external drive using UUID and fstab | Linux admin junior

Power saving could slow things down, but the drive should just power back on when files are uploaded.

To check if and where drives are mounted I use:

sudo df -hT

In Linux to mount a drive one uses

sudo mount /dev/sda1 /media/drivename

But you enabled automount, so you should not need to mount manually unless you add more drives to your system.
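To go with the df check above, here is a minimal sketch of a yes/no test for whether a given directory is currently a mount point. The function name is_mounted is made up for illustration, and the second argument exists only so the function can be tried against a sample file instead of the live /proc/mounts.

```shell
# Report (via exit status) whether a directory is currently a mount point.
# is_mounted is a hypothetical helper name, not part of NextcloudPi.
is_mounted() {
    target=$1
    mounts_file=${2:-/proc/mounts}
    # Field 2 of each /proc/mounts line is the mount point.
    awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$mounts_file"
}

# Example call (path taken from the mount command above):
# is_mounted /media/drivename && echo "mounted" || echo "NOT mounted"
```

Running this right after a crash would show quickly whether the drive has dropped off, without reading the whole df table.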

Yesterday I uploaded some more data and after some time the server crashed again with the same error.
The command sudo df -hT gave me the following result:

Filesystem     Type      Size  Used Avail Use% Mounted on
/dev/root      ext4       29G  3.2G   25G  12% /
devtmpfs       devtmpfs  1.7G     0  1.7G   0% /dev
tmpfs          tmpfs     1.9G     0  1.9G   0% /dev/shm
tmpfs          tmpfs     759M  1.2M  758M   1% /run
tmpfs          tmpfs     5.0M  4.0K  5.0M   1% /run/lock
/dev/mmcblk0p1 vfat      255M   31M  225M  13% /boot
tmpfs          tmpfs     380M     0  380M   0% /run/user/1000

My external drive isn't there; it is 14TB.

After a reboot it looked like this:

Filesystem     Type      Size  Used Avail Use% Mounted on
/dev/root      ext4       29G  3.2G   25G  12% /
devtmpfs       devtmpfs  1.7G     0  1.7G   0% /dev
tmpfs          tmpfs     1.9G     0  1.9G   0% /dev/shm
tmpfs          tmpfs     759M  1.2M  758M   1% /run
tmpfs          tmpfs     5.0M  4.0K  5.0M   1% /run/lock
/dev/mmcblk0p1 vfat      255M   31M  225M  13% /boot
/dev/sda1      btrfs      15T   17G   15T   1% /media/myCloudDrive
tmpfs          tmpfs     380M     0  380M   0% /run/user/1000

So now the external drive is there. For some reason the external drive is dropped after new data is added to Nextcloud.

I would look at the system logs in /var/log just after the event.
-rw-r----- 1 root adm 2,1M May 17 10:34 syslog
-rw-r----- 1 root adm 1,5M May 17 10:34 messages
-rw-r----- 1 root adm 1,5M May 17 10:34 kern.log

Run from terminal:

sudo cat /var/log/syslog

And look for error messages
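Since syslog can run to thousands of lines, a filter may be easier than reading the whole file. The following is only a sketch: the keyword list is a guess at what USB or disk trouble tends to look like in kernel messages, and usb_errors is a made-up helper name.

```shell
# Print only log lines that mention USB controller or disk trouble.
# The pattern list is an assumption, not an exhaustive set of kernel messages.
usb_errors() {
    grep -iE 'xhci|usb disconnect|I/O error|device offline' "${1:-/var/log/syslog}"
}

# Example, run as root because syslog is readable only by root and group adm:
# usb_errors /var/log/syslog | tail -n 40
```

The timestamps on the matching lines can then be compared with the time Nextcloud started reporting the invalid data directory.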

I guess it's better to do this after a crash, but now, while everything is fine, I found one error message:
Transfer event TRB DMA ptr not part of current TD ep_index 3 comp_code 1

It didn't take long until another crash, this time with no activity in Nextcloud at all.

Log here

So it crashed again tonight.
I tried “sudo mount /dev/sda1/media/myCloudDrive”
and got “can’t find in /etc/fstab”.

How is that possible?
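One possible explanation, assuming the command was typed exactly as quoted: there is no space between /dev/sda1 and /media/myCloudDrive, so mount receives a single argument. When mount is given only one argument it looks that name up in /etc/fstab, and since there is no such entry it reports that it can't find it. A small sketch of a wrapper that refuses the single-argument form (mount_drive is a hypothetical name):

```shell
# Hypothetical wrapper: insist on two separate arguments before calling mount.
mount_drive() {
    if [ "$#" -ne 2 ]; then
        echo "usage: mount_drive DEVICE MOUNTPOINT (two arguments, separated by a space)" >&2
        return 2
    fi
    sudo mount "$1" "$2"
}

# With the missing space, the whole string arrives as one argument:
# mount_drive /dev/sda1/media/myCloudDrive     -> usage error
# mount_drive /dev/sda1 /media/myCloudDrive    -> runs: sudo mount /dev/sda1 /media/myCloudDrive
```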

May we have a cat of your fstab file?
cat /etc/fstab

PARTUUID=9d2f1b43-01 /boot vfat defaults 0 2
PARTUUID=9d2f1b43-02 / ext4 defaults,noatime 0 1

# a swapfile is not a swap partition, no line here
#   use  dphys-swapfile swap[on|off]  for that

Now everything is online

I am a little out of my depth here.
At the same time, I probably would try to add the mount permanently in the fstab file, because for now, it isn’t.

/dev/sda1 /media/myCloudDrive ext4 defaults

Sounds reasonable, since it fails all the time.
Shouldn't that be done automatically via automount?

How do I do that?

Is this correct

PARTUUID=9d2f1b43-01 /boot vfat defaults 0 2
PARTUUID=9d2f1b43-02 / ext4 defaults,noatime 0 1
/dev/sda1 /media/myCloudDrive ext4 defaults

# a swapfile is not a swap partition, no line here
#   use  dphys-swapfile swap[on|off]  for that

Or do I need uuid instead?

I tried to figure that out with:
blkid myCloudDrive
But got no result
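A likely reason for the empty result is that blkid expects a device such as /dev/sda1 rather than the name of the mount directory. Note also that the df output earlier in the thread reports /dev/sda1 as btrfs, so an fstab line that says ext4 would not match the drive. A sketch of how a UUID-based line could be put together follows; fstab_line is a made-up helper, and the nofail option is my suggestion (it lets the Pi finish booting even when the drive is absent), not something from the thread.

```shell
# Build an fstab line from a UUID, a mount point and a filesystem type.
# fstab_line is a hypothetical helper; "nofail" is a suggested option.
fstab_line() {
    printf 'UUID=%s %s %s defaults,nofail 0 0\n' "$1" "$2" "$3"
}

# On the Pi itself (device name taken from the df output above):
# uuid=$(sudo blkid -s UUID -o value /dev/sda1)
# fstype=$(sudo blkid -s TYPE -o value /dev/sda1)
# fstab_line "$uuid" /media/myCloudDrive "$fstype"
```

The printed line can be reviewed first and then added to /etc/fstab by hand; sudo mount -a will try to mount everything listed there and shows errors without needing a reboot.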

You should disable automount if you decide to use a UUID; you can follow the instructions in Mounting an external drive using UUID and fstab | Linux admin junior.

It is more important, however, to find out why automount is failing, rather than changing the method of mounting the drive. Have you tried checking the connections and changing the cable?

Automount should work fine, unless you want more control over drive mounting or need to add more than one drive.

It is a brand new 14TB drive, so I don't think I will need to add more storage for a while :slight_smile:
But I have tried with another, smaller drive before and had the same trouble then as well. I also tried 3 different Pis. The only thing I haven't tried is just using a big SD card as storage, but that is not enough storage, so that's no use.

Do you have any idea why the automount fails?

As mentioned earlier, I would look at these logs for clues. If you share the content of them via pastebin or similar, others can have a look too.

I would also try another cable. A new drive with a modern USB3 cable/plug might not work as expected on an older Pi with USB2 ports. I have stopped using RPis for server purposes completely; they were fun for learning and playing around with.

Here is a link to pastebin with the log.

This seems a bit fishy

May 17 15:52:38 nextcloudpi kernel: [111064.372987] xhci_hcd 0000:01:00.0: WARNING: Host System Error
May 17 15:52:43 nextcloudpi kernel: [111069.509949] xhci_hcd 0000:01:00.0: xHCI host not responding to stop endpoint command
May 17 15:52:43 nextcloudpi kernel: [111069.510004] xhci_hcd 0000:01:00.0: xHCI host controller not responding, assume dead
May 17 15:52:43 nextcloudpi kernel: [111069.510082] xhci_hcd 0000:01:00.0: HC died; cleaning up
May 17 15:52:43 nextcloudpi kernel: [111069.510124] usb 1-1: USB disconnect, device number 2
May 17 15:52:43 nextcloudpi kernel: [111069.510147] usb 1-1.1: USB disconnect, device number 3
May 17 15:52:43 nextcloudpi kernel: [111069.511149] usb 2-1: USB disconnect, device number 2
May 17 15:52:43 nextcloudpi kernel: [111069.511176] usb 2-1.1: USB disconnect, device number 3
May 17 15:52:43 nextcloudpi kernel: [111069.511678] sd 0:0:0:0: [sda] tag#2 uas_zap_pending 0 uas-tag 1 inflight:
May 17 15:52:43 nextcloudpi kernel: [111069.511697] sd 0:0:0:0: [sda] tag#2 CDB: opcode=0x88 88 00 00 00 00 00 00 01 84 00 00 00 00 20 00 00
May 17 15:52:43 nextcloudpi kernel: [111069.525987] device offline error, dev sda, sector 99328 op 0x0:(READ) flags 0x1000 phys_seg 4 prio class 2

Which one is this?

Agree, USB3 seems to be the culprit; see How I fixed xHCI host controller not responding, assume dead - TechOverflow

That is a log after my USB drive got disconnected.
So I should add some code that restarts the USB interface every night?
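For what it's worth, one commonly described way to restart a USB host controller on Linux is to unbind and rebind its driver through sysfs. The sketch below assumes the controller sits at PCI address 0000:01:00.0 (as in the log lines above) and that the driver directory is /sys/bus/pci/drivers/xhci_hcd; both should be checked on the actual system. Whether a rebind recovers a controller that the kernel has already declared dead is not guaranteed, and it treats the symptom rather than the cause.

```shell
# Unbind and rebind a USB host controller driver via sysfs.
# rebind_xhci is a hypothetical helper; the default driver path is an assumption.
rebind_xhci() {
    pci_addr=$1
    driver_dir=${2:-/sys/bus/pci/drivers/xhci_hcd}
    delay=${3:-2}
    printf '%s' "$pci_addr" > "$driver_dir/unbind"
    sleep "$delay"
    printf '%s' "$pci_addr" > "$driver_dir/bind"
}

# Must run as root, for example from root's crontab:
# rebind_xhci 0000:01:00.0
```

After a rebind the drive would still have to be mounted again, and anything Nextcloud was writing at the time of the drop may be incomplete.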

Sorry you’re experiencing this issue. I have several Raspberry Pi “servers” including a Pi 4 that was my original production home NC host (it isn’t now but it still has some test NC containers on it and hosts other stuff now). It too has an external USB drive for data storage.

You are having a hardware issue with your Pi. It may not be a defect, but could be an issue with power draw, etc. due to the combination of specific components. One of the most notorious issues with the Pi is the sensitivity to power draw on its USB ports. I can easily take mine offline inadvertently by connecting two less hefty SSD SATA drives sitting in unpowered USB enclosures. A 14TB spinning HDD like you have probably exceeds Pi specs easily during heavy read/write operations and just starting up.

E.g. Raspberry Pi 4B experiencing what I assume to be power issues - Raspberry Pi Forums
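One way to check the power side on the Pi itself is vcgencmd get_throttled, which prints a value such as throttled=0x0. A minimal sketch for decoding the two under-voltage bits follows (bit 0 means under-voltage right now, bit 16 means it has occurred since boot). The function name decode_throttled is made up, and this only reflects the Pi's own supply, not the drive's.

```shell
# Decode the under-voltage bits of a get_throttled value such as 0x50005.
# decode_throttled is a hypothetical helper name.
decode_throttled() {
    value=$(( $1 ))
    if [ $(( value & 0x1 )) -ne 0 ]; then echo "under-voltage right now"; fi
    if [ $(( value & 0x10000 )) -ne 0 ]; then echo "under-voltage has occurred since boot"; fi
    if [ $(( value & 0x10001 )) -eq 0 ]; then echo "no under-voltage flagged"; fi
}

# On the Pi:
# decode_throttled "$(vcgencmd get_throttled | cut -d= -f2)"
```

Checking this shortly after the drive drops off would show whether the Pi saw a voltage dip around that time.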

You probably should ignore the NC issues that keep arising and focus on reproducing and resolving the crashing of your external drive. NC expects reliable underlying compute/storage/OS.

My guess is that you could reproduce the issue by simply doing some massive file copying/creation based on the prior bits in this thread.
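A rough sketch of such a reproduction test, assuming GNU dd as shipped with Raspberry Pi OS: write a large file of random data to the external drive while watching kernel messages in a second terminal. The file name nc-stress-test.bin and the default size of 8192 MiB are arbitrary choices for illustration.

```shell
# Write a large file of random data to a directory to put the drive under load.
# write_test_file is a hypothetical helper; size is given in MiB.
write_test_file() {
    dir=$1
    size_mb=${2:-8192}
    dd if=/dev/urandom of="$dir/nc-stress-test.bin" bs=1M count="$size_mb" conv=fsync
}

# Terminal 1: sudo dmesg -w
# Terminal 2: write_test_file /media/myCloudDrive
# Afterwards:  rm /media/myCloudDrive/nc-stress-test.bin
```

If the xhci_hcd errors appear during the write with Nextcloud idle, that would point at the USB or hardware side rather than at Nextcloud.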

I would suggest getting some assistance over at https://forums.raspberrypi.com/ and when things seem stable hardware-wise to pop back here so we can help you make sure your NC installation is all tuned up. :slight_smile:

The USB drive has its own power supply so that is not the issue either.