How do you use multiple HDDs in Nextcloud?

Hello everyone,
Just to say I am new to Nextcloud and still figuring things out. A little bit about what I have set up and what I intend to use it for: I have installed Nextcloud on an old computer to create a private server for me and my family to use as a cloud.

Basically, my question is: how does Nextcloud handle multiple HDDs? I have tried to access one through /media/the HDD/, but a little red square pops up next to it on the External Storage tab. Is there a way of getting Nextcloud to store to my other HDDs automatically? Can the server data directory be spread across multiple locations?

Nextcloud version: 11.0.0
Operating system: Ubuntu 16.04 LTS
Apache: Latest I believe (did a new install yesterday)
PHP version: 5.6 (installed fresh yesterday)

Thank you for your time.

Dunno about Nextcloud as I am a relative noob.
With Ubuntu the default install uses LVM, which makes it a doddle to add extra disks that, to the OS, look like a single disk.
It's LVM2 now, and really I should do some reading, but I am sure LVM is mature and has all the tools to make it very easy.

I am still old school and have software RAID enabled, so I can grow the RAID10 array mounted at /var at any time (a rough sketch of that follows below).
LVM is actually probably easier, as you can simply attach another disk at any time, even remote storage.
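A minimal sketch of what growing an mdadm RAID10 like that can look like; the array name /dev/md0, the new device names and the ext4 filesystem are assumptions rather than details from this thread, and reshaping RAID10 needs a reasonably recent kernel and mdadm:

sudo mdadm --add /dev/md0 /dev/sde /dev/sdf    # add the new disks to the array as spares
sudo mdadm --grow /dev/md0 --raid-devices=6    # reshape the RAID10 from 4 to 6 devices
cat /proc/mdstat                               # watch the reshape, wait for it to finish
sudo resize2fs /dev/md0                        # then grow the ext4 filesystem mounted at /var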

Or the Nextcloud way.

Thank you for your reply.
With your RAID10 array you say you can grow it, so does that mean you can just add to it, if I understand correctly?
The LVM looks interesting. I shall have a look now.

Yeah, I would have to add two drives I think; dunno, it's the first time I have used RAID10, usually I use RAID5.
Just had 4 disks and did it because I hadn't before :)

/ (root) is on 2 drives mirrored, with /var on 4, as I had a glut of old drives after some upgrades to SSD.

It is possible to extend a RAID10 across additional PVs, but they must be added in multiples of the original RAID10 layout.

RAID5 might have been an easier choice, but actually the LVM schemes can mirror and stripe; the original LVM just wasn't as well supported as MD, but I am presuming LVM2 has all areas covered and would be much easier.

Or just do it through Nextcloud and the External Storage app.
I am a bit out of the loop now, but Btrfs does very much the same and can scale to ridiculous sizes.

Yes, I have recently extended the cloud storage with an extra hard disk.

It is very simple to extend the cloud storage space, although I do not use RAID.

Say you have the NC data stored on /cloud and it is running full.
sudo apt install mhddfs
Make a new directory /cloud1.
Make a new directory /cloud2.
Set the owner and group of both to www-data:www-data (see the command sketch below).
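A minimal sketch of those steps as commands; the paths match the /cloud1 and /cloud2 used below, everything else is an assumption:

sudo mkdir /cloud1 /cloud2                        # new mount points for the two disks
sudo chown www-data:www-data /cloud1 /cloud2      # let the web server user write there
# you may need to repeat the chown after the filesystems are mounted,
# since mounting replaces what you see in the directory with the filesystem root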

Change your fstab so that:
your new hard disk is mounted on /cloud2,
the old /cloud mount point becomes /cloud1,
and add something like:
#mhddfs
mhddfs#/cloud1/,/cloud2/ /cloud fuse allow_other,logfile=/var/log/mhddfs.log 0 0

My /etc/fstab looks like this:

# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sdc1 during installation
UUID=3906263b-190b-462a-9968-9977ff29f13a /               ext4    errors=remount-ro 0       1
# /home was on /dev/sdc3 during installation
UUID=bf9638ab-2991-49d0-aa30-cb4c3c80190b /home           ext4    defaults        0       2
# swap was on /dev/sdc5 during installation
UUID=ca0860b5-f5ae-4d27-9302-2d30f85a47df none            swap    sw              0       0

# /cloud was on /dev/sdc4 during installation
# Changed the line below to /cloud1
UUID=aca04ccb-b3d7-41a5-9f2b-fb4931042f0c /cloud1          ext4    defaults        0       2
# manually added the lines below 
# /cloud2 is created on /dev/sdd1 after installation
UUID=0523a47c-8a21-44e1-9111-47fd371e6414 /cloud2	ext4	defaults	0	2
#mhddfs
mhddfs#/cloud1/,/cloud2/ /cloud		fuse	  allow_other,nonempty,logfile=/var/log/mhddfs.log 0 0

Save it, then:
sudo umount /cloud
sudo mount -a

This should result in one directory /cloud with the combined contents of /cloud1 and /cloud2.
You can then easily add more disks by creating a /cloud3 directory and adding it to your fstab:

mhddfs#/cloud1/,/cloud2/,/cloud3/ /cloud fuse defaults,allow_other,nonempty 0 0

Have the NC data directory set to /cloud.

Hope this was of some help.

Edit:
As I use this only for myself, this is my output of df -h:
Filesystem Size Used Avail Use% Mounted on
/cloud1/;/cloud2/ 284G 11G 259G 4% /cloud
/dev/sdd1 184G 60M 174G 1% /cloud2
/dev/sdc4 101G 11G 85G 12% /cloud1

But there are examples out there of over 18 TB.


RAID5 might have been an easier choice

Maybe not these days anymore: if one disk goes out, it is like praying that no other disk fails while rebuilding. Better to use RAID6…

I tend to take that with a pinch of salt, as it's very dependent on the age of your drives.
If you are running ancient drives that can be expected to fail, then maybe.
If I am setting up my own with disks I have purchased and one of those four dies, I am going to be pretty pissed and it will be a warranty issue.
If another one dies whilst rebuilding, I am going to be extremely pissed and will never buy from that manufacturer again.

RAID10 loses a lot of space over 4 disks, and that is the scale I am talking about, not a data center where two parity disks don't cost as much relative to the array.
Current RAID5 requires the whole array to resync, and in large arrays this can be problematic; with 4 disks, space is more of a consideration.

I am hoping Btrfs gets RAID5 into production, as if they iron out a few bugs the usual argument against RAID5 no longer applies.

I recently got several second-hand 680GB WD Blacks and they will be a 4x RAID5 array of about 2TB rather than 1.3TB, and if a second one pops while resyncing it's just some rather shitty bad luck, even with their age.

Sure, it depends on the situation, but even new disks can fail, and if one fails then no other disk may fail during the rebuild (hence the recommendation to use disks from different batches), because only a backup will help in such a case (hence "RAID is not a backup").

For modern filesystems like Btrfs and ZFS, with all their checksumming in place, this might help to overcome the problem, but with conventional RAID controllers or plain mdadm the problem still exists. Also, using disks certified for 24/7 operation helps to reduce outages.

Yeah, I agree, but in the context of what we are talking about here, where a resync might take a couple of hours: if I have purchased disks where one fails and I don't trust the rest to last the couple of hours a rebuild takes, that begs the question of what dross I have purchased and been supplied.

The RAID5 argument comes from large arrays, where extended resync times mean the statistical chance of a second failure could actually be odds-on.
With 4 disks, the MTBF and the length of time the resync should take mean the statistical chance is low.

Btrfs splits disks into chunks, and the working RAID modes are extremely flexible, as you can do weird stuff like RAID10 with odd numbers of disks or disks of different sizes.
With Btrfs RAID5 it's the same, and because the parity is split into chunks and shared across the array, the resync only rebuilds the missing disk's data and not the whole array.
If they could iron out a few bugs, then for small arrays Btrfs RAID5 could be the preferred choice.

It's not really about new disks or failure; it's about the context of use, and the argument about the MTBF of large RAID5 arrays is true due to resync length.
In the context of what I am talking about it's not, and they are still connected to many single sources of failure anyway; it's always a matter of cost and time that is fit for purpose.

Personally I don't use conventional RAID controllers, as it often means you need a spare of the exact model because the stripe format is often proprietary. mdadm for smaller systems in this context is a far better option, as those disks can go in another machine.

If you want to talk outside this context about large arrays, then yes, a totally different approach would be wise.

Can you please help me with this? Your instructions are a bit advanced for me.

I just set up the External Storage app on Nextcloud; is it fine to use it? I mean, if my primary space fills up, will it automatically start storing on the external storage? And will it have less processing speed or something? Also, do I have to set anything else, apart from adding the external storage?

Reading through the documentation, it appears you cannot use this as you intend. You will have to assign the external storage to a user or group,
as explained in post 2.

https://docs.nextcloud.com/server/latest/admin_manual/configuration_files/external_storage_configuration_gui.html
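If you prefer the command line, here is a minimal sketch of adding a local directory as external storage with occ; the mount name /Disk2, the path /media/disk2, the group name family and the mount ID 1 are all assumptions, not values from this thread:

cd /var/www/nextcloud    # path to your Nextcloud installation (assumption)
sudo -u www-data php occ files_external:create "/Disk2" local null::null -c datadir="/media/disk2"
sudo -u www-data php occ files_external:applicable --add-group=family 1    # 1 = mount ID printed by the previous command
sudo -u www-data php occ files_external:list    # check the new mount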

I highly recommend setting up LVM (Logical Volume Management), which will require a full backup of your hard disk and a reformat to the appropriate partition type.

LVM creates a virtual volume that takes multiple storage devices and presents one working partition the size of all the storage devices combined.

I would suggest reading one of the many tutorials about LVM.

Hint: you only need LVM for your cloud storage, not your entire system (your choice).
If I were you, I would take a clean/empty HDD and set it up as LVM, then add it as your primary cloud storage, copy all previous data from the old cloud storage over, then convert the old drive to LVM and add it to the same volume group.
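A rough sketch of that workflow, assuming the clean disk is /dev/sdb, the old cloud partition is /dev/sdc4 and the data ends up mounted on /cloud; all device, volume group and logical volume names here are assumptions:

# Set up LVM on the empty disk and put a filesystem on it
sudo pvcreate /dev/sdb
sudo vgcreate cloudvg /dev/sdb
sudo lvcreate -l 100%FREE -n clouddata cloudvg
sudo mkfs.ext4 /dev/cloudvg/clouddata
sudo mount /dev/cloudvg/clouddata /mnt
# Copy the existing Nextcloud data over, preserving ownership and permissions
sudo rsync -a /cloud/ /mnt/
# Remount the logical volume on /cloud (and put it in /etc/fstab),
# then wipe the old partition and add it to the same volume group:
sudo pvcreate /dev/sdc4
sudo vgextend cloudvg /dev/sdc4
sudo lvextend -l +100%FREE /dev/cloudvg/clouddata
sudo resize2fs /dev/cloudvg/clouddata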

I'm not as experienced as you; if I make one mistake, the whole server will fail and I can't do anything, since backups are not an option on my server (it's paid).

Can you please look at this query of mine? It will only take a minute. I have multiple drives in the server, but it only stores on the / one. How can I use the remaining two drives for storage?

I tried, but failed at the very first step, as nothing is returned when typing pvs, vgs and lvs.

This comes from fdisk -l

If this is your complete output of fdisk -l, then you only have 2 disks, which are (probably) running RAID1. This means both disks are already in use.

What is the output of cat /proc/mdstat?

Here you go -

I got around this at the OS level using Mergerfs to make two directories look like one. The two directories can each be on any disks you want.

I moved the original Nextcloud data directory to a new name, then used Mergerfs to create a merged directory with the original name. The files are then spread over the two directories.
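As a sketch, the fstab line for such a merge can look like this, reusing the /cloud1, /cloud2 and /cloud names from the mhddfs post above (the option set is an assumption, adjust to taste):

# files created through /cloud land on whichever branch has the most free space (mfs policy)
/cloud1:/cloud2  /cloud  fuse.mergerfs  defaults,allow_other,category.create=mfs  0  0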

Although the above link refers to Arch Linux, I was actually doing it on Raspbian, so I think it should work on your setup too. These are just good instructions.

NB: I did have a problem with systemd, because Apache was starting before Mergerfs. This resulted in Apache writing a small file into the merge point on boot, which appears empty before Mergerfs starts. This then stopped Mergerfs from mounting onto the directory, because it needs it to be empty. The solution was to make sure systemd started Mergerfs before Apache.
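One way to do that, as a sketch: a systemd drop-in that makes Apache wait for the merged mount. The /cloud path is an assumption, this presumes the mergerfs mount is defined in /etc/fstab so systemd generates a mount unit for it, and on Debian/Raspbian the Apache unit is apache2.service:

sudo systemctl edit apache2.service
# in the editor, add the following drop-in and save:
[Unit]
RequiresMountsFor=/cloud
# RequiresMountsFor pulls in and orders apache2 after the mount unit for /cloud,
# so Apache only starts once the mergerfs mount is in place.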