Sync 500000 files from an external USB Drive

I downloaded 1.3TB of data from Dropbox using their sync app to an external USB disk. It took 24 hours.

Now I’m attempting to sync it to Nextcloud via the Desktop sync app on Windows. ETA is 10 days.

This is the behaviour:

It scans all of the files and creates a db file. That takes a while.
Then Nextcloud freezes for a while before it starts to upload.
Of course files are all different sizes from tiny to large.
It uploads at about 13mbps since it didn’t get to the larger files yet.
Then at some point the disk is overwhelmed by small reads and the app stops; later it says it was a temporary network failure. When the disk calms down the app resumes, scans all of the remaining 479000 files and starts the same cycle again.

In 24 hours it uploaded 150GB

Basically it keeps stopping and resuming.
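
As a rough cross-check of those numbers, here is a minimal sketch of the arithmetic. It assumes decimal units (1 GB = 10^9 bytes) and that "mbps" means megabits per second; the function names are made up for illustration.

```python
def average_mbps(bytes_moved: float, seconds: float) -> float:
    """Average throughput in megabits per second (decimal units)."""
    return bytes_moved * 8 / seconds / 1e6


def eta_days(total_bytes: float, bytes_moved: float, seconds: float) -> float:
    """Days needed to move total_bytes at the observed average rate."""
    bytes_per_second = bytes_moved / seconds
    return total_bytes / bytes_per_second / 86_400


GB = 1e9
TB = 1e12
DAY = 86_400  # seconds

# Figures from the post: 150 GB uploaded in 24 hours, 1.3 TB in total.
rate = average_mbps(150 * GB, DAY)
days = eta_days(1.3 * TB, 150 * GB, DAY)
print(f"{rate:.1f} Mbit/s, {days:.1f} days for the full 1.3 TB")
# prints "13.9 Mbit/s, 8.7 days for the full 1.3 TB"
```

So 150 GB per day is the same thing as the roughly 13mbps the app shows, and it is in the same ballpark as the 10 day ETA once the stalls are counted.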

Connect the usb drive to your Nextcloud server and copy the files over, then run an external scan.
Hopefully it is at least USB 3.0, because older USB is very slow.
Even better, take the disk out of the enclosure and install it in your server to copy the files over SATA.

The second-best option is to use Syncthing or rsync, then run an external scan. You are still limited by the USB connection, but at least you’ll have auto-resume, delta sync, and superior transfer protocols for moving all those files.
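
A minimal sketch of what "rsync, then scan" could look like, written as a small Python helper that only prints the two commands. The paths, the user name `alice` and the `www-data` web user are hypothetical and depend on your install; the rsync flags and `occ files:scan --all` are standard, but check them against your versions before running anything.

```python
import shlex


def rsync_command(source_dir: str, dest: str) -> list[str]:
    """Build an rsync call that can be re-run to resume an interrupted copy."""
    return [
        "rsync",
        "-a",                # archive mode: recurse, keep mtimes and permissions
        "--partial",         # keep partially copied files so a re-run can finish them
        "--info=progress2",  # one overall progress line (rsync 3.1 or newer)
        source_dir.rstrip("/") + "/",  # trailing slash: copy the contents, not the folder
        dest,
    ]


def scan_command(occ_path: str, web_user: str = "www-data") -> list[str]:
    """Build the occ call that registers the copied files in Nextcloud's database."""
    return ["sudo", "-u", web_user, "php", occ_path, "files:scan", "--all"]


# Hypothetical paths: adjust to your own mount point, data directory and user.
steps = [
    rsync_command("/mnt/usb/Dropbox", "/var/www/nextcloud/data/alice/files/"),
    scan_command("/var/www/nextcloud/occ"),
]
for step in steps:
    print(shlex.join(step))  # print only; run the lines by hand once they look right
```

Files copied straight into the data directory usually also need to be owned by the web server user before the scan picks them up.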

Sounds normal for USB 2.0 transfer
Or, could be your Nextcloud is hosted on microsd.
Hardware is definitely a concern.

What are your actual server specs? What kind of hardware?

  • CPU type
    • arm64 or x86-64
    • are you using a Raspberry Pi? If so, it supplies only the bare minimum specs as of Model 4.
  • How much RAM
    • Less than 4 GB of RAM for Nextcloud by itself is horrible.
  • How fast is your network speed?
    • If only xx mb/s of speed, you are hurting.
    • Please do not run your server over wifi. Very very slow.
    • Upgrade your network to wired gigabit if you can. Will greatly increase speeds.
    • External networking speeds will be hampered by your actual ISP speeds.
  • What kind of disks?
    • Spinning SATA: fair
    • USB 2.0 external spinning drive: garbage
    • Micro-SD card: garbage
    • eMMC: not good
    • USB 3.0 external drive: fair for light usage
    • NVMe: good!
    • SSD: good!
  • How many users?
  • Are you using one of the office suites?
    • Collabora CODE, OnlyOffice, etc?
    • If so, budget at least two additional cores and an additional 4 GB of RAM; anything less is horrible.
  • Are you running other services on the same machine?
    • Matrix, Jellyfin, Jitsi, ?
    • If yes, you’ll need even more ram, etc.

I think upload is generally not as fast as download, but 10% does not seem OK. I would sometimes like to see more performance as well. It feels like Nextcloud is slower than other cloud providers.

That is OK as a workaround; nevertheless, I wish Nextcloud were faster.

Are there performance comparisons between other clouds and Nextcloud as well as between Nextcloud and WebDAV native?

Have you considered using “Dropbox integration” app to directly pull data to your NC server? (I never used it myself, but it might be a viable option)

From the connection perspective, if your Windows machine is on wireless, the download was already slow. You saved it to a slow drive and now you are sending it over wireless back to the server. It will take some time… Find an Ethernet cable and hardwire your computer to the network for the transfer (these are all assumptions, but you gave us very little info).

As previously mentioned, if you have access, stick the USB in the server and pull the files directly then scan the files in.

Hey thanks, I did try the migration app. It does work for a second, but then it also freezes after a while. It actually times out for an hour before it restarts, but when it resumes there’s no log of what it synced, and I can’t trust that it moved all files. It would probably have taken a month.

I also used Synology Cloud Sync to sync Dropbox. It synced fast, but about 24000 files didn’t have modification dates and the Nextcloud app wouldn’t upload them. So then I tried to script it and edit the mtimes, but lots of files have long names and contain whitespace, so I gave up on that also.
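
For anyone hitting the same mtime problem: long names and whitespace stop being an issue if the script never passes file names through a shell. Below is a minimal sketch in Python. It assumes that "no modification date" shows up as an mtime at or near the Unix epoch (1970); the function name and the one-day cutoff are made up for illustration, so try it on a copy first.

```python
import os
import time


def fix_missing_mtimes(root, fallback=None, cutoff=86_400.0):
    """Give files whose mtime is at or before `cutoff` (seconds since 1970)
    a new mtime of `fallback` (default: now). Returns the paths it changed."""
    fallback = time.time() if fallback is None else fallback
    fixed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Paths are handled as plain strings, so spaces and long names
            # need no quoting or escaping.
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            if info.st_mtime <= cutoff:
                # Keep the access time, replace only the modification time.
                os.utime(path, (info.st_atime, fallback))
                fixed.append(path)
    return fixed
```

Calling `fix_missing_mtimes("/mnt/usb/Dropbox")` (hypothetical path) would stamp the affected files with the current time; pass `fallback=` to use a fixed timestamp instead.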

I’m using rsync now to upload

Hey so I am actually using rsync now. I gave up on the app because it crashes and takes a long time to resume. It isn’t that the USB disk is slow: it’s fast, about 500mbps or so when it’s at 100%. But the reads that overload it aren’t transferring any data; they are just reads so the Nextcloud app can build its list of files to transfer, for that local database file that gets created, I assume. I think this because the disk is at 100% and the reads are in kilobytes. No data is even being uploaded, the app is just preparing to sync. It says file 0 out of, it takes a very long time to reach 450000, and on top of that it crashes and starts from scratch.

The server has 2 physical cores. I’ve downloaded real files on it at 500mbps. The server isn’t here; I’m uploading from a Dell OptiPlex with USB 3.0, because Dropbox wouldn’t allow syncing to an SMB share.

Of course there are probably other factors too. For example, when I count files on the server I run “find /folder | wc -l” and it takes one second. But when I ran that command on the Desktop, which is Windows with the Linux Subsystem, the disk went to 100%, the CPU core temp went to 54 degrees and the fans sounded like an aeroplane taking off. Rsync uses just 1% of the disk locally and 12% of the CPU on the server. It runs at around 30mbps, but it did get to 102mbps once. It’s mostly 40 GB PDF files and then some 1 MB photos etc.

So I guess the problem isn’t the file transfer itself, as rsync does that well; it’s building the database and the list of files. The disk gets overloaded the same way with the Nextcloud app or with the find command.
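
To compare disks on the listing step alone, a metadata-only walk can be timed on its own. This is a minimal sketch, roughly equivalent to `find /folder | wc -l` (it differs for names containing newlines); it never opens a file, so any disk load it causes comes from directory reads. The function names are made up for illustration.

```python
import os
import time


def count_entries(root):
    """Count `root` plus every file and directory below it,
    reading directory listings only and never opening a file."""
    total = 1  # find also prints the starting folder itself
    stack = [root]
    while stack:
        try:
            with os.scandir(stack.pop()) as entries:
                for entry in entries:
                    total += 1
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            # Unreadable folder: skip it instead of aborting the whole count.
            continue
    return total


def timed_count(root):
    """Return (entry count, seconds taken) for one walk of `root`."""
    start = time.perf_counter()
    total = count_entries(root)
    return total, time.perf_counter() - start
```

Running `timed_count` on the USB disk and on an internal disk gives an entries-per-second figure for each, which is closer to what the sync client's discovery phase depends on than raw MB/s.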

I can’t really tell what would happen if the disk wasn’t a USB disk, but I just ran the ‘find’ command on C:/Program Files and the regular HDD only got to 20%.

So maybe that is a bottleneck, but even so rsync should save me 6 days. At this rate it will complete in 3.7 days, whereas the Nextcloud app would take over 10 days.

The main reason I wanted to use the Nextcloud app was that it told me which files had leading whitespace so I could rename them, and which names contained characters that aren’t allowed on Nextcloud.
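
That check can be approximated before uploading with rsync. The sketch below flags names with leading or trailing whitespace and names containing the characters Windows does not allow in file names. Treating that set as "what Nextcloud would reject" is an assumption; the real list depends on the server and client version. The function names are made up for illustration.

```python
import os

# Characters Windows forbids in file names. Using this as the Nextcloud
# rule is an assumption; adjust the set to match your server's settings.
FORBIDDEN = set('\\/:*?"<>|')


def name_problems(name):
    """Return a list of human-readable problems with one file or folder name."""
    problems = []
    if name != name.lstrip():
        problems.append("leading whitespace")
    if name != name.rstrip():
        problems.append("trailing whitespace")
    bad = sorted(set(name) & FORBIDDEN)
    if bad:
        problems.append("forbidden characters: " + " ".join(bad))
    return problems


def find_problem_names(root):
    """Yield (path, problems) for every entry under `root` that needs renaming."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            problems = name_problems(name)
            if problems:
                yield os.path.join(dirpath, name), problems


print(name_problems(" report.pdf"))  # prints ['leading whitespace']
```

Looping over `find_problem_names("/mnt/usb/Dropbox")` (hypothetical path) gives a rename list up front instead of discovering the names one failed upload at a time.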

I guess if rsync maxes out at 30mbps, that’s it; it’s not really an nc desktop app issue. It could be the USB disk, or just the fact that the files are small and not many sequential writes happen, like they do when you have movie files.

I’ll let you know how long it takes occ files:scan to scan 500000 files in 3 days or so 🙂

Dropbox uses webdav also, but I think they have figured out a way to download multiple files at once. Nextcloud does one file at a time; the desktop app, I assume, also uses webdav. The Seafile sync client, for example, uses a completely different protocol, but I can’t use that here.

Maybe your Nextcloud is just poorly configured. You could try using an alternative Nextcloud. You can register for Shadow Drive here and get 20 GB for free and then use https://drive.shadow.tech with your credentials. I think it is really fast. Also, you can get a test account from Nextcloud GmbH at try.nextcloud.com. If you have a link you can keep creating new 60-minute test accounts. 😉

Sure, but it is also the hardware. A 30mbps max when using rsync says it all.
Nextcloud (webdav) is fully dependent on your hardware and networking setup.
Also, fact remains that webdav performs poorly with large amounts of files.
It is an open protocol, which is universally supported while also performing on the slower side.
Developing improvements for it is laudable and difficult.

I can’t truly test the hardware from my location. I thought it was the nc app at first, but it’s probably my location.

I can wget an openbsd iso file at 600mbps on that server. The server is two physical cores, not a VM I rented that’s limited by the disk speed.
I can sftp a file from a server in Cali all the way to WA state at 400mbps.
But I’m in Europe and all I get is at most 100mbps with rsync, 30mbps on average.
So I don’t actually know the real speed.
My latency from here is about 160ms, so Rsync is sometimes at 16 and sometimes at 80mbps.
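
One possible reason the speed swings so much at 160ms is the bandwidth-delay product: a single TCP stream cannot move more than one window of data per round trip. The sketch below is only that textbook relation (throughput ≤ window / RTT), assuming rsync runs over a single stream and "mbps" means megabits per second; it does not measure anything on the actual link, and the function names are made up for illustration.

```python
def window_bytes(mbps: float, rtt_ms: float) -> float:
    """Bytes that must be in flight to sustain `mbps` at a given round-trip time."""
    return mbps * 1e6 / 8 * (rtt_ms / 1000)


def max_mbps(window: float, rtt_ms: float) -> float:
    """Upper bound on single-stream throughput for a given window size in bytes."""
    return window * 8 / (rtt_ms / 1000) / 1e6


RTT_MS = 160  # latency quoted in the post

for rate in (30, 100, 400):
    needed = window_bytes(rate, RTT_MS)
    print(f"{rate} Mbit/s at {RTT_MS} ms needs about {needed / 1e6:.1f} MB in flight")
```

By this estimate 30mbps corresponds to about 0.6 MB in flight and 100mbps to about 2 MB, so buffer and window limits somewhere along the path could cap a long-distance transfer well below what the server manages with nearby peers.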

Rsync is still 3 to 10 times faster than the nc desktop app.