Very slow sync (for small files)

I have connected my Raspberry Pi and my Mac to a GbE switch (cabled Ethernet), so I have the fastest connection I can have at home. Still, it takes a long time to sync files. Yesterday evening I left it at 82 MB synced; this morning I woke up and it had done 150 MB.

I’m a developer, and I noticed it is currently syncing the folder in which I keep the source code for the applications I develop. Hence I have lots of very small files (<50 KB) as part of each application. Could this be a reason?

In total I have about 50 GB to sync; fair to say that the first 40 GB went quite fast. It’s only very recently that it got so slow, hence I believe it has to do with the many small files. However, I’m not sure…

Any way to speed up things?

This issue comes up every now and then.

I am pretty sure that if you look at the reason for that slow sync, you will see that it is your DB, or more specifically your HDD, working at 100% constantly.
The only workaround I found is to set innodb_flush_log_at_trx_commit = 2, which can result in data loss on power failures and the like. It has never caused a problem for me, though.
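For reference, this setting goes in the MySQL/MariaDB server configuration. A minimal sketch (the file path varies by distro; the `[mysqld]` section name is standard):

```ini
; /etc/mysql/my.cnf, or a drop-in file under conf.d/
[mysqld]
; 2 = write the InnoDB log to the OS on each commit, but fsync only
; about once per second. Faster than the default (1), at the cost of
; losing up to ~1 second of transactions on a power failure or OS crash.
innodb_flush_log_at_trx_commit = 2
```

Restart the MySQL/MariaDB service afterwards for the change to take effect.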

While MySQL might have some impact on the synchronization speed, the real underlying reason for the slow sync speed is simply the poor design of the sync client. This hasn’t changed in years.

Good synchronization services normally first pack everything into one archive, which is able to saturate the available upload bandwidth, and then extract it again on the server.

The official sync client of Nextcloud (and formerly ownCloud) only uploads one file after another using HTTP PUT, and that’s it. At one point people suggested uploading multiple files at once, but it seems that’s not enabled by default, and it might cause issues with big files.
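A back-of-the-envelope sketch of why one PUT per file hurts: each request pays a fixed overhead (TLS/HTTP round trips, server-side bookkeeping) regardless of file size. The numbers below are illustrative assumptions, not measurements of the actual client:

```python
# Rough model: total sync time = raw transfer time + per-request overhead.
def sync_time_s(n_files, avg_size_kb, per_request_overhead_s, bandwidth_mbps):
    # Payload time: total kilobits divided by bandwidth in kilobits/s.
    transfer_s = (n_files * avg_size_kb * 8) / (bandwidth_mbps * 1000)
    # Fixed cost paid once per file, independent of its size.
    overhead_s = n_files * per_request_overhead_s
    return transfer_s + overhead_s

# 10,000 files of 50 KB on a 1 Gbit/s link with 100 ms overhead per request:
# the payload itself takes ~4 s, the per-request overhead ~1000 s.
total = sync_time_s(10_000, 50, 0.1, 1000)
```

With numbers like these, the link speed is almost irrelevant; the request count dominates, which is why many small files feel so slow.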

So this means that, in the end, syncing is OK if you

  • have few files, or
  • just upload a photo now and then, or
  • have plenty of time and time doesn’t matter to you.

If you really do need to synchronize thousands of small files at once, though, you’d better take your time or start looking for another tool to do it.

By the way, the Ethernet port of the Raspberry Pi is not a native Ethernet adapter but a USB-to-Ethernet bridge, and will therefore never be able to saturate 1 Gbit/s.

If you really need to sync many small files as a developer, you should start using a tool made for this purpose by developers, for developers: namely Git. Even Microsoft now uses it internally for the development of Windows.


I have the same issue: a lot of directories and files, which add up to a lot of data.
But wait… a lot? 5 to 50 GB?

Few files and a photo now and then make the concept of Nextcloud useless. I can do that by hand.

I think there is a design issue.

For example, my newly created sync is blocking all other pre-existing syncs.
My newly created large sync starts over from the beginning, checking every single folder for changes.

I know this is an old topic, but new ones keep appearing along the same lines.
And apparently it is still an unsolved topic.

What about push notifications on file changes?
What about doing it in the background? In parallel? (The Mac might spin up its CPU and fan, but it already does that on its own with the md* services.)

What about scheduling?

Or, for example, push the last changed file immediately and delay large syncs.

Right now my desktop client is listing every single folder I have in a sync:
“checking for changes in XXXX”

Sorry for the rant, but I would really like Nextcloud to be a working solution.

Thanks & best


Are there any plans to improve this design? Uploading files one by one is a significant drawback compared to the non-free storage solutions. With the costs (time + $) of self-hosting, there don’t seem to be many benefits other than ideological ones to using Nextcloud/ownCloud.


Afaik, the Nextcloud client is currently being revised and a new sync mechanism is being implemented. You will find further information here:

That’s a nice feature to have, but you lose out on having those extra backups lying around. I don’t like the tone of the article that makes it sound like they might abandon sync because a thin-client is “fundamentally the superior approach”. If my server dies, I can rest easy knowing all of the files are still safe on my computer. Or my other computer. Or the other one.

I’d rather have a series of clients that could handle diverse kinds of loads, with the kind of optimizations CaptainProton mentions. Can we make the fundamentals work? Although it is good to see some efforts happening on the client side.

I’m really wondering if I am missing the point here. The article linked above shows a screenshot where you can see the option to make files/folders available offline:

So when the server dies, the files are still on the devices where you made them available offline.

The point they are making in the article is that you don’t necessarily need to store all files on your devices and consume disk space. You get easy access to files and folders without consuming disk space if you want, but you can still decide to keep the urgently needed files on all the devices you want.

So concluding, you now get a choice you didn’t have before.

Before: sync or don’t sync. If not synced, use another solution/software/mechanism to access the files.
Now: online and offline files are all available and accessible in one place (a virtual drive). Make files available offline if desired. This brings the NC client closer to the OneDrive solution, which integrates into Windows better than the old NC desktop client.

I’m not saying this is the absolutely perfect solution for everybody; I just want to point out why I don’t understand the worry about a dead server and unavailable files.
I can relate to complaints that you always see the full folder structure from the server in the virtual drive and can’t choose precisely which folders to see in your Explorer. Maybe there are solutions for that too, however.

Time to necropost since I’m now performing a sync on a large number of small files.

Can’t there be some sort of batching that groups a bunch of small files into one HTTP request, thus cutting down on the huge overhead of individual requests for each file? I’m working with a whole slew of <1 KB files that need to get synced, and it’s taking ages. I can only imagine there would be similar issues for any file size up to about 5 MB… It’s not like there aren’t some tiny, very good quality zip libs that could be included in the client to make grouping easier, with the additional benefit of cutting down on transferred size due to compression. Or heck, just pile a batch of files into a tarball and send 'em that way.
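The tarball idea above is easy to sketch. This is a hypothetical client-side helper, not anything the real Nextcloud client does: it groups (name, bytes) pairs into gzipped in-memory tarballs capped at a payload budget, so each HTTP request could carry many files:

```python
import io
import tarfile

def batch_small_files(files, max_batch_bytes=5 * 1024 * 1024):
    """Yield gzipped tarballs, each holding files whose combined
    uncompressed size stays within max_batch_bytes.
    `files` is an iterable of (name, bytes) pairs."""
    batch, batch_size = [], 0
    for name, data in files:
        # Flush the current batch before it would exceed the budget.
        if batch and batch_size + len(data) > max_batch_bytes:
            yield _make_tar(batch)
            batch, batch_size = [], 0
        batch.append((name, data))
        batch_size += len(data)
    if batch:
        yield _make_tar(batch)

def _make_tar(members):
    """Pack (name, bytes) pairs into a gzipped tar held in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

The server side would unpack each tarball and register the files in one pass, amortizing the per-request cost over the whole batch.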

I am in the same boat. I’m moving from Resilio Sync (formerly BitTorrent Sync) to Nextcloud. Resilio/BitTorrent never had issues syncing my git repos (yes, they are git repos, and I’m also syncing them with my server), but now Nextcloud is having a heck of a time syncing a few hundred MB. While I understand syncing one file at a time for larger files, why not chunk the bigger files into pieces of a certain size and group the smaller files, or just upload several files at a time that are under 1 MB? Even uploading large files was much slower with Nextcloud than with Resilio.

This worked for me and immediately improved the performance.

I also have a similar issue; until now I still haven’t found a solution to improve the upload speed. :sweat_smile:

A proper fix is to add the following at the bottom of php.ini for FPM, CLI, and CGI:

memory_limit = 10G
post_max_size = 25G
upload_max_filesize = 25G

Then increase chunksize in /var/www/nextcloud/config/config.php:

'chunkSize' => '5120MB',

Then you need to restart php-fpm:

systemctl restart php7.4-fpm

Simply replace the 7.4 with your version number, which can be found with php -v or php -i.

So far desktop sync sped right up. Still testing to see if Android app finally can finish its uploads.

I also added the preview image cacher plugin, ran the initial run, and added the cron job. Next I will offload my MySQL, Redis, and ClamAV to a second Raspberry Pi. That should take a lot of work off the Pi running Nextcloud.

I wish there were a plugin or a way to add a secondary Pi to offload the primary Pi’s resources for Nextcloud, so you could add more processing power from another Pi. It would be cool to have a way to set up master/slave Nextcloud servers, so you can have multiple servers serving Nextcloud.