Nextcloud performance expectations

Hello Nextcloud people!

I want to start by saying thanks for making this great open source tool. Whether or not I can end up using it I appreciate what you are doing!

I have around 100 GB of data in 1,000,000 files. Is it possible to sync this in 24 hours with a regular PC as a server? Or is that outside the scope of Nextcloud?

Thank you

Old text removed to make the question clearer
I am hitting some performance issues. I have searched a little for info about this, and I noticed most of the posts I found are quite old (2016/17 etc.). So before I get to my problem, I first want to check whether what I am aiming for is sensible. I have around 100 GB of data in 1,000,000 files. For the server I have a computer with specifications around a “normal desktop”, i.e. not a Raspberry Pi and not a multi-CPU server / cluster. It has 16 GB RAM, for example. The server is dedicated to Nextcloud; it runs nothing else. Is there any way I could get “reasonable” sync times, for example 1 day?

As a P.S.: It would be really cool to post some very basic stats up. For example maybe a page that says "
Performance examples:
Tested performance on Raspberry Pi: … ,
Tested performance on [standard server, maybe EC2 or whatever]: … ,
Highest ever tested performance : …
"
I know you have so much work already. So I am not expecting that to be done. But it is just a suggestion in case it’s helpful! And if you did that already but I couldn’t find it, I am sorry.

Thank you so much for reading.

You will have to identify your performance bottleneck. You can get some basics under Settings -> Monitor. If the issue isn’t clear then you may need to hit the console and use top or vmstat. Of course a bottleneck outside the box is also a possibility.

Hello Karl

Thank you for your reply. In this question I am not trying to improve the performance. I am just asking has this kind of performance ever been achieved? (Is it a reasonable expectation?)

If it is 10x the highest known file upload rate ever achieved on Nextcloud, for example, I will use something else. But if it is within commonly achieved rates then I will of course diagnose the problem :slight_smile:

For performance, the most critical resource is probably RAM. I run it in a virtual machine with 6 GB of RAM (if more is needed, I can increase this) and my user has > 90 GB in 20k files. Syncing works great so far on different clients, even shared over federated sharing. However, some code files are a bit tricky, as are password containers; sometimes they create conflicts. I normally don’t sync everything on each client, just certain folders. And I am not alone; others are using it as well.
Don’t forget to use Redis and to configure your database cache properly.

A lot of very small files don’t sync very fast. In general, I think it should be doable, it could be better to not sync everything at once and add folders slowly.

I just found a topic again from some time ago (ownCloud 9), where I managed to upload more than 1000 files / min (very small files).

On a Raspberry Pi it was less. This was all in oC 9; development has moved on since then, so hopefully it is even a bit better now. So 24h is probably a good guess.
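As a rough back-of-the-envelope check of that guess, here is a small Python sketch. The file count and data size come from the question, and the 1000 files / min figure is the ownCloud 9 number mentioned above. That this rate can be sustained for the whole run is an assumption, not something measured.

```python
# Figures taken from this thread. A constant, sustained rate is an assumption.
TOTAL_FILES = 1_000_000
TOTAL_BYTES = 100 * 10**9      # 100 GB, decimal units
FILES_PER_MINUTE = 1000        # reported for very small files on ownCloud 9
TARGET_HOURS = 24


def hours_needed(total_files, files_per_minute):
    """Hours to sync total_files at a constant files-per-minute rate."""
    return total_files / files_per_minute / 60


def required_files_per_minute(total_files, target_hours):
    """Constant rate needed to finish total_files within target_hours."""
    return total_files / (target_hours * 60)


estimate_hours = hours_needed(TOTAL_FILES, FILES_PER_MINUTE)
required_rate = required_files_per_minute(TOTAL_FILES, TARGET_HOURS)
average_file_kb = TOTAL_BYTES / TOTAL_FILES / 1000

print(f"Estimated time at {FILES_PER_MINUTE} files/min: {estimate_hours:.1f} h")
print(f"Rate needed for a {TARGET_HOURS} h sync: {required_rate:.0f} files/min")
print(f"Average file size: {average_file_kb:.0f} kB")
```

Under these assumptions the estimate comes out at a bit under 17 hours, and a 24 hour target needs roughly 700 files / min, so the target is in the same range as the reported rate rather than far beyond it.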

You could imagine more improvements, like joining smaller files and reducing the number of SQL queries, but such improvements have not been implemented in the client (yet). Unfortunately, the client can’t recognize whether files are already present and identical to those on the server, so they have to be transferred through the client-server connection at least once (otherwise you could just copy them there using direct and faster file transfers like scp).

And it’s not only about the server. There are a lot of nuts and bolts involved in such an assumption. I have a 2015 MacBook Pro which, for the life of me, doesn’t manage to achieve download rates higher than 20 MB/s on our company network. A newer one achieves 90+ MB/s with the very same configuration.

Your main goal will be to get your database performant enough, since 100 GB is not the real issue here. So be prepared to stuff your server PC with plenty of RAM. The thing is… you will only be able to see what you have to tweak once you put load on the whole system. Anything before that is reading tea leaves.

Thank you tflidd

That is very encouraging to hear. I have a heavy requirement, so 24 hours is ok if I can achieve that. I am glad to know the best possible is not a month or something; that would be too much.

I actually did look into the kinds of suggestions you mentioned: can WebDAV support bulk uploads, can Nextcloud run FTP, can I initialise the sync externally, etc.

The performance I have is far, far below what you described, so I think something is wrong. Currently I have Nextcloud set up through the Ubuntu Snap. I think that includes the main things like Redis, but I think it’s not the best way to do it for performance (I could be wrong). I think it’s maybe a bit harder to configure too. I am going to try out the Nextcloud VM and see if I can get improvements: https://github.com/nextcloud/vm

There’s quite a comprehensive chapter on how to tune your NC instance in the docs. I’d suggest to go through that as a first step.

maybe you should do the initial upload with a zip/tgz file and unpack it at OS level.
or use rsync/winscp?

after you uploaded all files you run occ files:scan --all and go to bed.

problem: if you connect a desktop client and sync, everything will be downloaded. that may take a while.

if you want to test this and you’re familiar with cloud-init/ec2/digitalocean/scaleway have a look at this:

you should have a test server in 10 minutes. :wink:

then git clone the linux kernel several times and you have a lot of test files. this could also be done with ansible or in a bash for-loop.
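If cloning the kernel is not convenient, a similar pile of small test files can be generated locally. The following is only a sketch in Python: the function name, directory layout, default file size and file names are all made up for illustration, and random content is used so the data does not compress away.

```python
import os
from pathlib import Path


def make_test_files(root, n_files, size_bytes=4096, files_per_dir=1000):
    """Create n_files files of size_bytes random bytes under root.

    Files are spread over subdirectories of at most files_per_dir entries,
    so that no single directory becomes huge. Returns the number created.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    root = Path(root)
    for i in range(n_files):
        subdir = root / f"dir_{i // files_per_dir:05d}"
        if i % files_per_dir == 0:
            # First file of a new bucket: make sure the directory exists.
            subdir.mkdir(parents=True, exist_ok=True)
        (subdir / f"file_{i:07d}.bin").write_bytes(os.urandom(size_bytes))
    return n_files


# Example (writes about 40 MB into ./nc_testdata):
# make_test_files("nc_testdata", 10_000)
```

Pointing the sync client at the generated folder and timing the run gives a files-per-minute figure that can be compared with the numbers reported in this thread.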

Thanks budy. If I can achieve 20 MB/s including my small files that will be a dream. I have 16 GB RAM on the server, and it is doing nothing else. I think that’s ok? RAM and CPU usage on the server are very low. Disk access seems quite low. Since I now know that it can achieve 1000 files / min, I am happy to do some proper diagnostics :slight_smile: I think that is the next step.

Yes - try to get a baseline, performance-wise, for your initial setup. Then go through the tuning section of the docs. Usually, there’re two bottlenecks:

  1. the database: if running MySQL, use mysqltuner.pl after a day or two of regular use
  2. the filesystem: lots of small files are usually bad with most filesystems; they need the lowest file-access latency you can safely manage. The more drives you dedicate to this issue, the more performance you will achieve. If this is really crucial on a day-to-day basis, invest in some SSD or NVMe storage.

Hi Reiner. In terms of uploading, I can just walk across with the files to my server cupboard :slight_smile: But yes, it is the download that I am concerned about as well.

But your comment about zips gave me a great (or obvious really) idea. Why don’t I zip up my directories and test the speed with large files first. Then I will know if it is data size or file numbers that is the problem. So thank you for the help!
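A minimal Python sketch of that test preparation, in case it is useful to anyone else: it packs a directory into a single .tar.gz and reports how many files and bytes went in, so the same data can be synced once as many small files and once as one large file. The function name and paths are hypothetical, and the archive should be written outside the source directory so it does not end up inside itself.

```python
import shutil
from pathlib import Path


def pack_directory(src_dir, archive_base):
    """Pack src_dir into <archive_base>.tar.gz.

    Returns (archive_path, file_count, total_bytes) for the source files.
    archive_base must point outside src_dir.
    """
    src = Path(src_dir)
    files = [p for p in src.rglob("*") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    # shutil.make_archive appends the ".tar.gz" suffix for the "gztar" format.
    archive = shutil.make_archive(str(archive_base), "gztar", root_dir=str(src))
    return Path(archive), len(files), total_bytes


# Example with made-up paths:
# archive, count, size = pack_directory("photos", "/tmp/photos_packed")
# print(archive, count, size)
```

If the single archive syncs at close to network speed while the unpacked directory does not, that points at per-file overhead rather than data volume.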

Btw, I have used @Reiner_Nippes’ install for my CentOS box at work and it really works very well… if you want to get a test instance up fast, try that, maybe even in a VM.

…it will be your small files for sure…! :wink:

Thanks budy. I am running on SSDs. If I can find out what the problem is I will try to let you know.

And thanks about that installation script @budy and @Reiner_Nippes

You know what the funny thing really is? Until my filesystem caches are exhausted, uploading is always way faster than downloading. This is because at upload time the OS will happily aggregate all new data and try to flush it out to storage in big chunks, while it has no such choice when it comes to random reads…

i can give you the answer to that: using one large file (centos.iso) you achieve wire speed (100 MB/s on Gbit links). It’s the database access for each file that will kill you. and then you may want to use this: https://github.com/BMDan/tuning-primer.sh or the postgres equivalent.

(you may find my posting in the forum from last year about some tests @aws.)

I decided to dump my entire iPhone camera roll this morning with the Nextcloud app. It was about 550 photos (had not been cleaned out in a little bit). It took maybe 15 minutes, and the sync client on my laptop kept pace. I was pleasantly surprised by the sync rate. It outperformed other platforms by my observation.

My home instance is still pretty small. I run it in Docker with an Apache reverse proxy on the host in an ESXi virtual machine that has one CPU core and 4 GB RAM. I have plans to move it to much better hardware in a few weeks, so I expect even that good performance to improve.

There isn’t an end-all performance metric anyone can give you because there are many highly subjective aspects to it. Hardware, OS, configuration, network, etc. But, I will say that by my own unscientific observation it appears to outpace Google Drive and Dropbox, at least across my LAN. Syncing one million files is going to take some time on any platform.

In the linked GitHub issue, you’ll see different performance figures, and I also mentioned what I changed. A proper configuration makes a huge difference, and this configuration depends a lot on your system. I didn’t use a snap or docker setup, so I’m not sure what’s preconfigured there and what you can change. I used a default Debian/Ubuntu system.

Do run some tests with very small files in the sync folder: https://github.com/owncloud/core/issues/11588#issuecomment-169728089

As reference, I used the down-/upload speed via scp to make sure there are no network issues.

if you use my playbook and want to fine tune everything.

look into the roles folder. there you find prep_something. e.g. you want to change the nginx settings it’s prep_nginx. and then search the files and/or template folder for config files. in default and vars folders are normally variable definitions. e.g. prep_mariadb/default/main.yml -> mariadb_version: 10.2


i think it’s more or less easy to find all of these files.

@tflidd using a docker based system with your own config is also easy. you create the config files on the host (line 4-18) and map them into the docker container (30-32).

how to find the files for tuning? It’s normally in the docs or dockerfiles -> https://hub.docker.com/_/nginx (Complex configuration)