Nextcloud slowing down on folders with lots of files

Nextcloud version (eg, 12.0.2): 16.0.1
Operating system and version (eg, Ubuntu 17.04): Ubuntu 18.04.2 LTS
Apache or nginx version (eg, Apache 2.4.25): nginx 1.14.0
PHP version (eg, 7.1): 7.2

The issue you are facing:
I installed a fresh copy of Nextcloud on a high-performance Hetzner dedicated server (i7, 64 GB RAM, 2 TB NVMe SSD), and I’m seeing major slowdowns whenever I’m working in a folder with lots of files (about 5000 in a folder, mostly images). The image preview gets really slow and takes a few seconds to load, and loading the folder as a whole takes a long while as well.
I noticed that in these slowdown situations, Nextcloud always downloads a 4.4 megabyte XML file, which I assume lists all the files and their properties. But why does it download the whole thing all at once? And even weirder, why does it do so whenever I click on an image to see the full version?

Steps to replicate it:

  1. Have a ton of images/files in your folder

The output of your Nextcloud log in Admin > Logging:

2019-06-26T12:02:06+0200
Error	jsresourceloader	Could not find resource core/vendor/marked/marked.min.js to load	
2019-06-26T08:12:12+0200
Error	jsresourceloader	Could not find resource core/vendor/marked/marked.min.js to load	
2019-06-26T08:12:06+0200
Error	jsresourceloader	Could not find resource core/vendor/marked/marked.min.js to load	
2019-06-26T08:12:02+0200
Error	jsresourceloader	Could not find resource core/vendor/marked/marked.min.js to load	
2019-06-26T08:12:01+0200
Warning	core	InvalidArgumentException: The reset email cannot be sent. Please make sure your username is correct.	
2019-06-26T03:18:38+0200
Warning	core	Login failed: 'abcdef' (Remote IP: 'x.xxx.xx.xxx')	

The output of your config.php file in /path/to/nextcloud (make sure you remove any identifiable information!):


<?php
$CONFIG = array (
  'instanceid' => 'xxxxxxxxxxx',
  'passwordsalt' => xxxxxxxxxxxxxxxxxxxxxxxxxxx',
  'secret' => 'xxxxxxxxxxxxxxxxxxxxxxxxxxxx',
  'trusted_domains' =>
  array (
    0 => 'xxxx.xxx',
  ),
  'datadirectory' => '/opt/nextcloud',
  'dbtype' => 'mysql',
  'version' => '16.0.1.1',
  'overwrite.cli.url' => 'https://xxxx.xxx',
  'dbname' => 'xxxxxxxx',
  'dbhost' => 'localhost',
  'dbport' => '',
  'dbtableprefix' => 'oc_',
  'mysql.utf8mb4' => true,
  'dbuser' => 'nextcloud',
  'dbpassword' => 'xxxxxxxxxxxxxx',
  'memcache.local' => '\OC\Memcache\APCu',
  'memcache.distributed' => '\OC\Memcache\Redis',
  'memcache.locking' => '\OC\Memcache\Redis',
  'redis' =>
  array (
    'host' => '/run/redis/redis.sock',
    'port' => 0,
    'timeout' => 0
  ),
  'installed' => true,
  'maintenance' => false,
  'mail_smtpmode' => 'smtp',
  'mail_smtphost' => 'smtp.xxxxxxxx.com',
  'mail_sendmailmode' => 'smtp',
  'mail_smtpport' => '123',
  'mail_from_address' => 'xxxx',
  'mail_domain' => 'xxxxxxxxxxx.xx',
  'mail_smtpsecure' => 'tls',
  'mail_smtpauth' => 1,
  'mail_smtpauthtype' => 'LOGIN',
  'mail_smtpname' => 'xxxx@xxxxxxxxxxx.xx',
  'mail_smtppassword' => 'xxxxxxxxxxx',
);

The output of your Apache/nginx/system log in /var/log/____:
(this happens when I click on an image)

xx.xxx.xxx.xxx - - [26/Jun/2019:20:51:16 +0200] "PROPFIND /remote.php/dav/files/Moka/ShareX HTTP/2.0" 207 4804772 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:67.0) Gecko/20100101 Firefox/67.0"
xx.xxx.xxx.xxx - - [26/Jun/2019:20:51:17 +0200] "GET /core/preview?fileId=114123&x=1278&y=1312&a=true HTTP/2.0" 200 425669 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:67.0) Gecko/20100101 Firefox/67.0"
xx.xxx.xxx.xxx - - [26/Jun/2019:20:51:17 +0200] "GET /core/preview?fileId=114092&x=1278&y=1312&a=true HTTP/2.0" 200 2123 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:67.0) Gecko/20100101 Firefox/67.0"
xx.xxx.xxx.xxx - - [26/Jun/2019:20:51:17 +0200] "GET /core/preview?fileId=114096&x=1278&y=1312&a=true HTTP/2.0" 200 25096 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:67.0) Gecko/20100101 Firefox/67.0"

Obviously, parsing a 5 MB XML file takes a long time. So why does it have to be so large in the first place?

This is the kind of use case developers hate. (Note: I’m not a dev, just an admin with a couple of hundred users.)

First: 5000 of anything is going to take time. Nextcloud makes thumbnails, so your 5000 files turn into many, many file operations (check if the file is still there, check if it’s changed, check if we have a thumbnail, make one if we don’t … times 5000!). Even a high-end NVMe SSD is going to be hard pressed to do more than 100k IOPS.

The XML file will be large … because XML is stupid. And it’s slow to parse because it’s large, and to ‘correctly’ read an XML file you need all of it before you can even start parsing it. A JSON file would be smaller and faster, but would probably require a nontrivial amount of code changes to implement.
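
To put a number on "large", here is a rough back-of-the-envelope calculation in Python. The byte count is the PROPFIND response size from the access log above; the file count is the approximate figure from the original post, so treat the result as an estimate only.

```python
# Rough per-entry cost of the PROPFIND response.
# response_bytes is taken from the access log; file_count is the
# approximate "about 5000" from the original post (an assumption).
response_bytes = 4_804_772
file_count = 5_000

bytes_per_entry = response_bytes / file_count
print(f"{bytes_per_entry:.0f} bytes of XML per file")
print(f"{response_bytes / 1_000_000:.1f} MB in total")
```

That works out to roughly 1 KB of XML per file, which adds up quickly once a folder holds thousands of entries.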

So, to answer your questions: you need the file on every access so that you get the current file list (if the file list changes and the XML file does not, you don’t see the changes).
You need the whole file, because it’s XML and you need the whole file.
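
If you want to look at that listing outside the browser, it is an ordinary WebDAV PROPFIND with a `Depth: 1` header, which returns one entry per child of the folder. Below is a minimal sketch that only builds such a request with Python's standard library and does not send it. The server name, user and folder are hypothetical placeholders, authentication is left out, and the two requested properties are just an example; asking for fewer properties should give a smaller response per file.

```python
import urllib.request

# Hypothetical placeholders: replace with your own server, user and folder.
BASE_URL = "https://cloud.example.com"
USER = "alice"
FOLDER = "Photos"

# Request only two standard WebDAV properties per entry (example choice).
BODY = b"""<?xml version="1.0"?>
<d:propfind xmlns:d="DAV:">
  <d:prop>
    <d:getlastmodified/>
    <d:getcontentlength/>
  </d:prop>
</d:propfind>"""


def build_propfind(base_url: str, user: str, folder: str) -> urllib.request.Request:
    """Build (but do not send) a Depth: 1 PROPFIND for one folder."""
    url = f"{base_url}/remote.php/dav/files/{user}/{folder}"
    return urllib.request.Request(
        url,
        data=BODY,
        method="PROPFIND",
        headers={"Depth": "1", "Content-Type": "application/xml"},
    )


req = build_propfind(BASE_URL, USER, FOLDER)
print(req.get_method(), req.full_url)
```

Sending it (with credentials added) and timing the response would show how much of the delay is the server producing the list and how much is the browser digesting it.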

I noticed one of your preview images is almost half a megabyte. At worst: 5000 × 0.5 MB = 2.5 GB … which is going to be slow, and huge, and unless your desktop has enough RAM to deal with having that many previews of that size rendered in canvas, the bottleneck might not even be Nextcloud.
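
The same estimate can be redone with the exact numbers from the access log. This is only a sketch: three samples is a tiny basis, the file count is the approximate figure from the original post, and these are the large previews requested when an image is opened, not necessarily the small grid thumbnails.

```python
# Preview sizes in bytes, taken from the three GET /core/preview lines
# in the access log above.
preview_bytes = [425_669, 2_123, 25_096]
file_count = 5_000  # approximate, from the original post

# Worst case: every preview is as big as the largest one seen.
worst = max(preview_bytes) * file_count
# Alternative: every preview is as big as the average of the three samples.
average = sum(preview_bytes) / len(preview_bytes) * file_count

print(f"worst case:   {worst / 1e9:.1f} GB")
print(f"from average: {average / 1e9:.1f} GB")
```

Either way the total lands somewhere between several hundred megabytes and a couple of gigabytes if every preview in the folder were loaded.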

The quickest (and frankly, probably best) solution is to make sub-folders and put your files in those.
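
As a starting point, here is a minimal Python sketch that splits one flat folder into numbered sub-folders. The function name, the `part-001` naming scheme and the default of 500 files per folder are arbitrary choices for illustration, not anything Nextcloud prescribes. It is probably safest to run something like this on a folder synced by the desktop client; if you move files directly inside the server's data directory, Nextcloud will not notice until you rescan with `occ files:scan`.

```python
import shutil
import tempfile
from pathlib import Path


def split_into_subfolders(folder: Path, per_folder: int = 500) -> int:
    """Move the files directly inside `folder` into numbered sub-folders.

    Files are taken in name order; sub-folders are named part-001, part-002, ...
    Returns the number of sub-folders created.
    """
    # Snapshot the file list first so the moves below do not affect iteration.
    files = sorted(p for p in folder.iterdir() if p.is_file())
    created = 0
    for index, path in enumerate(files):
        bucket = folder / f"part-{index // per_folder + 1:03d}"
        if not bucket.exists():
            bucket.mkdir()
            created += 1
        shutil.move(str(path), str(bucket / path.name))
    return created


# Small self-contained demo on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp)
    for i in range(12):
        (demo / f"img-{i:02d}.png").touch()
    made = split_into_subfolders(demo, per_folder=5)
    layout = {d.name: len(list(d.iterdir())) for d in sorted(demo.iterdir())}
    print(made, layout)
```

Grouping by date or by project would usually be more useful than plain numbering; the point is only to keep each folder's listing small.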