WebDAV Errors & Slow after upgrade 20 --> 24

Nextcloud version (eg, 20.0.5): 24.0.8
Operating system and version (eg, Ubuntu 20.04): Ubuntu 20.04
Apache or nginx version (eg, Apache 2.4.25): Apache 2.4.41-4ubuntu3.12
PHP version (eg, 7.4): 8.0.26

We upgraded from Nextcloud 20 to 24 over Thanksgiving weekend, in an attempt to address an annoyance with the PDF Viewer. After the upgrade, WebDAV is horrendously slow to execute existence checks on folders and slow to complete uploads. Android WebDAV clients receive 409 errors. The initial upgrade was to 24.0.7; a further upgrade to 24.0.8 did not correct either issue. The recommended post-upgrade steps were completed after each upgrade.

Are there any known regressions to WebDAV functionality from 20 to 24?

Is this the first time you’ve seen this error? (Y/N): Y

Most users update gradually, so it's hard to say.

One part of that would be the database indexes: new releases normally add indexes to improve performance. Newer versions may also rely more on caching mechanisms, so if you don't already use Redis, enabling it could be worth trying.

You didn't say whether you found errors in the log files. There could be apps that were not updated properly and are blocking things in the background (you can also try disabling individual apps to see if things improve). If that doesn't help, check I/O and CPU usage to see which process is consuming the resources, and whether the system is running at its limits (or the disk is full, slow, …).
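To get an overview of what the server is actually logging, the entries in `nextcloud.log` (one JSON object per line in the default format) can be grouped by app and exception class. A minimal sketch, assuming the default JSON log format; the log path is an assumption based on a typical data directory:

```python
import json
from collections import Counter

def summarize_log(lines):
    """Count Nextcloud log entries (JSON-per-line) by (app, exception class)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip wrapped or partial lines
        exc = entry.get("exception")
        exc_name = exc.get("Exception", "-") if isinstance(exc, dict) else "-"
        counts[(entry.get("app", "unknown"), exc_name)] += 1
    return counts

if __name__ == "__main__":
    # Path is an assumption -- adjust to your datadirectory.
    with open("/var/nextcloud/data/nextcloud.log") as fh:
        for (app, exc_name), n in summarize_log(fh).most_common(10):
            print(f"{n:6d}  {app}  {exc_name}")
```

A spike of one exception class after the upgrade (for example a Sabre `BadRequest`) points at the component to dig into first.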

Errors like this seem to be the most likely candidate:
{"reqId":"6SJZFJ3ZFxFPyHH1ykDs","level":3,"time":"2023-01-05T13:42:58+00:00","remoteAddr":"10.0.116.165","user":"JavaEERW","app":"no app in context","method":"PUT","url":"/remote.php/dav/files/javaeerw/Work%20Orders/310000/314000/314800/314889/Photos//20230105-0610065.jpg","message":"Expected filesize of 3419932 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 2601678 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.","userAgent":"okhttp/4.9.0","version":"24.0.8.2","exception":{"Exception":"Sabre\DAV\Exception\BadRequest","Message":"Expected filesize of 3419932 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 2601678 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.","Code":0,"Trace":[{"file":"/var/www/html/apps/dav/lib/Connector/Sabre/Directory.php","line":164,"function":"put","class":"OCA\DAV\Connector\Sabre\File","type":"->"},{"file":"/var/www/html/3rdparty/sabre/dav/lib/DAV/Server.php","line":1098,"function":"createFile","class":"OCA\DAV\Connector\Sabre\Directory","type":"->"},{"file":"/var/www/html/3rdparty/sabre/dav/lib/DAV/CorePlugin.php","line":504,"function":"createFile","class":"Sabre\DAV\Server","type":"->"},{"file":"/var/www/html/3rdparty/sabre/event/lib/WildcardEmitterTrait.php","line":89,"function":"httpPut","class":"Sabre\DAV\CorePlugin","type":"->"},{"file":"/var/www/html/3rdparty/sabre/dav/lib/DAV/Server.php","line":472,"function":"emit","class":"Sabre\DAV\Server","type":"->"},{"file":"/var/www/html/3rdparty/sabre/dav/lib/DAV/Server.php","line":253,"function":"invokeMethod","class":"Sabre\DAV\Server","type":"->"},{"file":"/var/www/html/3rdparty/sabre/dav/lib/DAV/Server.php","line":321,"function":"start","class":"Sabre\DAV\Server","type":"->"},{"file":"/var/www/html/apps/dav/lib/Server.php","line":358,"function":"exec","class":"Sabre\DAV\Server","type":"->"},{"file":"/var/www/html/apps/dav/appinfo/v2/remote.php","line":35,"function":"exec","class":"OCA\DAV\Server","type":"->"},{"file":"/var/www/html/remote.php","line":170,"args":["/var/www/html/apps/dav/appinfo/v2/remote.php"],"function":"require_once"}],"File":"/var/www/html/apps/dav/lib/Connector/Sabre/File.php","Line":297,"CustomMessage":"–"},"id":"63b6f7e832dc6"}
The problem is that this doesn't help the diagnosis, as it only shows that some part of the system is interrupting the data flow. It doesn't appear to be the network, as the clients are receiving "socket closed" and the above-mentioned 409 errors.
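To rule out the Android client itself, the failing PUT can be reproduced with a minimal WebDAV request whose Content-Length is guaranteed to match the body. A stdlib-only sketch; the URL, file name, and credentials are placeholders, not values from the original report:

```python
import base64
import urllib.request

def build_dav_put(url, data, user, password):
    """Build a WebDAV PUT request with Basic auth and an explicit Content-Length."""
    req = urllib.request.Request(url, data=data, method="PUT")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Length", str(len(data)))  # must match the bytes actually sent
    return req

if __name__ == "__main__":
    # Placeholders -- substitute your own host, path, and credentials.
    url = "http://nextcloud.performair.local/remote.php/dav/files/USER/test-upload.jpg"
    with open("test-upload.jpg", "rb") as fh:
        req = build_dav_put(url, fh.read(), "USER", "PASSWORD")
    with urllib.request.urlopen(req, timeout=120) as resp:
        print(resp.status)
```

If this minimal client sees the same "Expected filesize … but wrote …" error, the truncation is happening on the server's write path (PHP, Apache, or the object store) rather than in the Android client.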

Here’s my config file:

<?php
$CONFIG = array (
  'overwrite.cli.url' => 'http://nextcloud.performair.local',
  'objectstore' => array (
    'class' => '\\OC\\Files\\ObjectStore\\S3',
    'arguments' => array (
      'bucket' => 'nextcloud',
      'autocreate' => false,
      'key' => ,
      'secret' => ,
      'hostname' => ,
      'use_ssl' => false,
      'use_path_style' => true,
    ),
  ),
  'lost_password_link' => 'disabled',
  'memcache.local' => '\\OC\\Memcache\\Redis',
  'memcache.distributed' => '\\OC\\Memcache\\Redis',
  'filelocking.enabled' => true,
  'memcache.locking' => '\\OC\\Memcache\\Redis',
  'redis' => array (
    'host' => '/var/run/redis/redis-server.sock',
    'port' => 0,
  ),
  'mail_smtpmode' => 'smtp',
  'mail_sendmailmode' => 'smtp',
  'mail_domain' => 'performair.local',
  'mail_smtphost' => ,
  'mail_smtpport' => '25',
  'mail_from_address' => 'nextcloud',
  'instanceid' => ,
  'passwordsalt' => ,
  'secret' => ,
  'trusted_domains' => array (
    0 => 'nextcloud.performair.local',
    1 => 'nextcloud.performair.us',
    2 => '10.0.160.206',
  ),
  'datadirectory' => '/var/nextcloud/data',
  'dbtype' => 'mysql',
  'version' => '24.0.8.2',
  'dbname' => 'nextcloud',
  'dbhost' => ,
  'dbport' => '',
  'dbtableprefix' => 'oc_',
  'dbuser' => 'nextcloud',
  'dbpassword' => ,
  'installed' => true,
  'maintenance' => false,
  'ldapIgnoreNamingRules' => false,
  'ldapProviderFactory' => 'OCA\\User_LDAP\\LDAPProviderFactory',
  'has_rebuilt_cache' => true,
  'versions_retention_obligation' => 'disabled',
  'loglevel' => 3,
  'mysql.utf8mb4' => true,
  'encryption.legacy_format_support' => false,
  'encryption.key_storage_migrated' => false,
  'default_phone_region' => 'US',
);

We use Ceph, through a Ceph S3 gateway, for storage.
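Since the config points all three memcache roles (including file locking) at Redis over a unix socket, it's worth confirming that the web server user can actually reach that socket; if locking calls stall, every DAV request slows down. A small stdlib-only sketch using the socket path from the config above:

```python
import socket

def redis_ping(sock_path="/var/run/redis/redis-server.sock", timeout=5.0):
    """Send an inline PING over Redis' unix socket and check for +PONG."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        s.connect(sock_path)
        s.sendall(b"PING\r\n")  # inline command; Redis replies "+PONG\r\n"
        return s.recv(64).startswith(b"+PONG")

if __name__ == "__main__":
    print("redis reachable:", redis_ping())
```

Running it as the web server user (e.g. `sudo -u www-data python3 redis_check.py`) also verifies the socket's file permissions, which a root-only test would hide.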

I have confirmed that Chrome on the handhelds in use can access the Nextcloud website, and can successfully log in.
Increasing the timeout of the WebDAV client in use allows some operations to complete, and/or to fail with appropriate errors (previously the errors were SocketTimeout and SocketClosed).

The server isn't heavily loaded; normal load doesn't exceed 20% CPU.
Root partition use is under 10%.
Ceph cluster usage is under 25%.
Memory use is high (around 70%), but swap isn't being used.