Database gets stuck on long SELECT JOIN queries involving oc_filecache

Nextcloud version : 24.0.4.1
Operating system and version : Ubuntu 22.04.1
Apache or nginx version : 2.4.52
PHP version: 8.1
MariaDB version: 10.6.7

The issue you are facing:
Ever since I updated to Ubuntu 22.04 and PHP 8.1, I have been getting random locks on my database involving the oc_filecache table

There doesn’t seem to be a discernable pattern to when it occurs - Last one occured at 6am when I was asleep. I am the only user of this instance

If a long running query happens, the server will stop responding, the mariadb service won’t stop with this error, I have to kill -9 the mariadb process to restart it:

Warning] /usr/sbin/mariadbd: Thread 12025 (user : 'nextcloud') did not exit
Warning] /usr/sbin/mariadbd: Thread 12024 (user : 'nextcloud') did not exit
Warning] /usr/sbin/mariadbd: Thread 12023 (user : 'nextcloud') did not exit
Warning] /usr/sbin/mariadbd: Thread 12022 (user : 'nextcloud') did not exit
Warning] /usr/sbin/mariadbd: Thread 12021 (user : 'nextcloud') did not exit
Warning] /usr/sbin/mariadbd: Thread 12020 (user : 'nextcloud') did not exit
Warning] /usr/sbin/mariadbd: Thread 12018 (user : 'nextcloud') did not exit

So far the two queries that have caused this issue are below:

SELECT `user_id` FROM `oc_filecache` `f` INNER JOIN `oc_mounts` `m` ON `storage_id` = `storage` WHERE (`size` < 0) AND (`parent` > -1) LIMIT 1
SELECT `a`.`name` FROM `oc_filecache` `a` LEFT JOIN `oc_filecache` `b` ON `a`.`name` = `b`.`fileid`

Weirdly, if I run the queries manually after restarting they seem to execute in reasonable time - a few seconds

When the long running query is active, a ton of other new queries
SELECT * FROM oc_appconfig occur as well and get stuck as well even though they don’t query the oc_filecache table

I already ran occ files:cleanup and it did nothing

Is this the first time you’ve seen this error? (Y/N): N

Steps to replicate it:

  1. Leave nextcloud running for a while with no input
  2. Possibly have a large oc_filecache (2096214 rows currently)
  3. Notice that sync clients are showing offline and web UI is not responding

The output of your Nextcloud log in Admin > Logging Only a DNS error before the lock occurred and sql server unavailable during the restart attempt:

Error	cron	Doctrine\DBAL\Exception\ConnectionLost: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 2006 MySQL server has gone away		2022-09-13T08:27:56+0800
Error	core	Doctrine\DBAL\Exception\ConnectionLost: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 2006 MySQL server has gone away		2022-09-13T08:09:29+0800
Error	PHP	Error: dns_get_record(): A temporary server error occurred. at /var/www/nextcloud/lib/private/Http/Client/DnsPinMiddleware.php#83		2022-09-13T05:47:06+0800

The output of your config.php file in /path/to/nextcloud - Config I believe are relevant:

  'memcache.local' => '\\OC\\Memcache\\Redis',
  'memcache.locking' => '\\OC\\Memcache\\Redis',
  'filelocking.enabled' => 'true',
  'redis' =>
  array (
    'host' => '/run/redis/redis.sock',
    'port' => 0,
    'dbindex' => 0,
    'timeout' => 1.5,
  ),
  'dbtype' => 'mysql',
  'version' => '24.0.4.1',
  'dbname' => 'nextcloud',
  'dbhost' => 'localhost',
  'dbport' => '',
  'dbtableprefix' => 'oc_',
  'dbuser' => 'nextcloud',

The output of your Apache/nginx/system log in /var/log/____ Only errors from when I restart the database and nextcloud can’t connect:

[Tue Sep 13 08:37:28.958258 2022] [proxy_fcgi:error] [pid 46564:tid 140377172469312] [client 192.168.1.1:54914] AH01071: Got error 'PHP message: PHP Fatal error:  Uncaught Doctrine\\DBAL\\Exception: Failed to connect to the database: An exception occurred in the driver: SQLSTATE[HY000] [2002] No such file or directory in /var/www/nextcloud/lib/private/DB/Connection.php:139\nStack trace:\n#0 /var/www/nextcloud/3rdparty/doctrine/dbal/src/Connection.php(1519): OC\\DB\\Connection->connect()\n#1 /var/www/nextcloud/3rdparty/doctrine/dbal/src/Connection.php(1041): Doctrine\\DBAL\\Connection->getWrappedConnection()\n#2 /var/www/nextcloud/lib/private/DB/Connection.php(261): Doctrine\\DBAL\\Connection->executeQuery()\n#3 /var/www/nextcloud/3rdparty/doctrine/dbal/src/Query/QueryBuilder.php(345): OC\\DB\\Connection->executeQuery()\n#4 /var/www/nextcloud/lib/private/DB/QueryBuilder/QueryBuilder.php(281): Doctrine\\DBAL\\Query\\QueryBuilder->execute()\n#5 /var/www/nextcloud/lib/private/AppConfig.php(411): OC\\DB\\QueryBuilder\\QueryBuilder->execute()\n#6 /var/www/nextcloud/lib/private/AppConfig.php(176): OC\\AppConfig->loadConfigValues()\n#7 /var/www/nextcloud/lib/p...'
[Tue Sep 13 08:38:31.694546 2022] [proxy_fcgi:error] [pid 46563:tid 140377574262336] [client 192.168.1.1:54918] AH01071: Got error 'PHP message: PHP Fatal error:  Uncaught Doctrine\\DBAL\\Exception: Failed to connect to the database: An exception occurred in the driver: SQLSTATE[HY000] [2002] No such file or directory in /var/www/nextcloud/lib/private/DB/Connection.php:139\nStack trace:\n#0 /var/www/nextcloud/3rdparty/doctrine/dbal/src/Connection.php(1519): OC\\DB\\Connection->connect()\n#1 /var/www/nextcloud/3rdparty/doctrine/dbal/src/Connection.php(1041): Doctrine\\DBAL\\Connection->getWrappedConnection()\n#2 /var/www/nextcloud/lib/private/DB/Connection.php(261): Doctrine\\DBAL\\Connection->executeQuery()\n#3 /var/www/nextcloud/3rdparty/doctrine/dbal/src/Query/QueryBuilder.php(345): OC\\DB\\Connection->executeQuery()\n#4 /var/www/nextcloud/lib/private/DB/QueryBuilder/QueryBuilder.php(281): Doctrine\\DBAL\\Query\\QueryBuilder->execute()\n#5 /var/www/nextcloud/lib/private/AppConfig.php(411): OC\\DB\\QueryBuilder\\QueryBuilder->execute()\n#6 /var/www/nextcloud/lib/private/AppConfig.php(176): OC\\AppConfig->loadConfigValues()\n#7 /var/www/nextcloud/lib/p...'

Output errors in nextcloud.log in /var/www/ or as admin user in top right menu, filtering for errors. Use a pastebin service if necessary.

Same errors as nextcloud UI Log

It happens to me as well :frowning: The only workaround I’ve found is to restart the MariaDB container. If I kill the SQL process manually, it gets spawned again.
I’ve also tried to run oc files:cleanup just in case but it said 0 orphaned deleted.
Any clues?

I ended up adding some tune parameters to the MariaDB container as per Nextcloud tuning / Haefelfinger - Techblog

[mysqld]
innodb_buffer_pool_size = 1G
innodb_log_file_size    = 256M

read_rnd_buffer_size    = 4M
sort_buffer_size        = 4M

So far, didn’t see the mariadb process in the top output again.