I/O Problems after System Rebuild/Restore due to jbd2

Nextcloud version (eg, 20.0.5): 23.0.2
Operating system and version (eg, Ubuntu 20.04): Ubuntu 20.04
Apache or nginx version (eg, Apache 2.4.25): 2.4.52
PHP version (eg, 7.4): 7.4

The issue you are facing:
I had some physical problems with my boot drive, so I had to reinstall Ubuntu on a new drive.
Luckily my documentation was up to date, so I was able to reinstall quickly, and since I had backups of my MySQL database I was also able to restore the database without any data loss (e.g. calendar entries or tasks) via https://docs.nextcloud.com/server/latest/admin_manual/maintenance/restore.html. So success there.
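
For reference, the restore steps from that page boil down to roughly this (a sketch; $LOGIN/$PASSWORD/$DATABASE stand for the DB credentials and NextCloud_DB_Backup.bak is the dump file from my backup script):

sudo -u www-data php /var/www/nextcloud/occ maintenance:mode --on    # put the instance into maintenance mode
mysql -h localhost -u $LOGIN -p$PASSWORD $DATABASE < NextCloud_DB_Backup.bak    # import the dump
sudo -u www-data php /var/www/nextcloud/occ maintenance:data-fingerprint    # tell the clients that server data changed
sudo -u www-data php /var/www/nextcloud/occ maintenance:mode --off
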
But I noticed that the system responds much more slowly than before. I did switch from an SSD to an HDD, since one was available and performance was not a primary concern for my home installation, but it is so slow that it sometimes even times out, so clients (2 desktops and 3 Android clients) get disconnected.
So I tried to dig a bit, and I think this has something to do with database syncing:
iotop reports 99% usage by jbd2 for very long periods of time, which seems kind of odd:

Total DISK READ:         0.00 B/s | Total DISK WRITE:      1435.99 K/s
TID to ionice: 
    TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                              
    330 be/3 root        0.00 B/s    0.00 B/s  0.00 % 99.97 % [jbd2/sda2-8]
   1196 be/4 mysql       0.00 B/s    0.00 B/s  0.00 % 22.90 % mysqld [ib_io_wr-3]
   1204 be/4 mysql       0.00 B/s    0.00 B/s  0.00 % 14.94 % mysqld [ib_log_flush]
   1198 be/4 mysql       0.00 B/s  360.88 K/s  0.00 % 12.00 % mysqld [ib_pg_flush_co]
   1199 be/4 mysql       0.00 B/s  300.73 K/s  0.00 %  9.03 % mysqld [ib_pg_flush-1]
   1201 be/4 mysql       0.00 B/s  240.59 K/s  0.00 %  6.98 % mysqld [ib_pg_flush-3]
   1195 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  6.06 % mysqld [ib_io_wr-2]
   1206 be/4 mysql       0.00 B/s   67.66 K/s  0.00 %  5.48 % mysqld [ib_log_writer]
   3031 be/4 mysql       0.00 B/s  101.50 K/s  0.00 %  4.93 % mysqld [connection]
   3253 be/4 mysql       0.00 B/s   48.87 K/s  0.00 %  4.53 % mysqld [connection]
   1200 be/4 mysql       0.00 B/s  300.73 K/s  0.00 %  2.35 % mysqld [ib_pg_flush-2]
   3252 be/4 mysql       0.00 B/s   15.04 K/s  0.00 %  1.83 % mysqld [connection]

This only happens when apache2 is running, and the only site I am serving is Nextcloud; the Nextcloud database is also the only database I am running.
What throws me off a bit (besides the 99%) is that there is no DISK READ but so much DISK WRITE. If syncing is served from memory (I have set things up as advised in https://docs.nextcloud.com/server/latest/admin_manual/installation/server_tuning.html, so e.g. caching is activated), I do not get why there should be that much DISK WRITE when nothing actually changed on any client …
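
One way to double-check that the writes really come from mysqld (and not from Nextcloud writing into the data directory) would be the kernel's per-process I/O counters, e.g. (a sketch, needs root):

sudo cat /proc/$(pidof mysqld)/io    # write_bytes should keep growing if mysqld is the writer
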
Also, there is enough RAM (16 GB currently; I will pull 8 GB out soon since it is not needed), as also shown by top:

Tasks: 176 total,   1 running, 175 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4,7 us,  1,2 sy,  0,0 ni, 66,4 id, 26,8 wa,  0,0 hi,  0,8 si,  0,0 st
MiB Mem :  15898,6 total,  10547,0 free,   1222,0 used,   4129,6 buff/cache
MiB Swap:   4096,0 total,   4096,0 free,      0,0 used.  14304,0 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                        
   1162 mysql     20   0 3874700 716056  38960 S  10,6   4,4  21:50.36 mysqld                                         
   3542 root      20   0   20236  14564   7324 S   6,3   0,1  11:34.31 iotop                                          
   2029 www-data  20   0  366900  80452  58840 S   1,7   0,5   0:51.32 php-fpm7.4                                     
   4709 www-data  20   0  364724  78088  58700 S   1,3   0,5   0:19.20 php-fpm7.4                                     
    890 redis     20   0   52748   5664   3624 S   0,7   0,0   0:52.65 redis-server                                   
   1045 www-data  20   0 2724104  26752   9408 S   0,7   0,2   0:50.33 apache2                                        
   2031 www-data  20   0  370800  87776  62340 S   0,7   0,5   0:50.14 php-fpm7.4                                     
   3037 www-data  20   0  360672  75800  59592 S   0,7   0,5   0:43.71 php-fpm7.4                                     
   4732 www-data  20   0  366456  78860  57656 S   0,7   0,5   0:17.21 php-fpm7.4                                     
    330 root      20   0       0      0      0 D   0,3   0,0   1:03.15 jbd2/sda2-8                                    
   1736 media     20   0    9248   3884   3236 R   0,3   0,0   0:53.59 top                                            
   2033 www-data  20   0  364740  78208  59752 S   0,3   0,5   0:52.25 php-fpm7.4                                     
   3034 www-data  20   0  381204  96068  60264 S   0,3   0,6   0:42.51 php-fpm7.4                                     
   3041 www-data  20   0  343488  63576  50592 S   0,3   0,4   0:42.20 php-fpm7.4                                     
   3535 www-data  20   0  366604  80004  58824 S   0,3   0,5   0:38.30 php-fpm7.4                                     
   4733 www-data  20   0  366744  78624  57212 S   0,3   0,5   0:18.91 php-fpm7.4                                     
   4734 www-data  20   0  348592  70012  53148 S   0,3   0,4   0:18.17 php-fpm7.4     

I googled a bit and found this answer on serverfault.com: "High server load - [jbd2/md1-8] using 99.99% IO".
So I enabled debug tracing and got the following output:

cat /sys/kernel/debug/tracing/trace_pipe
  ib_pg_flush_co-1179    [003] .... 76909.482568: ext4_sync_file_enter: dev 8,2 ino 17967725 parent 17967302 datasync 0 
     jbd2/sda2-8-330     [001] .... 76909.510310: jbd2_commit_flushing: dev 8,2 transaction 6838369 sync 0
      connection-107736  [003] .... 76909.510445: ext4_sync_file_enter: dev 8,2 ino 17964866 parent 17967302 datasync 1 
    ib_log_flush-1184    [003] .... 76909.512979: ext4_sync_file_enter: dev 8,2 ino 17967723 parent 17967302 datasync 0 
     jbd2/sda2-8-330     [001] .... 76909.581287: jbd2_commit_flushing: dev 8,2 transaction 6838370 sync 0
      ib_io_wr-3-1177    [003] .... 76909.581615: ext4_sync_file_enter: dev 8,2 ino 17967730 parent 17967302 datasync 0 
     jbd2/sda2-8-330     [002] .... 76909.622985: jbd2_commit_flushing: dev 8,2 transaction 6838371 sync 0
    ib_log_flush-1184    [001] .... 76909.623073: ext4_sync_file_enter: dev 8,2 ino 17967723 parent 17967302 datasync 0 
      connection-108343  [000] .... 76909.623334: ext4_sync_file_enter: dev 8,2 ino 17964866 parent 17967302 datasync 1 
    ib_log_flush-1184    [001] .... 76909.667207: ext4_sync_file_enter: dev 8,2 ino 17967723 parent 17967302 datasync 0 
     jbd2/sda2-8-330     [002] .... 76909.667207: jbd2_commit_flushing: dev 8,2 transaction 6838372 sync 0
      connection-107736  [000] .... 76909.708276: ext4_sync_file_enter: dev 8,2 ino 17964866 parent 17967302 datasync 1 
    ib_log_flush-1184    [000] .... 76909.708397: ext4_sync_file_enter: dev 8,2 ino 17967723 parent 17967302 datasync 0 
     jbd2/sda2-8-330     [002] .... 76909.708574: jbd2_commit_flushing: dev 8,2 transaction 6838373 sync 0

I am not sure what to do now. To be honest, I do not even fully understand what I did with the debug tracing there.
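
For reference, the trace events that appear in the output above can be enabled like this (a sketch; run as root, and it assumes debugfs is mounted at the default /sys/kernel/debug):

echo 1 > /sys/kernel/debug/tracing/events/jbd2/jbd2_commit_flushing/enable
echo 1 > /sys/kernel/debug/tracing/events/ext4/ext4_sync_file_enter/enable
cat /sys/kernel/debug/tracing/trace_pipe    # watch which processes trigger the journal commits
echo 0 > /sys/kernel/debug/tracing/events/jbd2/jbd2_commit_flushing/enable    # switch tracing off again afterwards
echo 0 > /sys/kernel/debug/tracing/events/ext4/ext4_sync_file_enter/enable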

I thought about building a new Nextcloud DB, adding the users manually, restoring calendars and contacts from backups using the web GUI, and then rescanning the data directory for the users to restore the "real" data, and finally resharing the folders/files and so on. Would that help? Any other suggestions?
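
The rescan part would presumably be done with occ, something like this (a sketch, assuming the default install path and web-server user):

sudo -u www-data php /var/www/nextcloud/occ files:scan --all    # rebuild the file cache for all users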

Is this the first time you’ve seen this error? (Y/N): Y

Steps to replicate it:

  1. mess up your boot drive
  2. reinstall Nextcloud on a new drive
  3. recover the database as shown in https://docs.nextcloud.com/server/latest/admin_manual/maintenance/restore.html
  4. get high I/O from jbd2

The output of your Nextcloud log in Admin > Logging:

Fehler	PHP	Error: disk_free_space(): No such file or directory at /var/www/nextcloud/lib/private/Files/Storage/Local.php#381	
2022-03-10T09:17:19+0100
Fehler	PHP	Error: disk_free_space(): No such file or directory at /var/www/nextcloud/lib/private/Files/Storage/Local.php#381	
2022-03-10T09:17:19+0100
Fehler	PHP	Error: disk_free_space(): No such file or directory at /var/www/nextcloud/lib/private/Files/Storage/Local.php#381	
2022-03-10T09:16:22+0100
Fehler	PHP	Error: dns_get_record(): A temporary server error occurred. at /var/www/nextcloud/lib/private/Http/Client/DnsPinMiddleware.php#83	
2022-03-10T06:19:26+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 3696023 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T01:19:22+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 2513569 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T01:19:22+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 3696023 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T01:18:23+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 2513569 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T01:18:23+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 344796 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:49:22+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 175772 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:49:21+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 344796 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:48:49+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 175772 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:48:48+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 236349826 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 262144 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:22:45+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 334195014 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 311296 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:22:30+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 181701407 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 352256 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:22:30+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 334195014 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:22:21+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 181701407 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:22:20+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 334195014 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:21:42+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 181701407 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:21:42+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 181701407 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:21:41+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 181701407 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:21:07+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 50655641 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 344064 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:17:12+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 50655641 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 2613248 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:15:32+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 50655641 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:15:25+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 50655641 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 376832 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:13:57+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 50655641 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 49152 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:13:23+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 319507884 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 344064 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:09:58+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 319507884 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 466944 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:08:56+0100
Fehler	no app in context	Sabre\DAV\Exception\BadRequest: Expected filesize of 319507884 bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 16580608 bytes. Could either be a network problem on the sending side or a problem writing to the storage on the server side.	
2022-03-10T00:08:55+0100

(note: probably caused by my testing; after a reboot and waiting 4 hours, no new errors were logged while the problem still persisted)

The output of your config.php file in /path/to/nextcloud (make sure you remove any identifiable information!):

cat /var/www/nextcloud/config/config.php 
<?php
$CONFIG = array (
  'instanceid' => '***',
  'passwordsalt' => '***',
  'secret' => '***',
  'trusted_domains' => 
  array (
    0 => '***.ddns.net',
  ),
  'datadirectory' => '/***/nextcloud',
  'dbtype' => 'mysql',
  'version' => '23.0.2.1',
  'overwrite.cli.url' => 'https://***.ddns.net',
  'htaccess.RewriteBase' => '/',
  'overwritehost' => '***.ddns.net',
  'default_language' => 'de',
  'default_phone_region' => 'DE',
  'memcache.local' => '\\OC\\Memcache\\APCu',
  'filelocking.enabled' => 'true',
  'memcache.distributed' => '\\OC\\Memcache\\Redis',
  'memcache.locking' => '\\OC\\Memcache\\Redis',
  'redis' => 
  array (
    'host' => 'localhost',
    'port' => 6379,
    'timeout' => 0,
    'dbindex' => 0,
  ),
  'loglevel' => '2',
  'logfile' => '/var/log/nextcloud.log',
  'dbname' => '***',
  'dbhost' => 'localhost',
  'dbport' => '',
  'dbtableprefix' => 'oc_',
  'mysql.utf8mb4' => true,
  'dbuser' => '***',
  'dbpassword' => '***',
  'installed' => true,
  'maintenance' => false,
);

The output of your Apache/nginx/system log in /var/log/____:

cat /var/log/apache2/error.log
[Thu Mar 10 09:54:17.082584 2022] [proxy_fcgi:error] [pid 105342:tid 139834046785280] [client 192.168.2.13:55398] AH01079: failed to make connection to backend: httpd-UDS
[Thu Mar 10 09:54:17.518283 2022] [mpm_event:notice] [pid 105331:tid 139834256768320] AH00492: caught SIGWINCH, shutting down gracefully
[Thu Mar 10 09:57:09.684708 2022] [mpm_event:notice] [pid 105608:tid 139820634340672] AH00489: Apache/2.4.52 (Ubuntu) OpenSSL/1.1.1f configured -- resuming normal operations
[Thu Mar 10 09:57:09.684738 2022] [core:notice] [pid 105608:tid 139820634340672] AH00094: Command line: '/usr/sbin/apache2'
[Thu Mar 10 10:10:28.623686 2022] [proxy_fcgi:error] [pid 105610:tid 139820290074368] [client 192.168.2.6:54046] AH01067: Failed to read FastCGI header
[Thu Mar 10 10:10:28.623723 2022] [proxy_fcgi:error] [pid 105610:tid 139820290074368] (104)Connection reset by peer: [client 192.168.2.6:54046] AH01075: Error dispatching request to : 
[Thu Mar 10 10:10:28.625634 2022] [proxy_fcgi:error] [pid 105609:tid 139820357216000] [client 192.168.2.28:57736] AH01067: Failed to read FastCGI header
[Thu Mar 10 10:10:28.625663 2022] [proxy_fcgi:error] [pid 105609:tid 139820357216000] (104)Connection reset by peer: [client 192.168.2.28:57736] AH01075: Error dispatching request to : 
[Thu Mar 10 10:10:28.625785 2022] [proxy_fcgi:error] [pid 105610:tid 139820009502464] [client 192.168.2.12:48838] AH01067: Failed to read FastCGI header
[Thu Mar 10 10:10:28.625794 2022] [proxy_fcgi:error] [pid 105610:tid 139820009502464] (104)Connection reset by peer: [client 192.168.2.12:48838] AH01075: Error dispatching request to : 
[Thu Mar 10 10:10:28.627587 2022] [proxy_fcgi:error] [pid 105609:tid 139820298467072] [client 192.168.2.28:57668] AH01067: Failed to read FastCGI header
[Thu Mar 10 10:10:28.627619 2022] [proxy_fcgi:error] [pid 105609:tid 139820298467072] (104)Connection reset by peer: [client 192.168.2.28:57668] AH01075: Error dispatching request to : 
[Thu Mar 10 10:10:28.645197 2022] [proxy:error] [pid 105609:tid 139820298467072] (2)No such file or directory: AH02454: FCGI: attempt to connect to Unix domain socket /run/php/php7.4-fpm.sock (*) failed
[Thu Mar 10 10:10:28.645216 2022] [proxy_fcgi:error] [pid 105609:tid 139820298467072] [client 192.168.2.28:57668] AH01079: failed to make connection to backend: httpd-UDS
[Thu Mar 10 10:10:28.650899 2022] [proxy:error] [pid 105609:tid 139820298467072] (2)No such file or directory: AH02454: FCGI: attempt to connect to Unix domain socket /run/php/php7.4-fpm.sock (*) failed
[Thu Mar 10 10:10:28.650913 2022] [proxy_fcgi:error] [pid 105609:tid 139820298467072] [client 192.168.2.28:57736] AH01079: failed to make connection to backend: httpd-UDS
[Thu Mar 10 10:10:28.839540 2022] [mpm_event:notice] [pid 105608:tid 139820634340672] AH00492: caught SIGWINCH, shutting down gracefully
[Thu Mar 10 10:10:32.198739 2022] [core:warn] [pid 105608:tid 139820634340672] AH00045: child process 105610 still did not exit, sending a SIGTERM
[Thu Mar 10 10:10:34.200980 2022] [core:warn] [pid 105608:tid 139820634340672] AH00045: child process 105610 still did not exit, sending a SIGTERM
[Thu Mar 10 10:10:36.202750 2022] [core:warn] [pid 105608:tid 139820634340672] AH00045: child process 105610 still did not exit, sending a SIGTERM
[Thu Mar 10 10:10:38.204635 2022] [core:error] [pid 105608:tid 139820634340672] AH00046: child process 105610 still did not exit, sending a SIGKILL
[Thu Mar 10 10:10:39.261101 2022] [mpm_event:notice] [pid 107584:tid 139872587828544] AH00489: Apache/2.4.52 (Ubuntu) OpenSSL/1.1.1f configured -- resuming normal operations
[Thu Mar 10 10:10:39.261130 2022] [core:notice] [pid 107584:tid 139872587828544] AH00094: Command line: '/usr/sbin/apache2'
[Thu Mar 10 10:21:34.086565 2022] [proxy_fcgi:error] [pid 107586:tid 139872327489280] [client 192.168.2.6:54070] AH01067: Failed to read FastCGI header
[Thu Mar 10 10:21:34.086605 2022] [proxy_fcgi:error] [pid 107586:tid 139872327489280] (104)Connection reset by peer: [client 192.168.2.6:54070] AH01075: Error dispatching request to : 
[Thu Mar 10 10:21:34.086700 2022] [proxy_fcgi:error] [pid 107585:tid 139872310703872] [client 192.168.2.28:57788] AH01067: Failed to read FastCGI header
[Thu Mar 10 10:21:34.086712 2022] [proxy_fcgi:error] [pid 107585:tid 139872310703872] (104)Connection reset by peer: [client 192.168.2.28:57788] AH01075: Error dispatching request to : 
[Thu Mar 10 10:21:34.088358 2022] [proxy_fcgi:error] [pid 107585:tid 139872010446592] [client 84.138.167.39:34986] AH01067: Failed to read FastCGI header
[Thu Mar 10 10:21:34.088369 2022] [proxy_fcgi:error] [pid 107585:tid 139872010446592] (104)Connection reset by peer: [client 84.138.167.39:34986] AH01075: Error dispatching request to : 
[Thu Mar 10 10:21:34.106126 2022] [proxy:error] [pid 107585:tid 139872310703872] (2)No such file or directory: AH02454: FCGI: attempt to connect to Unix domain socket /run/php/php7.4-fpm.sock (*) failed
[Thu Mar 10 10:21:34.106146 2022] [proxy_fcgi:error] [pid 107585:tid 139872310703872] [client 192.168.2.28:57788] AH01079: failed to make connection to backend: httpd-UDS
[Thu Mar 10 10:21:34.370000 2022] [proxy:error] [pid 107586:tid 139872327489280] (2)No such file or directory: AH02454: FCGI: attempt to connect to Unix domain socket /run/php/php7.4-fpm.sock (*) failed
[Thu Mar 10 10:21:34.370014 2022] [proxy_fcgi:error] [pid 107586:tid 139872327489280] [client 192.168.2.6:54070] AH01079: failed to make connection to backend: httpd-UDS
[Thu Mar 10 10:21:34.591448 2022] [proxy:error] [pid 107586:tid 139872327489280] (2)No such file or directory: AH02454: FCGI: attempt to connect to Unix domain socket /run/php/php7.4-fpm.sock (*) failed
[Thu Mar 10 10:21:34.591462 2022] [proxy_fcgi:error] [pid 107586:tid 139872327489280] [client 192.168.2.13:56472] AH01079: failed to make connection to backend: httpd-UDS
[Thu Mar 10 10:21:34.592198 2022] [proxy:error] [pid 107586:tid 139872327489280] (2)No such file or directory: AH02454: FCGI: attempt to connect to Unix domain socket /run/php/php7.4-fpm.sock (*) failed
[Thu Mar 10 10:21:34.592207 2022] [proxy_fcgi:error] [pid 107586:tid 139872327489280] [client 192.168.2.13:56472] AH01079: failed to make connection to backend: httpd-UDS
[Thu Mar 10 10:21:34.707190 2022] [mpm_event:notice] [pid 107584:tid 139872587828544] AH00492: caught SIGWINCH, shutting down gracefully
[Thu Mar 10 10:22:03.072024 2022] [mpm_event:notice] [pid 109091:tid 140609008110912] AH00489: Apache/2.4.52 (Ubuntu) OpenSSL/1.1.1f configured -- resuming normal operations
[Thu Mar 10 10:22:03.072055 2022] [core:notice] [pid 109091:tid 140609008110912] AH00094: Command line: '/usr/sbin/apache2'
[Thu Mar 10 10:40:22.893456 2022] [mpm_event:notice] [pid 109091:tid 140609008110912] AH00492: caught SIGWINCH, shutting down gracefully
[Thu Mar 10 10:40:23.304369 2022] [mpm_event:notice] [pid 110342:tid 139825484291392] AH00489: Apache/2.4.52 (Ubuntu) OpenSSL/1.1.1f configured -- resuming normal operations
[Thu Mar 10 10:40:23.304397 2022] [core:notice] [pid 110342:tid 139825484291392] AH00094: Command line: '/usr/sbin/apache2'
[Thu Mar 10 10:48:25.643642 2022] [mpm_event:notice] [pid 110342:tid 139825484291392] AH00492: caught SIGWINCH, shutting down gracefully
[Thu Mar 10 10:48:47.613917 2022] [mpm_event:notice] [pid 111434:tid 140506363239744] AH00489: Apache/2.4.52 (Ubuntu) OpenSSL/1.1.1f configured -- resuming normal operations
[Thu Mar 10 10:48:47.613949 2022] [core:notice] [pid 111434:tid 140506363239744] AH00094: Command line: '/usr/sbin/apache2'
[Thu Mar 10 10:54:32.855681 2022] [proxy_fcgi:error] [pid 111435:tid 140506010580736] [client 192.168.2.13:58412] AH01067: Failed to read FastCGI header
[Thu Mar 10 10:54:32.855713 2022] [proxy_fcgi:error] [pid 111435:tid 140506010580736] (104)Connection reset by peer: [client 192.168.2.13:58412] AH01075: Error dispatching request to : 
[Thu Mar 10 10:54:32.869266 2022] [proxy_fcgi:error] [pid 111436:tid 140506010580736] [client 192.168.2.28:57962] AH01067: Failed to read FastCGI header
[Thu Mar 10 10:54:32.869301 2022] [proxy_fcgi:error] [pid 111436:tid 140506010580736] (104)Connection reset by peer: [client 192.168.2.28:57962] AH01075: Error dispatching request to : 
[Thu Mar 10 10:54:32.872918 2022] [mpm_event:notice] [pid 111434:tid 140506363239744] AH00492: caught SIGWINCH, shutting down gracefully
[Thu Mar 10 10:56:02.824103 2022] [mpm_event:notice] [pid 1044:tid 139793746419008] AH00489: Apache/2.4.52 (Ubuntu) OpenSSL/1.1.1f configured -- resuming normal operations
[Thu Mar 10 10:56:02.908093 2022] [core:notice] [pid 1044:tid 139793746419008] AH00094: Command line: '/usr/sbin/apache2'
[Thu Mar 10 12:08:06.443319 2022] [access_compat:error] [pid 1045:tid 139792989808384] [client 79.192.66.239:59283] AH01797: client denied by server configuration: /var/www/nextcloud/config

(note: probably caused by my testing; after a reboot and waiting 4 hours, no new errors (ignoring the one client deny from the internet) were logged while the problem still persisted)

jbd2 is a kernel process that takes care of the filesystem journaling: https://en.wikipedia.org/wiki/Journaling_block_device.

So basically your IO comes from the MySQL database.

Since you switched from an SSD to an HDD, which has terrible random-I/O performance, this seems quite plausible.

I don't think restoring a backup will change this, but you might want to try changing a few MySQL parameters, like innodb_flush_log_at_trx_commit or the doublewrite buffer.

This reduces data safety in case of a power outage, but it also reduces the amount of data written to the disk.
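
To see the current values before changing anything, something along these lines should work:

sudo mysql -u root -p
SHOW VARIABLES LIKE 'innodb_flush_log%';
SHOW VARIABLES LIKE 'innodb_doublewrite%';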

@rtznprmpftl thanks for the reply.
I have the feeling that this could have something to do with calendar/task syncing, since those actions take a lot of time while file sync works mostly as expected. So maybe there is some "fight" over which data is correct after the DB recovery, while the mobile clients updated calendar events etc.
Any idea how to dig deeper there?

Also, I am willing to do some MySQL optimizations in exchange for I/O performance, like you suggested. I had one power outage in the last year, and I also do hourly backups of the CalDAV data, so even then not much would be lost (I have daily backups of the normally synced files as well, which is good enough for my home setup).
How would I go about it? I am a bit uncomfortable fiddling around without really knowing what I am doing there :slight_smile:

Connect to the DB and set innodb_flush_log_at_trx_commit to 0, innodb_flush_log_at_timeout to 5 and innodb_doublewrite to 0?
In that case, from my understanding, I COULD lose the data from the last 5 seconds if power goes out or mysqld is killed, but not if the system is rebooted normally, right?
Can you tell me the syntax as well?
I suspect something like this:

sudo mysql -u root -p
SET GLOBAL innodb_flush_log_at_trx_commit = 0;
SET GLOBAL innodb_flush_log_at_timeout = 5;
SET GLOBAL innodb_doublewrite = 0;

I don't use the calendar sync, so I don't really know what can be done there.

Some InnoDB options are non-dynamic, so you have to set them in the MySQL config file; see e.g. "15.14 InnoDB Startup Options and System Variables" in the MySQL 8.0 Reference Manual.
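
On Ubuntu the server settings usually live in /etc/mysql/mysql.conf.d/mysqld.cnf (adjust to your setup); a minimal sketch of the relevant [mysqld] entries, followed by a restart of the mysql service (sudo systemctl restart mysql):

[mysqld]
# flush the redo log to disk roughly once per innodb_flush_log_at_timeout seconds
# instead of at every commit (less durable, far fewer fsyncs)
innodb_flush_log_at_trx_commit = 0
innodb_flush_log_at_timeout = 5
# turning the doublewrite buffer off is not possible at runtime, only via the config file
innodb_doublewrite = 0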

Also, create daily sql dumps of the database to be sure!

I also found that manual page, but I do not know how to actually set these variables (in which config file, or via an SQL statement?) … that is why I was asking for the syntax, or rather (now that I re-read it) for the commands to set these variables.
Also, is my assumption correct about what those settings do when I set them as shown above?

I actually already do daily dumps:

mysqldump --single-transaction -h localhost -u $LOGIN -p$PASSWORD $DATABASE > NextCloud_DB_Backup.bak

The system was also restored from those dumps, so that part definitely works :wink:

I found something that might have caused the issue, based on this thread here:

So I connected to the DB, and a delete from oc_authtoken; removed 400 entries.
After that, syncing was back to normal, but as I found out, syncing (e.g. calendar/contacts via OpenSync, but also the Nextcloud Linux desktop client) creates several new auth tokens per sync, so I was back at 60+ entries very soon. It also already felt slower again, but that might be imagination. The count went up as high as around 200, then back down to 90, so it does not seem to be getting out of hand again (for now).
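
For reference, a quick way to see which clients those tokens belong to:

select name, count(*) from oc_authtoken group by name order by count(*) desc;
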
I am a bit confused, because the migration was server-side and the clients did not change, so this issue should have shown up on the old system as well …
Any ideas?

OK, some time has passed and the issue still persists; I have to manually delete entries from oc_authtoken from time to time to be able to log in.

select id, uid, login_name, name, type, remember, FROM_UNIXTIME(last_activity), FROM_UNIXTIME(last_check) from oc_authtoken;

revealed that most of the 980 entries in oc_authtoken were from calendar or contact syncing via CalDAV, so I removed those:

delete from oc_authtoken where name LIKE '%nextcloudcmd%';
delete from oc_authtoken where name LIKE '%OpenSync%';
delete from oc_authtoken where name LIKE '%iOS%';

This cut the number of auth tokens down to 16, and login worked again.
But this should not be the normal behaviour, and I consider it a dirty hack rather than a solution.

But I just found "Hundreds of odd oc_authtoken entries for a user slow down performance · Issue #27603 · nextcloud/server · GitHub", which seems to tackle the problem by using dedicated app passwords for syncing with the different apps, so I will try that out :slight_smile:

OK, using dedicated app passwords for syncing with the different apps seems to do the trick, so I will mark this as the solution.