After hours of testing, I have found that the Nextcloud desktop sync client for Ubuntu 20.04 (both the AppImage and the PPA build) seems to have a bug: when a common Nextcloud file sync error occurs, kswapd0 spikes to 100% CPU and the swap file on my Debian 10.5 server fills up completely. (clamscan also spikes to between 45% and 100% while kswapd0 climbs to 100% CPU.) My other sync clients (mobile, Ubuntu's native "Online Accounts") do not cause this problem.
top command output
top - 16:08:59 up 22 min, 2 users, load average: 89.42, 84.04, 55.66
Tasks: 378 total, 12 running, 359 sleeping, 0 stopped, 7 zombie
%Cpu(s): 3.4 us, 57.0 sy, 0.0 ni, 0.1 id, 39.5 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 3946.8 total, 90.2 free, 3766.4 used, 90.1 buff/cache
MiB Swap: 6144.0 total, 0.0 free, 6144.0 used. 4.9 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
36 root 30 10 0 0 0 R 98.3 0.0 12:43.68 kswapd0
1691 mysql 20 0 1739540 2376 0 S 3.9 0.1 0:34.59 mysqld
1300 root 10 -10 116752 3400 0 D 3.3 0.1 0:41.96 AliYunDun
1544 root 20 0 806108 640 0 D 2.4 0.0 0:09.45 aliyun-service
161 root 20 0 4556 1904 1844 S 0.9 0.0 0:10.60 plymouthd
2746 git 20 0 1374728 6020 0 S 0.7 0.1 0:07.23 gitea
1114 root 20 0 24312 284 0 S 0.5 0.0 0:03.74 AliYunDunUpdate
5805 web2 20 0 292472 215456 920 D 0.4 5.3 0:05.43 clamscan
155 root 0 -20 0 0 0 I 0.3 0.0 0:07.11 kworker/0:1H-kbl+
232 root 20 0 70888 284 88 D 0.3 0.0 0:03.74 systemd-journal
936 memcache 20 0 408168 0 0 S 0.3 0.0 0:02.19 memcached
3492 root 20 0 11380 756 556 R 0.3 0.0 0:03.28 top
1 root 20 0 170192 2972 0 D 0.3 0.1 0:11.03 systemd
1041 redis 20 0 54244 428 0 D 0.3 0.0 0:03.28 redis-server
4029 www-data 20 0 339376 2436 16 D 0.3 0.1 0:00.85 /usr/sbin/apach
I have tried using nice and cpulimit to prevent kswapd0 from reaching 100% and completely consuming the swap space… but kswapd0 just powers through both commands, whether they are run individually or together, and consumes 100% of CPU and swap. That leaves me no choice but to reboot the server in order to clear the swap and regain use of the other services.
I have already reduced swappiness to zero, and I have also tried:
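Since kswapd0 is a kernel thread that reclaims memory on behalf of other processes (note its VIRT and RES of 0 in the top output), it may be more informative to look at which processes actually hold the swap. The following is a minimal sketch, assuming a Linux /proc layout with a VmSwap field in each status file; the function name swap_by_process and its optional directory argument are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# List processes by swap usage (kB), largest first.
# The optional first argument overrides the /proc root; it is a
# hypothetical hook so the function can be tried on sample data.
swap_by_process() {
    proc_root="${1:-/proc}"
    for status in "$proc_root"/[0-9]*/status; do
        # Skip entries that vanished or never matched the glob.
        [ -r "$status" ] || continue
        awk '
            /^Name:/   { name = $2 }
            /^Pid:/    { pid = $2 }
            /^VmSwap:/ { swap = $2 }
            # Kernel threads such as kswapd0 have no VmSwap line.
            END { if (swap > 0) printf "%d\t%s\t%s\n", swap, pid, name }
        ' "$status" 2>/dev/null
    done | sort -rn
}

# Show the ten largest swap users: swap in kB, PID, process name.
swap_by_process | head -n 10
```

Run on the server while kswapd0 is climbing, this should show whether clamscan, mysqld, or something else is the process being pushed into swap.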
To free pagecache:
echo 1 > /proc/sys/vm/drop_caches
To free reclaimable slab objects (includes dentries and inodes):
echo 2 > /proc/sys/vm/drop_caches
To free slab objects and pagecache:
echo 3 > /proc/sys/vm/drop_caches
As I expect Nextcloud file sync errors to be a common occurrence in the future, could someone suggest how I can mitigate or prevent a simple file sync error from taking down my entire server?
Steps to reproduce:
1. Install the desktop client for Ubuntu & Win10 (dual-boot machine).
2. Sync calendars in Thunderbird on Linux and on an Android mobile; everything should work fine.
3. QOwnNotes generates a common sync error when using git tracking…
4. A few hours later, log in to Ubuntu (which launches the NC sync client at boot), ssh to the server, and check the #top output.
Things that I have tried to troubleshoot the problem:
1.) If I use #killall -9 kswapd0 and stop the Apache server and mysqld, I can get enough speed that the terminal is usable. But note that this is temporary: kswapd0 spikes back to 100% 10 to 18 minutes later (or sooner).
2.) I tried clearing the swap by turning swap off and on again. NOTE: #swapon -a && swapoff -a will NOT work and only results in the error "swapoff: /swapfile: swapoff failed: Cannot allocate memory". I ultimately had to use these instructions, as nothing else worked… (essentially: create a second swap, let it absorb whatever is held in the original swap, then swapoff/swapon the original swap, and finally remove the second swap).
3.) I edited the /etc/sysctl file and set swappiness to 0.
4.) I disabled / removed all sync devices from NC admin/security and re-enabled them one by one… Thunderbird and the mobile sync client seem to have no problem, but when I re-enabled the Linux desktop sync client (AppImage or PPA)… BANG, the problem came back.
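The second-swap workaround mentioned in the list above can be sketched roughly as follows. This is not the linked instructions verbatim, only an outline of the same sequence. The temporary path /swapfile2, the 6144 MiB size (chosen to match the swap total in the top output), and the DRY_RUN switch are assumptions added for illustration. By default the script only prints the commands; running them for real requires root.

```shell
#!/bin/sh
# Sketch: empty a full swap file by temporarily adding a second one.
# Assumptions: /swapfile2, the 6144 MiB size and DRY_RUN are illustrative.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

rotate_swap() {
    orig="${1:-/swapfile}"
    temp="${2:-/swapfile2}"
    size_mib="${3:-6144}"

    # Step 1: create and enable a temporary swap file.
    run dd if=/dev/zero of="$temp" bs=1M count="$size_mib" || return 1
    run chmod 600 "$temp" || return 1
    run mkswap "$temp" || return 1
    run swapon "$temp" || return 1

    # Step 2: cycle the original swap; its pages can now move into
    # RAM or the temporary swap instead of failing to allocate.
    run swapoff "$orig" || return 1
    run swapon "$orig" || return 1

    # Step 3: disable and remove the temporary swap file.
    run swapoff "$temp" || return 1
    run rm -f "$temp" || return 1
}

rotate_swap
```

Each step stops the sequence on failure so that the temporary file is never deleted while it is still active as swap.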
Does anyone have any tips on how to resolve this problem?
UPDATE
After some additional testing and reading… it seems that ClamAV is running clamscan on every upload and email, which is spiking CPU usage to 100%. The relation to Nextcloud is that I have antivirus for files activated. My file sync uploads therefore start clamscan as well and overload the server.
The solution seems to be to stop using clamscan and implement clamav-daemon instead. I am researching this now, but if someone can tell me how to switch from clamscan to clamav-daemon, I would appreciate it.
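A rough sketch of what the switch might look like on Debian is shown below. Several things here are assumptions rather than facts from this thread: the package and service name clamav-daemon, the config file /etc/clamav/clamd.conf with its LocalSocket setting, the fallback socket /var/run/clamav/clamd.ctl, the Nextcloud path /var/www/nextcloud, and the files_antivirus keys av_mode and av_socket. The helper names are hypothetical. The script only prints the commands so they can be checked against the installed versions first.

```shell
#!/bin/sh
# Sketch: print the steps for moving Nextcloud's antivirus app from
# clamscan (a new process per file) to the long-running clamd daemon.
# All paths, package names and occ keys below are assumptions.

# Read the LocalSocket path from a clamd.conf-style file, falling back
# to the assumed Debian default when the file is missing or has none.
clamd_socket() {
    conf="${1:-/etc/clamav/clamd.conf}"
    sock=""
    if [ -r "$conf" ]; then
        sock="$(awk '$1 == "LocalSocket" { print $2; exit }' "$conf")"
    fi
    printf '%s\n' "${sock:-/var/run/clamav/clamd.ctl}"
}

# Print (do not run) the install and configuration commands.
print_switch_steps() {
    nc_dir="${2:-/var/www/nextcloud}"
    sock="$(clamd_socket "$1")"
    cat <<EOF
apt install clamav-daemon
systemctl enable --now clamav-daemon
sudo -u www-data php $nc_dir/occ config:app:set files_antivirus av_mode --value=socket
sudo -u www-data php $nc_dir/occ config:app:set files_antivirus av_socket --value=$sock
EOF
}

print_switch_steps
```

The same two settings can presumably also be changed in the Nextcloud admin settings for the antivirus app by selecting the daemon (socket) mode instead of the executable mode.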