Freezes entire server computer. Caused by "/var/www/nextcloud/cron.php"

Details
Nextcloud version: 21.0.0
Operating system and version: Ubuntu 20.04.2
Apache or nginx version: Apache 2.4.41
PHP version: 7.4.3

No system swap was configured. UPDATE: I added 20 GB of swap. The server now uses about 1 GB of it and stutters for a while, then freezes and the screen goes black.

Is this the first time you’ve seen this error?: Y

Steps to replicate it:

  1. Use cron
  2. Update from 20.0.8 to 21.0.0

I regained temporary access to the server by renaming “/var/www/nextcloud/cron.php” to clams.php, so that the cron job can no longer run it.

Issue
The computer hosting the Nextcloud server freezes for 20-30 minutes, then unfreezes for a few seconds.
I pulled up htop, and "php -f /var/www/nextcloud/cron.php" is the command that makes memory usage spike to the maximum (8 GB) and the CPU spike as well.
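
Since the freeze coincides with cron.php exhausting all 8 GB of memory, one way to keep the machine reachable while debugging is to run the command under a hard memory cap. The following is only a sketch, assuming a Linux host with Python 3; the helper name `run_with_memory_cap` is made up for illustration and is not part of Nextcloud.

```python
import resource
import subprocess


def run_with_memory_cap(cmd, max_bytes):
    """Run cmd with its address space capped at max_bytes; return its exit code.

    Hypothetical helper: a process that tries to allocate beyond the cap gets
    an allocation failure instead of dragging the whole machine into swap.
    """
    def limit():
        # Executed in the child just before exec, so only the child is capped.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return subprocess.run(cmd, preexec_fn=limit).returncode
```

For example, `run_with_memory_cap(["php", "-f", "/var/www/nextcloud/cron.php"], 2 * 1024**3)` would cap the job at 2 GiB. The cap value is an arbitrary choice here, and a job that hits it will fail rather than finish.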

Logs
As for nextcloud logs, all I see under Warning/Error are

Invalid data provided to provideInitialState by files
...
RedisException: read error on connection to localhost:6379
...
Doctrine\DBAL\Query\QueryException: More than 1000 expressions in a list are not allowed on Oracle.
...
Error: Undefined index: settings at /var/www/nextcloud/lib/private/AppFramework/Bootstrap/RegistrationContext.php#404
Error: Call to a member function getContainer() on null

The output of your config.php file in /path/to/nextcloud (make sure you remove any identifiable information!):

<?php
$CONFIG = array (
  'instanceid' => 'random-letters',
  'passwordsalt' => 'random-letters',
  'secret' => 'random-letters',
  'trusted_domains' => 
  array (
    0 => '192.168.254.14',
    1 => 'my-external-ip',
  ),
  'datadirectory' => '/var/www/nextcloud/data',
  'dbtype' => 'mysql',
  'version' => '21.0.0.18',
  'overwrite.cli.url' => 'http://192.168.254.14/nextcloud',
  'dbname' => 'nextclouddb',
  'dbhost' => 'localhost',
  'dbport' => '',
  'dbtableprefix' => 'oc_',
  'mysql.utf8mb4' => true,
  'dbuser' => 'nextcloud-user',
  'dbpassword' => 'a-password',
  'installed' => true,
  'memcache.local' => '\\OC\\Memcache\\APCu',
  'memcache.locking' => '\\OC\\Memcache\\Redis',
  'redis' => 
  array (
    'host' => 'localhost',
    'port' => 6379,
  ),
  'updater.release.channel' => 'beta',
  'app_install_overwrite' => 
  array (
    0 => 'occweb',
    1 => 'joplin',
    2 => 'whiteboard',
  ),
  'ldapIgnoreNamingRules' => false,
  'ldapProviderFactory' => 'OCA\\User_LDAP\\LDAPProviderFactory',
  'maintenance' => false,
  'theme' => '',
  'loglevel' => 0,
  'updater.secret' => 'idk',
);

The output of your Apache/nginx/system log in /var/log/____:

[Tue Mar 30 00:00:33.151849 2021] [ssl:warn] [pid 911] AH01909: 127.0.1.1:443:0 server certificate does NOT include an ID which matches the server name
[Tue Mar 30 00:00:33.152094 2021] [mpm_prefork:notice] [pid 911] AH00163: Apache/2.4.41 (Ubuntu) OpenSSL/1.1.1f configured -- resuming normal operations
[Tue Mar 30 00:00:33.152101 2021] [core:notice] [pid 911] AH00094: Command line: '/usr/sbin/apache2'
[Tue Mar 30 02:36:45.305034 2021] [php7:error] [pid 5398] [client 18.231.155.124:52686] script '/var/www/html/wp-login.php' not found or unable to stat
[Tue Mar 30 04:56:38.838443 2021] [ssl:warn] [pid 770] AH01909: 127.0.1.1:443:0 server certificate does NOT include an ID which matches the server name
[Tue Mar 30 04:56:43.637368 2021] [ssl:warn] [pid 922] AH01909: 127.0.1.1:443:0 server certificate does NOT include an ID which matches the server name
[Tue Mar 30 04:56:43.640277 2021] [mpm_prefork:notice] [pid 922] AH00163: Apache/2.4.41 (Ubuntu) OpenSSL/1.1.1f configured -- resuming normal operations
[Tue Mar 30 04:56:43.640300 2021] [core:notice] [pid 922] AH00094: Command line: '/usr/sbin/apache2'
[Tue Mar 30 06:51:28.143581 2021] [authz_core:error] [pid 2698] [client 46.254.20.36:64880] AH01630: client denied by server configuration: /var/www/html/.htpasswd
[Tue Mar 30 16:40:42.714568 2021] [php7:error] [pid 2539] [client 157.245.125.95:59460] script '/var/www/html/wp-login.php' not found or unable to stat
[Tue Mar 30 18:59:48.103426 2021] [access_compat:error] [pid 987] [client 192.168.254.15:57856] AH01797: client denied by server configuration: /var/www/nextcloud/data/.ocdata
[Tue Mar 30 19:00:00.486168 2021] [access_compat:error] [pid 986] [client 192.168.254.15:57853] AH01797: client denied by server configuration: /var/www/nextcloud/data/.ocdata

If you need any more information, I’ll try to reply as quickly as possible.

I found this error in combination with the Mail app, and there is a GitHub issue for it: "NC21, postgres, Doctrine\DBAL\Query\QueryException: More than 1000 expressions in a list are not allowed on Oracle." (Issue #4584, nextcloud/mail)

If you use Redis, less work is handled by the database and the system gets faster. That is general advice, though; I am not sure it changes anything for this particular problem.

First, I’m sorry: this is not the helpful response you were presumably hoping for when you posted here :slight_smile:
However, I think it’s the next best thing: someone who is also experiencing your issue, which means you’re not alone :wink:

Almost to the letter I experience this issue every time I try to upgrade to NC 21.

I have tried using the web updater which only works one out of every 20 times I try it for some reason known only to itself! In this instance however it updated first try which was exciting. On first run NC was absolutely fine although I could not detect any sort of speed up on the web interface which was one of the main reasons I took the plunge to update.
Anyway, around 30 minutes after the upgrade I realised that the interface had frozen and, subsequently, also realised that the entire server was inaccessible.
When I eventually got access via ssh I saw, via htop, that it was cron that was using 85% CPU time and specifically the www-data cron job.
I rebooted and tried again with exactly the same results. I suppose the Nextcloud cron job was also running a file scan at the time, as I have it set up to scan regularly so that new files stay accessible.
In the end I gave up on the upgrade and restored a rescue backup which I make before every upgrade of anything and was back to NC20.
Then I thought I would try a manual upgrade in case there were corrupt files in the old folder. This went well and the server ran quite well (still no sight of the web U.I. speedup) for a couple of hours. Then I invoked a manual occ files:scan and the whole thing froze again.
So I have concluded that, on my setup at least, any run of occ files:scan is what causes the freeze.
I’m not sure whether other occ commands have the same effect, as I removed NC21 again and am back on NC20, which works.
At this point I am stumped. I’m not sure what is going on but presume it’s something wrong in my configuration?

The only clue I have is that NC20 is currently complaining that the PHP imagick module is not installed when it actually is, which might indicate some basic PHP problem, but I am not knowledgeable enough about PHP to investigate this on my own. As long as NC20 is working (which it seems to be), I am best off leaving PHP alone. I remember what happened the last time I poked around with that unnecessarily!
So, if anything I can provide as a comparison to your own setup would help, feel free to ask and I will try to provide it. If not, I guess we will have to wait for more knowledgeable input.

My server is running Ubuntu 20.10 with apache2/php7.4. As far as I can remember I installed Redis as well and, up until recently, I had a green tick on my NC system check page :slight_smile:
James


Same here….

I even used flock to prevent cron.php from spawning multiple instances every 10 minutes, but that did not change anything. The cron.php process stays in uninterruptible sleep forever (state D in the process list below):

root 1 0.0 0.0 200 4 ? Ss Jun16 0:00 s6-svscan -t0 /var/run/s6/services
root 39 0.0 0.0 200 4 ? S Jun16 0:00 s6-supervise s6-fdholderd
root 368 0.0 0.0 200 4 ? S Jun16 0:00 s6-supervise nginx
root 369 0.0 0.0 200 4 ? S Jun16 0:00 s6-supervise php-fpm
root 370 0.0 0.0 200 4 ? S Jun16 0:00 s6-supervise cron
root 373 0.0 0.0 5176 3492 ? Ss Jun16 0:00 nginx: master process /usr/sbin/nginx -c /config/nginx/nginx.conf
root 374 0.0 0.6 256404 26492 ? Ss Jun16 0:09 php-fpm: master process (/etc/php8/php-fpm.conf)
root 375 0.0 0.0 2820 1256 ? Ss Jun16 0:00 bash ./run
root 389 0.0 0.0 2328 816 ? S Jun16 0:01 /usr/sbin/crond -f -S -l 0 -c /etc/crontabs
abc 395 0.0 0.0 5376 2784 ? D Jun16 0:00 nginx: worker process
abc 396 0.0 0.0 5376 2784 ? D Jun16 0:00 nginx: worker process
abc 397 0.0 0.0 5376 2784 ? D Jun16 0:00 nginx: worker process
abc 398 0.0 0.0 5376 2784 ? D Jun16 0:00 nginx: worker process
abc 399 0.0 0.1 256400 6516 ? S Jun16 0:00 php-fpm: pool www
abc 400 0.0 0.1 256400 6516 ? S Jun16 0:00 php-fpm: pool www
abc 1393 0.0 0.7 256380 27636 ? D 11:20 0:00 php8 -f /config/www/nextcloud/cron.php
root 2009 0.0 0.0 2984 1736 pts/1 Ss 18:12 0:00 bash
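
In `ps` output, a STAT value of `D` means uninterruptible sleep, which usually indicates a process waiting on I/O. In the listing above, both the nginx workers and the cron.php process are in that state. A small sketch, assuming the standard `ps aux` column order, that picks such processes out of a listing (the function name `uninterruptible` is made up for illustration):

```python
def uninterruptible(ps_output):
    """Return (pid, command) for every process whose STAT starts with 'D'."""
    stuck = []
    for line in ps_output.splitlines():
        # Assumed ps aux columns:
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header line and anything malformed
        if fields[7].startswith("D"):
            stuck.append((int(fields[1]), fields[10]))
    return stuck
```

Fed the listing above, this would report the four nginx workers and PID 1393 (cron.php). Processes that stay in this state for hours are more likely to point at the storage layer than at PHP itself.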

It seems that I am having the same problem, and I have no idea how to solve it.

Can someone give me a hint on how to trace the runtime of /var/www/nextcloud/cron.php?
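
One rough way to approach this, sketched here under the assumption of a Linux host with Python 3, is to run the command by hand with a timeout and compare wall-clock time against the CPU time the child actually consumed. The helper name `time_command` is hypothetical.

```python
import resource
import subprocess
import time


def time_command(cmd, timeout_s):
    """Run cmd, stop it after timeout_s seconds, and report wall vs CPU time."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    try:
        returncode = subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        returncode = None  # the child was killed when the timeout expired
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # CPU seconds (user + system) consumed by the child we just waited for.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return {"returncode": returncode, "wall_s": wall, "cpu_s": cpu}
```

For example, `time_command(["php", "-f", "/var/www/nextcloud/cron.php"], 600)` would give the job ten minutes. A large wall time with very little CPU time suggests the process is mostly waiting (for example on disk I/O) rather than computing. Note that a process stuck in uninterruptible sleep may not die until its pending I/O returns, so the timeout is not a guarantee.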

For me it was a hardware problem with an external HDD.
I migrated all the data to the internal HDD and never had this problem again…

Hello @Dominik-1980,

In the meantime I realized that for me it was also a hardware problem. I cloned the SD card to a new one with “balenaEtcher”, and the error was gone.

Thank you for the hint.

superma