Cron.php hangs causing 100% cpu load

Support intro

Sorry to hear you’re facing problems :slightly_frowning_face:

help.nextcloud.com is for home/non-enterprise users. If you’re running a business, paid support can be accessed via portal.nextcloud.com where we can ensure your business keeps running smoothly.

In order to help you as quickly as possible, before clicking Create Topic please provide as much of the below as you can. Feel free to use a pastebin service for logs, otherwise either indent short log examples with four spaces:

example

Or for longer, use three backticks above and below the code snippet:

longer
example
here

Some or all of the below information will be requested if it isn’t supplied; for fastest response please provide as much as you can :heart:

Nextcloud version (eg, 20.0.5): 21.0.2
Operating system and version (eg, Ubuntu 20.04): Raspbian GNU/Linux 10 (buster)
Apache or nginx version (eg, Apache 2.4.25): nginx/1.14.2
PHP version (eg, 7.4): v8.0.5

The issue you are facing:
cron.php script hangs causing cpu usage of 100%

Is this the first time you’ve seen this error? (Y/N): Y

Steps to replicate it:

  1. Just run cron.php every 5 mins in cron job

On first run and then every 12 hrs cron.php hangs and will run forever unless I manually kill it. It runs my cpu at 100%. All other times it runs fine.

I’ve checked my date.timezone in /etc/php/8.0/fpm/php.ini and /etc/php/8.0/cli/php.ini. And it’s set correctly. I’ve set the correct time zone for logging: timedatectl set-timezone

I’ve checked nextcloud log and it’s clean.

Attached strace to the process and this set of lines keeps repeating indefinitely until I kill it with Ctrl C

strace: Process 7485 attached
gettimeofday({tv_sec=1621744604, tv_usec=617061}, NULL) = 0
gettimeofday({tv_sec=1621744604, tv_usec=617257}, NULL) = 0
gettimeofday({tv_sec=1621744604, tv_usec=617765}, NULL) = 0
openat(AT_FDCWD, “/var/nc_data/nextcloud.log”, O_WRONLY|O_CREAT|O_APPEND|O_LARGEFILE, 0666) = 8
fstat64(8, {st_mode=S_IFREG|0640, st_size=516978614, …}) = 0
_llseek(8, 0, [0], SEEK_CUR) = 0
_llseek(8, 0, [0], SEEK_CUR) = 0
write(8, “{“reqId”:“u0xEDjzmdegIXMaIQmZz”,”…, 1768) = 1768
close(8)

cron.php runs under the php process. This started with php 8 so I surmise it may be an obscure bug with php 8. I noticed this issue because my fanshim LED turns red when my server is hot and it was red with no load so I investigated. If not for the red LED I would have been unaware of the issue. As such, I’m curious if others are experiencing this issue, but are unaware.

For others running php 8, I urge you to check sudo htop once a day and sort by CPU to check if it is hanging. If so, you should see it similar to the image posted above. If so, please report on this thread. Thank you.

Seeking assistance troubleshooting. Thank you!

Do you use APCU?

Yes, I’m using APCU, but I can run occ commands fine from the command line so I’m missing the connection to my issue. Also, FYI, when I setup my nextcloud server I issued this line: sed -i '$aapc.enable_cli=1' /etc/php/8.0/mods-available/apcu.ini so all good.

I don’t know the effects of not running cron.php, but for now, I have no choice but to turn off cron.php as I can’t have it hang twice a day and burn my CPU at 100% forever unless I manually kill it. It’s impractical for me to be around to kill cron.php twice a day.

I"m totally baffled!

Or just don’t run cron automatically just from time to time. You need to check the logs a bit. There are already topics around php8 incompabilities:

For a bug report, more specific errors from logfiles would be helpful. What is strange that your strace shows an error with the logfile, is this always the same or just by chance?

Yes, when I connect the running process to strace, the output is always the same set of lines posted above. I did also notice the error with nextcloud.log file. You asked for additional log files. Specifically which ones would be helpful? And I will happily post.

I have created a “hack” workaround for this. It shouldn’t be necessary, but it works for now.

sudo crontab -e

check every 5 mins if cron.php is running for > 5 mins and if so kill it
*/5 * * * * killall -q -u www-data --older-than 5m php

I decided to comment out the above killall workaround and alternatively try another. I installed php 7.4 and set it as default so cron.php will run under php 7.4

Nextlcoud will continue to use php 8 for performance, but cron.php will use php 7.4.

I’m testing whether cron.php will hang if run under php 7.4 and will follow up with the results in a day or so as the hang happens about once every 12 hrs.

Update: running cron.php under php 7.4 did not resolve the issue. Again this morning after approximately 144 successful runs (12 hrs X 60), cron.php hangs again. I’ve now confirmed it happens under php 7.4 and php 8. This is a serious issue as it will run forever and probably eventually toast my CPU. It requires intervention to manually kill it.

strace shows this output:

strace: Process 11073 attached, infinitely repeats this set of lines:

strace: Process 11073 attached
gettimeofday({tv_sec=1622041051, tv_usec=683497}, NULL) = 0
openat(AT_FDCWD, “/var/nc_data/nextcloud.log”, O_WRONLY|O_CREAT|O_APPEND|O_LARGEFILE, 0666) = 7
fstat64(7, {st_mode=S_IFREG|0644, st_size=766162626, …}) = 0
_llseek(7, 0, [0], SEEK_CUR) = 0
_llseek(7, 0, [0], SEEK_CUR) = 0
write(7, “{“reqId”:“some long string”,”…, 1772) = 1772
close(7) = 0
gettimeofday({tv_sec=1622041051, tv_usec=685381}, NULL) = 0
gettimeofday({tv_sec=1622041051, tv_usec=685592}, NULL) = 0
gettimeofday({tv_sec=1622041051, tv_usec=685941}, NULL) = 0
openat(AT_FDCWD, “/var/nc_data/nextcloud.log”, O_WRONLY|O_CREAT|O_APPEND|O_LARGEFILE, 0666) = 7
fstat64(7, {st_mode=S_IFREG|0644, st_size=766164398, …}) = 0
_llseek(7, 0, [0], SEEK_CUR) = 0
_llseek(7, 0, [0], SEEK_CUR) = 0
write(7, “{“reqId”:“some long string”,”…, 1772) = 1772
close(7)

I’m totally baffled as to why this is happening.

Your nextcloud.log is not empty I suppose?

Or it is full and tries to rotate it? You can also start the cron.php from command line, perhaps with -vvv you can make it more verbose and it shows you where it fails…

The log is not empty, but is set to rotate:

-rw-r----- 1 www-data www-data 70273 May 26 10:35 nextcloud.log
-rw-r–r-- 1 www-data www-data 1531475418 May 26 08:15 nextcloud.log.1

Is there a way to capture the output from the cron job which is running cron.php every 5 minutes?

There is a table in the database with all the jobs for cron. If you have ‘exotic’ things from apps that few people are using, I’d disable these apps temporarily.

My oc_jobs table has 60 rows. I’m not using any “exotic” apps that I know of. Is 60 rows about what is expected?

My joblist is at 25 entries (most of them an upload folder cleanup for one specific user…). But you can see which job was run for the last time, so perhaps you can find at which job it usually stops since the jobs behind are not executed any more.

I see the last run col, but it’s a number. How can they be converted to a date?

Can I safely delete all rows in the oc_jobs table? Will the apps recreate the necessary rows?

Unless there is an obvious problem, I wouldn’t do that. If it is related to an app, I would rather disable it and enable it later. Not sure what happens with other crojobs if you delete them, if you can restore them or if you need to retrieve from a backup.

Well, looks like we are getting somewhere. I cleared out the log file, set logging level to 1 and manually ran the cron.php. As expected it hung, but this time I caught something in the trap. It ran for only a few mins at most, before I did a Ctrl C to kill it.

I checked the nextcloud log and…

The bookmarks app is suspect as the cause of the hang. This error occurred over 531,000 times in the log! Wow!

I then manually ran cron.php again and it hangs. I ran it a third time and no hang. This is still the mystery. Why would it not hang all the time? Why hang, then fine for approx 12 hrs?

When it runs fine, no log errors of course. But the code in cron.php looks like ti should be writing a “core” with “lastcron” to the logfile. I checked the logs and don’t see it.

I will disable bookmarks app for a day and see what happens with cron.php

Bug report has already been filed with bookmarks

This is the fix per bookmarks developer:

As a workaround, this fix works until the issue is fixed.

Temporarily manually remove these 3 jobs from the oc_jobs table:

OCA\Bookmarks\Lib\BackgroundJobs\PreviewsJob
OCA\Bookmarks\BackgroundJobs\PreviewsJob
OCA\Bookmarks\BackgroundJobs\CrawlJob

Whenever you re-enable the app, the jobs will be parsed again from info.xml and added to the database again if they’re not there, yet

Correction, this is not the fix, this is a workaround. I will release a real fix with the next patch soon. :slight_smile:

v4.2.2 is out now, which should fix this.