Background jobs never end, high cpu mysql, after update 17.0.3 -> 18.0.1

Nextcloud version (eg, 12.0.2): 18.0.1
Operating system and version (eg, Ubuntu 17.04): ubuntu 18.04
Apache or nginx version (eg, Apache 2.4.25): Apache 2.4.29
PHP version (eg, 7.1): 7.3.15

The issue you are facing:
After updating from 17.0.3 to 18.0.1, background jobs don’t finish. After a while, multiple high-CPU mysql processes are running, because either cron or AJAX (depending on what I’ve set) keeps starting new processes.

I’ve spent quite some time troubleshooting this issue, but I am at a loss by now. I hope someone can point me in the right direction!

  • Nextcloud log contains thousands of: "class":"OC\BackgroundJob\JobList","type":"->"},{"file":"/var/www/nextcloud/lib/private/BackgroundJob/JobList.php","line":213,"function":"getNext","class":"OC\BackgroundJob\JobList","type":"->"},{"file":"/var/www/nextcloud/lib/private/BackgroundJob/JobList.php","line":213,"function":"getNext" (and so on, repeating)
  • Other times the same, but with line 227 referenced instead
  • mysql error log is full of: InnoDB: page_cleaner: 1000ms intended loop … settings not optimal
  • I’ve used mytop to see what happens in mysql; lots of: Query Creating SELECT * FROM oc_jobs WHERE (reserved_at <= 1583027802) AND (last_checked <= 1583071002)
  • I’ve disabled and re-enabled most apps (since line 227 in JobList.php refers to disabled apps), to no avail
  • I’ve switched to PHP 7.4, which did not solve this but caused new issues, so I’ve switched back to PHP 7.3
  • I’ve switched from APCu to Redis, which did improve the performance of, for instance, occ files:scan --all, but did nothing for the described issue of cron jobs never ending
  • I’ve run occ files:scan multiple times, no errors, but also no change in main issue
  • I’ve cleaned oc_files_locks multiple times, seems fine, did not solve main issue
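The two numbers in the oc_jobs query from mytop are Unix timestamps, and decoding them makes the query easier to read. A minimal sketch follows; the values are copied from the query above, while the helper name `as_utc` is just for illustration. Reading the gap between them as a reservation timeout is an assumption, not something confirmed in this thread.

```python
from datetime import datetime, timezone

# Values copied from the SELECT seen in mytop.
reserved_cutoff = 1583027802   # reserved_at <= this
checked_cutoff = 1583071002    # last_checked <= this


def as_utc(ts: int) -> str:
    """Format a Unix timestamp as a UTC date and time."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


# Distance between the two cutoffs, in hours.
gap_hours = (checked_cutoff - reserved_cutoff) / 3600

print("last_checked cutoff:", as_utc(checked_cutoff), "UTC")
print("reserved_at cutoff: ", as_utc(reserved_cutoff), "UTC")
print("gap in hours:", gap_hours)
```

The gap works out to exactly 12 hours, so the query looks like a job picker asking for jobs that are due now and have not been reserved within the last 12 hours. Seeing it over and over suggests the server keeps fetching the next job from a very long list.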

Is this the first time you’ve seen this error? (Y/N): Y

Steps to replicate it:

  1. start server
  2. wait or, when AJAX is selected for background jobs, load a page or more
  3. more and more background jobs will start that run forever, in the end the system grinds to a halt

The output of your Nextcloud log in Admin > Logging:

See description above

The output of your Apache/nginx/system log in /var/log/____:
Found nothing strange here (apache, php fpm, etc, happy to post more logs if someone is interested)

What is your Cronjob command line?

I’ve this in my www-data crontab (sudo -u www-data crontab -e):
*/15 * * * * php -f /var/www/nextcloud/cron.php

By the way, this has worked for years (from NC 10 on).

Try replace that by

*/15 * * * * run-one php -f /var/www/nextcloud/cron.php

(add ‘run-one’, see https://manpages.ubuntu.com/manpages/trusty/man1/run-one.1.html)
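For anyone wondering what this buys you: the idea is that a second cron.php run refuses to start while the first one still holds a lock. Below is a rough Python sketch of that general pattern using a non-blocking file lock. It is not a description of how run-one itself is implemented, and the lock file name is a made-up placeholder. It relies on `fcntl`, so Linux/Unix only.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return an open file holding an exclusive lock, or None if it is already taken."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting for the lock.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


# Hypothetical lock file in a fresh temporary directory.
lock_path = os.path.join(tempfile.mkdtemp(), "nextcloud-cron-demo.lock")

first = try_lock(lock_path)    # first cron run starts and takes the lock
second = try_lock(lock_path)   # cron fires again while the first run is still active
got_first = first is not None
got_second = second is not None

first.close()                  # first run finishes; closing the file releases the lock
third = try_lock(lock_path)    # the next scheduled run can start again
got_third = third is not None
third.close()

print("first run got lock:", got_first)
print("overlapping run got lock:", got_second)
print("run after first finished got lock:", got_third)
```

Note that this only prevents runs from piling up. It does nothing about a single run that takes hours.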

Hi, thanks.

This does help against the endless spawning of new cronjobs.

But it does not solve the problem of the background jobs running forever.

Today I removed the cron entry entirely and timed a manual cron run. It stopped after 6 hours, without errors as far as I can see.

But I was unable to figure out what background job was doing what, and why it took so long.

Any thoughts on how to troubleshoot that?

I’ve gone through https://docs.nextcloud.com/server/18/admin_manual/installation/index.html again and checked apache, php, mysql, cron, whatever configuration I could find. But everything seems fine.

The server itself runs fine as well. It really is only that background jobs run forever, causing high CPU with ever more mysql processes. I can stop that issue by disabling cron, but in that case no background jobs will run.

Is there someone who knows how to troubleshoot this?

SOLVED! Okay, I finally figured out that the oc_jobs table might contain a (long) list of scheduled jobs. (From the logs it seemed that mysql was just constantly selecting new jobs.) And indeed: more than 250,000 jobs from the ‘maps’ app.

Just emptied the whole table, everything is fine again.

NB: this also deletes all regular cron jobs. Updating to a new (minor or major) version, or reinstalling, restores them.
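For others hitting this: before emptying anything, a count per job class shows which app is flooding the table. Here is a small sketch of that query, run against an in-memory SQLite table as a stand-in for the real MySQL database. The table is cut down to two columns and the class names are made-up placeholders; only the table name oc_jobs and the class column come from this thread.

```python
import sqlite3

# In-memory stand-in for the Nextcloud database (assumption: reduced to two columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE oc_jobs (id INTEGER PRIMARY KEY, class TEXT NOT NULL)")

# Hypothetical class names: one app flooding the queue, plus a few normal jobs.
rows = (
    [("OCA\\Maps\\ExampleJob",)] * 2500
    + [("OCA\\News\\ExampleJob",)] * 3
    + [("OC\\Example\\CoreJob",)] * 2
)
conn.executemany("INSERT INTO oc_jobs (class) VALUES (?)", rows)

# The actual diagnostic: number of queued jobs per class, largest first.
COUNT_PER_CLASS = """
    SELECT class, COUNT(*) AS jobs
    FROM oc_jobs
    GROUP BY class
    ORDER BY jobs DESC
"""
counts = conn.execute(COUNT_PER_CLASS).fetchall()

for job_class, jobs in counts:
    print(f"{jobs:>6}  {job_class}")
```

The SELECT itself is plain SQL and should work unchanged in the mysql client. A class with a count far above the rest is the one to look at.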


This also works on alpine images:
*/15 * * * * flock /tmp php -f /var/www/html/cron.php

Had this problem recently. News app was broken as the maps app was taking up all the cronnage.

Can’t remember where I got this from but you can delete maps specific cron jobs with this command:
DELETE FROM oc_jobs WHERE class LIKE '%Maps%';

The response was: Query OK, 35872 rows affected (0.696 sec)
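To illustrate why this is gentler than emptying the whole table, here is a sketch that counts the matching rows first, then deletes only those and checks what is left. It runs against an in-memory SQLite table as a stand-in for MySQL; the reduced schema and the class names are made-up placeholders. One caveat: SQLite's LIKE ignores case for ASCII letters, while MySQL's behaviour depends on the column collation.

```python
import sqlite3

# In-memory stand-in for the Nextcloud database (assumption: reduced to two columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE oc_jobs (id INTEGER PRIMARY KEY, class TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO oc_jobs (class) VALUES (?)",
    # Hypothetical class names for illustration only.
    [("OCA\\Maps\\ExampleJob",)] * 500 + [("OCA\\News\\ExampleJob",)] * 3,
)

pattern = "%Maps%"

# Dry run: see how many rows the pattern would hit before deleting anything.
to_delete = conn.execute(
    "SELECT COUNT(*) FROM oc_jobs WHERE class LIKE ?", (pattern,)
).fetchone()[0]

# The connection as a context manager commits on success and rolls back on an exception.
with conn:
    deleted = conn.execute(
        "DELETE FROM oc_jobs WHERE class LIKE ?", (pattern,)
    ).rowcount

# Jobs of other apps are still queued.
remaining = conn.execute(
    "SELECT class, COUNT(*) FROM oc_jobs GROUP BY class"
).fetchall()

print("would delete:", to_delete)
print("deleted:", deleted)
print("remaining:", remaining)
```

Running the COUNT first is a cheap safety check that the pattern matches only what you expect. A database backup before the real DELETE is still a good idea.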

It worked, and my news feeds are able to update as well. TBH the maps was a nice little app to play around with, I don’t mind losing this functionality.

I did the same, clearing the oc_jobs (+250k entries), am at the latest version and crons run within seconds now. You said I need to redo/update my instance, but I am at the latest (came from 16). But what am I missing now?