The cron.php tasks run for hours, constant CPU load, queries on oc_filecache

Perhaps your PHP configuration is not optimal and only cron.php is causing problems. Check your PHP configuration (memory, …).

You could also switch from cron.php to AJAX in the Nextcloud GUI for a few days
(Administration -> Basic settings). I additionally use a hosted Nextcloud with AJAX.

On the same page you can find information about your background jobs.

It doesn’t look like memory is the problem. Running PHP from the command line, memory is only constrained by the physical memory, which is 1 GB. I can’t think of a PHP configuration setting that would create a problem without causing a crash. I was trying to get away from AJAX, as it clearly has a detrimental effect on user performance, which wasn’t spectacular before. Use of cron is recommended in the documentation (and the admin UI says to run it every 5 minutes). But I’ve switched back to AJAX - not sure if there is a way to get it working on cron.

Ran on AJAX for some hours. It has now been running for several hours at near 100% CPU, with the same SQL query being run the whole time: SELECT path FROM oc_filecache WHERE (storage = 3) AND (size < 0) ORDER BY fileid DESC.

PHP instances look to be stacking up:

nextclo+  6980  0.5  6.4 338136 65284 ?        S    14:50   2:26 php-fpm: pool nextcloud
nextclo+  7240  0.3  5.1 313584 52120 ?        S    14:51   1:29 php-fpm: pool nextcloud
nextclo+  7283  0.3  5.1 313540 52260 ?        S    14:51   1:25 php-fpm: pool nextcloud
nextclo+  8510  0.3  5.6 389180 57512 ?        S    14:55   1:25 php-fpm: pool nextcloud
nextclo+  8935  0.3  5.8 389152 59640 ?        S    14:57   1:20 php-fpm: pool nextcloud
nextclo+ 15581  0.2  4.5 288864 45528 ?        S    21:44   0:02 php-fpm: pool nextcloud
nextclo+ 15603  0.5  4.5 288868 45880 ?        S    21:44   0:05 php-fpm: pool nextcloud
nextclo+ 19349  0.0  0.7 281908  7744 ?        S    21:59   0:00 php-fpm: pool nextcloud
nextclo+ 19366  0.3  5.7 388060 58356 ?        S    15:37   1:10 php-fpm: pool nextcloud
root     19598  0.0  0.0   6212   876 pts/2    S+   22:00   0:00 grep nextcloud
nextclo+ 28582  0.2  5.1 313540 52132 ?        S    18:21   0:36 php-fpm: pool nextcloud
nextclo+ 32306  0.3  5.4 313836 55148 ?        S    16:29   1:03 php-fpm: pool nextcloud

Restarting PHP-FPM clears out the long running processes, but doesn’t really solve the problem.

I can confirm the described issue. Did you check the GitHub issue tracker?

No, but I’ve looked now. The entry https://github.com/nextcloud/server/issues/18960 looks similar, but doesn’t seem to be the same issue. It doesn’t appear to have made any progress.

Did you check the database itself for errors/corruption, slow queries and such? oc_filecache is known to be large, but what is the size of the database file(s) in relation to the number of files actually handled by Nextcloud and found in the NC data dir?
mysqltuner or a similar tool can also help to check for insufficient cache sizes and such.
And of course check dmesg for file system errors.
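A hedged sketch of the size check suggested above, assuming the database schema is named nextcloud (adjust the schema name to your setup):

```sql
-- Table sizes (data + indexes) in the Nextcloud schema, largest first.
-- Compare the oc_filecache row count against the number of files actually
-- present in your NC data dir.
SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024, 1) AS size_mb,
       table_rows
FROM information_schema.tables
WHERE table_schema = 'nextcloud'
ORDER BY size_mb DESC;
```

Note that table_rows is an estimate for InnoDB tables; run a COUNT(*) on oc_filecache if you need the exact number.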

In my case, it helped to reduce the cron frequency from every 5 minutes to every 15 minutes and to add flock to avoid multiple runs when the previous one has not exited:
/var/spool/cron/crontabs/www-data:
*/15 * * * * flock /tmp php -f /var/www/html/cron.php
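As a sanity check, the serializing effect of flock can be demonstrated with a small shell sketch (the lock file path /tmp/nextcloud-cron.lock is illustrative; locking a dedicated file rather than /tmp itself avoids clashing with other users of that directory):

```shell
#!/bin/sh
# Illustrative lock file; any dedicated path works.
LOCK=/tmp/nextcloud-cron.lock

# Hold the lock for 2 seconds in the background, standing in for a
# long-running cron.php invocation.
flock "$LOCK" sleep 2 &

sleep 0.2   # give the background job time to acquire the lock

# A second, non-blocking attempt fails while the lock is held, so a
# piled-up cron run would wait (or, with -n, be skipped) instead of
# running concurrently.
if flock -n "$LOCK" true; then
  echo "lock free"
else
  echo "lock busy"    # this branch is taken while the first run holds the lock
fi
wait
```

Without -n, a second flock invocation blocks until the first releases the lock, which is exactly why a cron.php run that never finishes can hold up all subsequent runs.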

I’m facing a similar problem.
I started using PreviewGenerator and generated the 32, 64, 256 and 2048 px previews for my 440 GB of files (113 GB of which is my photo library).
With this, my oc_filecache table grew to a size of 420 MB.

I saw that the cron job runs this (long-running) query, as in your case:

SELECT `path` FROM nextcloud.oc_filecache WHERE `storage` = 1 AND `size` < 0 ORDER BY `fileid`

I discovered that this query needs a lot of time. Here is the slow-query log output from MySQL:

# User@Host: ncuser[ncuser] @ localhost []  Id: 21805
# Query_time: 253.154251  Lock_time: 0.000205 Rows_sent: 191412  Rows_examined: 1454739
SET timestamp=1603198875;
SELECT `path` FROM `oc_filecache` WHERE (`storage` = 1) AND (`size` < 0) ORDER BY `fileid` DESC;

Now I will also try to reduce the cron frequency from 5 mins to 15 mins and use the flock hint.

A similar problem here. My shared hosting provider complains about excessive CPU usage even though I only use Nextcloud for my family, with very low usage. I also use PreviewGenerator. Unfortunately, I’m not technical enough to give much more info. My hosting provider is threatening to kick me off their service… Cron runs every 15 minutes. Any suggestions on what to try (with good instructions!)?

I started the same journey and am not finished yet, but one issue was some symlinks which broke cron.php.

Anyhow, I am still having situations where the load on my server runs quite high because of multiple cron jobs running at the same time.
If I use something like flock, it has happened that cron jobs do not run for days because of the “one” that never finishes. Without flock, at a certain point mysqld starts freaking out a little :wink:

But I would say a “starting point”
https://help.nextcloud.com/t/some-cron-php-threads-never-finish-result-high-cpu-usage-because-of-mysqld/

Hey @h3rb3rt,
it looks like your situation (file scan / symlinks) has the same result but a different cause.

flock didn’t work for me either; I had the same issues.

I ran some tests on my database to find out why this query is so slow.
The short version:
For this query, the wrong index is used.

Longer version:
We talk about this query:

SELECT `path` FROM nextcloud.oc_filecache WHERE `storage` = 1 AND `size` < 0 ORDER BY `fileid` DESC

I asked the MySQL server to explain how it will execute this query by prepending EXPLAIN:

# id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
'1', 'SIMPLE', 'oc_filecache', NULL, 'ref', 'fs_storage_path_hash,fs_storage_mimetype,fs_storage_mimepart,fs_storage_size', 'fs_storage_path_hash', '8', 'const', '788173', '33.33', 'Using where; Using filesort'

The correct index would be fs_storage_size, but it is using fs_storage_path_hash.
I’m not sure why…
On a freshly installed Nextcloud 20 test instance with only the default data, the right index is used.

If I force the query to use the fs_storage_size index, it executes in 3.654 seconds.
That is a huge difference from 253 seconds.
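For reference, forcing the index can be done with MySQL’s FORCE INDEX hint; a sketch of the query as described, using the table and index names from the EXPLAIN output above:

```sql
SELECT `path`
FROM `oc_filecache` FORCE INDEX (`fs_storage_size`)
WHERE (`storage` = 1) AND (`size` < 0)
ORDER BY `fileid` DESC;
```

This only confirms the diagnosis interactively; Nextcloud itself generates the query without the hint, so the real fix has to happen in the optimizer’s index choice (or upstream in Nextcloud).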

I will try to write a GitHub issue in the next few days.

Ok, this is a little too much magic for me :wink:
Reading this, I have no clue how or whether I could verify it on my instance at all. But after installing some extended monitoring, I see that once the cron job starts (I still don’t know what exactly it does all that time), the IO to my mounted NFS storage, where the data is located, gets extremely high and stays there as long as the job runs.
I need to kill the cron job to fix the issue; this happens twice a day.

But running occ files:scan --all finishes within 30 minutes.

This has been bugging me a lot these last few days.

Multiple topics about long running cron jobs/high CPU usage and pretty much no response from anybody except for the multiple people who are having the problem. Is anyone working on this?

While I love Nextcloud, this lack of support/responsiveness impedes Nextcloud’s ability to become a mainstream solution for people other than enthusiasts and/or highly technical people.

Some response to this/these potential problems asap would be appreciated by many.

I finally found time to write an issue on GitHub:

Thank you! I just realized a couple of weeks ago that my cron.php via webcron was not working anymore. I set up cron and found out that it tends to run endlessly and multiple times. It eats all of my MariaDB server’s CPU on this query…

SELECT path FROM oc_filecache WHERE (storage = 879) AND (size < 0) ORDER BY fileid DESC

But sometimes the task does get through… Hopefully we’ll get an answer at some point!

To fix the problem temporarily, I added “timeout 30m flock /tmp/cron.lock” before the cron.php command in crontab: if a task lingers, it is allowed to run for 30 minutes while the other instances queue up, and then it is killed. That should relieve my MariaDB server :slight_smile:
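Assuming the default paths and a 5-minute schedule, the resulting crontab line might look like this (a sketch, not a verified config; adjust the schedule and paths to your installation):

```
# Queue on /tmp/cron.lock, but kill any run that exceeds 30 minutes.
*/5 * * * * timeout 30m flock /tmp/cron.lock php -f /var/www/html/cron.php
```

Because timeout wraps flock, a queued instance that is still waiting for the lock is also killed after 30 minutes, so stuck runs cannot pile up indefinitely.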

Quick follow-up after 2 weeks: I lowered the timeout to 10 minutes and the problem seems to be contained. It is still there, but it now has low to no impact on my database server.

Upgraded last night to NC 20.0.3. Will check if the problem still exists.

Hello, 20.0.3 here, still experiencing the same problem: an endless cron job with mariadb eating 150% CPU.
I’m using the preview-generator app, which seems to be causing this behavior.

Does anybody know if there is the same behaviour with nextcloud-preview.sh? https://github.com/GAS85/nextcloud_scripts/blob/master/nextcloud-preview.sh