Clarification Regarding Cron Jobs & Setup/Config

Something happened on the first attempt to post this, so sorry for the duplicate entry.

Before setting up my Docker-based NextCloud server I am trying to gain some more understanding of background jobs in NextCloud, their role and proper configuration. From what I have read, the background jobs are an important component of running a NextCloud instance efficiently, and the preferred method for executing them is Cron. I also found this nice blog on various ways to set up cron jobs. I really hope someone can help me answer some of the (quite possibly noob) questions below so that I can better understand this topic.

  1. From what I've gathered, all the background jobs are defined in cron.php. Does this file contain a set of "default" tasks that should be run regardless of the NextCloud instance setup? If so, what sort of tasks are they?

  2. The NC docs mention that some tasks should be run at different times. I might be missing something here, but if I run cron.php with cron, won't all the tasks in the script be executed at the same time?

  3. Using Docker, it seems the preferred way of running Cron is in an additional container. I do not fully understand the separation or interaction between the main NC container and the NC cron container. Is the cron container where I would define the scheduled cron job which would trigger the cron.php script located in the main NC container? Why does this container have to be built from an NC image if it is only running crontab?

  4. The suggested docker-compose file for the NC cron container contains this configuration: entrypoint: /cron.sh. I'm not sure what this line actually does. Will it execute a cron.sh file on container startup? Where is this file and what does it do?

  5. How do I find out when I need to extend cron.php with additional tasks, and how to do it? Are there code snippets documented somewhere for extending the background jobs script for certain NC addons etc.?

I don't have detailed answers for all the questions, just see a working solution from my setup:

  • additional cron container has same settings as app container (image, ENV, volumes, DB connection)
  • the only difference is the entrypoint, this is entrypoint: /cron.sh

this results in cron.php running with access to files and DB so it can perform background tasks of the instance

docker-compose example:

  dev-nextcloud-cron:
    image: nextcloud:23.0.1
    container_name: dev-nextcloud-cron
    restart: unless-stopped
    env_file:
      - ./nextcloud.env
      - ./db.env
    volumes:
      - ./app:/var/www/html
      - ./files:/var/www/html/data 
      - ./config:/var/www/html/config
    entrypoint: /cron.sh
    depends_on:
      - dev-nextcloud-app

Thanks for the input @wwe. The docker-compose you provided seems to be similar to the example found under the official repo. Creating this container is no issue, however, it is no use to me if I do not understand how the cron jobs work in NC and how to use/add/modify them. Hence my questions which I hope someone can elaborate on.

  • take a look at cron.sh
  • it starts busybox crond
  • which in turn executes the crontab of user www-data (/var/spool/cron/crontabs/www-data), which means cron.php runs every 5 minutes:
*/5 * * * * php -f /var/www/html/cron.php
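
To see this for yourself, both files can be read straight from the running cron container. A minimal sketch, assuming the container is named dev-nextcloud-cron as in the compose example above and that the image keeps the script at /cron.sh and the crontab at the path quoted in this thread (not verified against every image version). The function name show_cron_setup is made up for illustration:

```shell
#!/bin/sh
# show_cron_setup is a hypothetical helper name; pass the cron container's name.
show_cron_setup() {
    container=$1
    # Print the entrypoint script that starts busybox crond.
    docker exec "$container" cat /cron.sh
    # Print the crontab that crond runs on behalf of www-data.
    docker exec "$container" cat /var/spool/cron/crontabs/www-data
}
```

Calling show_cron_setup dev-nextcloud-cron should print the script followed by the single crontab entry shown above.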

Thanks for the clarification.

Still not sure why the Cron container needs to be a secondary NextCloud container though. As I understand it from what has been described, the container only needs access to the same directory as the main NextCloud container, as well as being able to execute a crontab. That is a job that could be done by any basic Linux container.

Regarding the directories: I have planned to mount additional volumes for data, config, themes and custom apps. Something like this:

volumes: 
  - server-data:/var/www/html 
  - custom_apps:/var/www/html/custom_apps 
  - config:/var/www/html/config 
  - user-data:/var/www/html/data 
  - themes:/var/www/html/themes

Do I need to make all the same locations available to the Cron container or is it enough to mount the server- and/or user-data directories?

NO!

The cron container needs access to exactly the same resources the app container has: files, database and config. Files within Nextcloud are not only objects in the file system; they belong together with metadata about them in the DB, e.g. file versions, trashbin, sharing links etc. In other words, it must be like a shadow of the original…

I'm sorry, I really don't get your problem - the other container doesn't cost you anything, no more storage, no more resources (compared to another container which costs additional storage).

Ah, so that means the Cron container is not only a "trigger", it is actually executing the background tasks within that container. This has not been clear to me.

Oh, I'm sure it doesn't, that is not my concern. I am just trying to learn and increase my understanding of what is going on in the whole NextCloud / Docker setup, which container is doing what, and how the interaction between them works. Sorry for not being clear.

Why would it cost additional storage if you expose the same mountpoints to that container as the main NextCloud container is using? (again, this is just for my understanding).


the mountpoints cost you only as much storage as they contain data; this is exactly the same whether you share them with one or multiple containers… containers from different images cost you additional storage… in other words, running 10 containers from one image costs you 1x the image size (plus runtime data like logs/temp etc.). running 2 containers from 2 images costs you the sum of the two image sizes.

I am very much in a similar situation, making my first steps with Docker. Only difference: I had already set up the Nextcloud container and now try to add cron (cf this post). Carefully reading the conversation above (thank you @norsemangrey and @wwe), I am still not clear on some aspects.

I understand that the Cron container should have exactly the same resources as the app container. But then:

  1. What is the benefit of running it in a separate container; why not run cron in the 'main' (App) container?
  2. I suppose in your case the db.env contains the details of the db container (host, user, pw). But if Cron really needs access to the NC database, why does the Cron container in the example compose.yml only have the volume nextcloud:/var/www/html, and not the volume db:/var/lib/mysql (as specified for the Db container) or the (env) details necessary to access the database server?

Your setup @wwe has both the 'parent' directory /var/www/html and the 3 child directories as volumes. So I suppose that works. However, for my App container I didn't define the parent directory, only the child directories. My Nextcloud instance works fine (can stop/restart without problem). Therefore:

  1. Might the cron container also need access to other files & folders under /var/www/html, besides data, config and custom_apps? And if so, why?
    • Specifically: given that cron.sh loads www-data's crontab, which in turn initiates cron.php every 5 minutes, shouldn't www-data's crontab be persistent to allow apps to add other/new cron jobs? (From your set-up and the example compose.yml this seems not to be the case, but I find it a bit surprising.)
    • Can I make do without creating a volume for the parent directory?
  2. How does Docker's mounting process work in my situation, where I don't have a volume specified for the parent folder? Am I correct to assume that first the volumes are mounted in the virtual file system, and that any gaps that exist vis-à-vis the image (missing folders/files) are filled by creating them in the virtual file system, using the image as source?

I made several attempts but it didn't seem to work. Many thanks to anyone & everyone helping us in our attempt at better understanding what is going on exactly under the hood :slight_smile:

(@norsemangrey, what did you end up doing, and does it work?)

the philosophy of Docker is that each container does exactly one thing.

the cron container does not access database files stored on the disk - it rather connects to the DB server through the network (exactly the same way the application does)

if you don't define persistent files, changes happen within the container instance (review the difference to the image) - this can result in different files within a specific container instance (e.g. you install a new application which only exists within the app container, but not within cron). I don't expect issues, give it a try. otherwise it makes no difference from the admin point of view if you persist the folders and share them…

only in case you manipulate the crontab, otherwise I expect it to be managed by the maintainers of the image. yes you can persist a single file as well


Many thanks for the replies.

How might it connect to the db server? In the example file it doesn't have the same information as the application; the example file doesn't give the cron container the credentials to access the db.

I gave it a try and cron doesn't seem to work - at least I still have the warning in the Nextcloud admin panel.

This also worries me:

$ ls /var/www/html
config	custom_apps  data

That folder in the App container has many more folders and files. Despite the NC image being loaded, why do I here only have my mounted volumes?

$ systemctl status cron
sh: 6: systemctl: not found
$ crontab -e
sh: 7: crontab: not found
$ service crond status
crond: unrecognized service
$ grep "cron.php" /var/log/cron
grep: /var/log/cron: No such file or directory

None of the methods I could find on the internet to check whether cron is active actually worked in the cron container (probably because the image is stripped down quite a bit).
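
Since the image has no systemctl, service or crontab binaries, one option is to check from the host instead. A rough sketch, assuming two containers whose names you pass in, and assuming Nextcloud records the time of the last cron.php run under the app config key core / lastcron (my understanding, worth verifying on your version). check_cron is a made-up helper name:

```shell
#!/bin/sh
# check_cron is a hypothetical helper; arguments are the cron and app container names.
check_cron() {
    cron_container=$1
    app_container=$2
    # List the processes in the cron container; a crond entry should show up.
    docker top "$cron_container"
    # Ask Nextcloud for the Unix timestamp of the last cron.php run
    # (assumes the value lives under core / lastcron).
    docker exec -u www-data "$app_container" php /var/www/html/occ config:app:get core lastcron
}
```

If the timestamp printed by check_cron keeps advancing every few minutes, cron.php is being triggered, whatever the container's own tooling says.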

If you didn't provide DB autoconfiguration variables, most likely you performed the DB config manually and need to share the config.php file between the app and cron containers (you need to share it in any case)
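
One way to confirm that both containers really read the same config.php is to ask occ in each of them for the configured database host. A small sketch under the assumption that occ sits at /var/www/html/occ in both containers; compare_db_host and the container names are placeholders, not something from the official docs:

```shell
#!/bin/sh
# compare_db_host is a hypothetical helper; pass the app and cron container names.
compare_db_host() {
    app_container=$1
    cron_container=$2
    # Read the dbhost value from config.php as seen by each container.
    app_host=$(docker exec -u www-data "$app_container" php /var/www/html/occ config:system:get dbhost)
    cron_host=$(docker exec -u www-data "$cron_container" php /var/www/html/occ config:system:get dbhost)
    if [ "$app_host" = "$cron_host" ]; then
        echo "both containers use dbhost: $app_host"
    else
        echo "mismatch: app=$app_host cron=$cron_host"
        return 1
    fi
}
```

A mismatch (or an empty value on the cron side) would suggest the cron container is not seeing the shared config directory.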

The mechanics of the docker cron I described earlier:

I have not yet implemented anything on my live server (too much to do :sweat_smile:), but I have a setup on a test environment where I am currently running Cron from the host, which was very easy to set up.

*/5 * * * * docker exec -t -u www-data nextcloud php -f /var/www/html/cron.php
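
If the host crontab should not log errors while the container is stopped, the same call can be wrapped in a small script. This is only a sketch: run_nextcloud_cron is a made-up name, the default container name nextcloud is taken from the line above, and -t is left out on the assumption that a non-interactive cron job does not need a pseudo-TTY:

```shell
#!/bin/sh
# run_nextcloud_cron is a hypothetical wrapper meant to be called from the host's crontab.
run_nextcloud_cron() {
    container=${1:-nextcloud}
    # Only trigger cron.php when the container is actually running.
    running=$(docker inspect -f '{{.State.Running}}' "$container" 2>/dev/null)
    if [ "$running" != "true" ]; then
        echo "container $container is not running, skipping cron.php" >&2
        return 1
    fi
    docker exec -u www-data "$container" php -f /var/www/html/cron.php
}
```

Saved as a script that ends with a call to run_nextcloud_cron, it could replace the docker exec part of the crontab entry.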

I have read some posts about people having issues with the Background job status in NextCloud not being updated when running it this way, but it seems to be working for me at least.

I am going to try setting up an additional container to run the Cron jobs next, but honestly I'm not sure if it is worth the trouble.


@norsemangrey I actually managed to get it set up, wasn't too complicated after all.

While I had volumes for /var/www/html/config, /var/www/html/custom_apps and /var/www/html/data, the parent folder was missing. Kind folks over at GitHub said that I should add that folder as a volume, and share it with the app container.

Apparently there are some other files (and/or the whole NC installation - didn't really understand that) which are used by the cron container. As I hadn't added the parent folder as a volume, for some reason the parent folder was empty (which I still don't quite understand - I would expect it to be 'filled' from the image when launching the container). Because the parent folder was empty, cron couldn't run, basically.
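
A possible explanation, offered as an assumption rather than a confirmed fact: as far as I know the official image ships the Nextcloud source under /usr/src/nextcloud and its default entrypoint copies it into /var/www/html on first start, so a container started with entrypoint: /cron.sh would skip that copy. The sketch below just lists the directories side by side so you can check this on your own setup; compare_html_dirs and the container names are placeholders:

```shell
#!/bin/sh
# compare_html_dirs is a hypothetical helper; pass the app and cron container names.
compare_html_dirs() {
    app_container=$1
    cron_container=$2
    echo "== $app_container =="
    docker exec "$app_container" ls /var/www/html
    echo "== $cron_container =="
    docker exec "$cron_container" ls /var/www/html
    # Assumed location of the bundled source inside the image.
    docker exec "$cron_container" ls /usr/src/nextcloud
}
```

If the last listing shows a full Nextcloud tree while the cron container's /var/www/html only shows the mounted folders, the files exist in the image but were never copied into place for that container.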

After adding the parent folder /var/www/html both to the app and the cron container it worked. This is my cron container as in docker-compose.yml:

  cron:
    image: nextcloud
    restart: always
    volumes:
      - nextcloud:/var/www/html
      - ./config:/var/www/html/config
      - ./apps:/var/www/html/custom_apps
      - /nextcloud-data:/var/www/html/data
    entrypoint: /cron.sh
    depends_on:
      - db
      - redis

Happy to (see if I can) help if you want to give it another shot :slight_smile:


As I've just mentioned in another thread comment, I'm working on setting this up for my own NextCloud installation, so I've been reading quite a bit on the topic. I don't work for the dev team, nor have I had time yet to thoroughly review the source code, but I've made a few educated guesses along the way, so @norsemangrey, I'll share my thoughts on your original questions.

cron.php has nothing really defined; it rather reads the list of background jobs and executes those.

I guess cron.php acts as the internal task manager for NextCloud itself, so it takes care of all the details that are defined internally and are agnostic of where / how the application is hosted. This is why you can have different ways to trigger it (AJAX, WebCron, Cron). Basically you need to invoke this task manager somehow and then it will take care of the rest on its own. It's a bit confusing with cron, as you need to schedule something (the cron.php invocation) that then handles its own internal scheduling mechanism.
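
To look at that internal job list yourself, occ can print it. A minimal sketch, assuming a Nextcloud release that ships the background-job:list command (recent versions do as far as I know; the 23.x image used earlier in this thread may not) and a container named nextcloud. list_background_jobs is a made-up helper name:

```shell
#!/bin/sh
# list_background_jobs is a hypothetical helper; pass the app container's name.
list_background_jobs() {
    container=${1:-nextcloud}
    # Show the registered jobs that cron.php works through on each invocation.
    docker exec -u www-data "$container" php /var/www/html/occ background-job:list
}
```

Each row should correspond to a job class registered by the core or by an installed app, which is why nothing needs to be added to cron.php itself.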

I'm unclear what you're trying to understand here… @wwe has already pointed out some interesting things to take into account, but basically my simplified version is: you need to schedule internal executions of NextCloud tasks, so running it from an image without the application, or from a container that lacks the same setup, will not give you the intended behaviour. This is why you see many implementations of this using the host's cron and a docker exec <the-name-of-my-nextcloud-container> (which, as it runs in the container where the app is running, doesn't need additional setup).

As @wwe has already replied, it sets up the triggering of cron.php.

I think you never have to do such a thing, unless you're developing your own NextCloud app which requires some background job to be executed, in which case the NextCloud API will let you register those.

I guess you're trying to set up something like Preview Generator (that's my case), which tells you to cron a special command. Thing is, that has little to do with the background jobs. Maybe that command could have been implemented as one of the background jobs of NextCloud, but for some reason it is not (it's risky considering that background jobs can run via AJAX, it's not feasible due to restrictions on the NextCloud account privileges, the developer didn't want to… whatever), therefore you need something external to run it on a regular basis.

For my scenario, as I'm running this as a stack with Portainer (meaning using docker compose), I'm going to try setting up an additional service for cron and scheduling on it the different things I'd like to run, including NextCloud's own background jobs (aka cron.php). Will let you know how it works (once it finishes processing the initial preview generation for my 60GB of files :sweat:)
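
For illustration, this is roughly what the two schedules could look like side by side when driven from the host's cron. It is a sketch, not a recommendation: the container name nextcloud, the 10-minute interval and the helper name print_host_crontab are my own assumptions, and preview:pre-generate is the command the Preview Generator app asks you to schedule:

```shell
#!/bin/sh
# print_host_crontab is a hypothetical helper that prints crontab lines for the host.
print_host_crontab() {
    container=${1:-nextcloud}
    # Nextcloud's own background jobs, every 5 minutes as in this thread.
    printf '%s\n' "*/5 * * * * docker exec -u www-data $container php -f /var/www/html/cron.php"
    # Preview Generator's separate command; the interval here is only an example.
    printf '%s\n' "*/10 * * * * docker exec -u www-data $container php /var/www/html/occ preview:pre-generate"
}
```

Review the printed lines and add them with crontab -e rather than piping them in blindly, since that would replace any existing entries.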
