AI task pickup (systemd service) not surviving reboots

Steps to reproduce

  1. Set up Nextcloud-AIO, including Assistant and the task-pickup services, according to the page “How to improve AI task pickup speed in AIO” (nextcloud/all-in-one, Discussion #5430 on GitHub)
  2. Check that the “Chat with AI” feature responds instantly (within a few seconds)
  3. Reboot the server (Nextcloud-AIO “Stop containers” & “Start containers” is not enough)

Expected behavior

Chat with AI is still instant.

Actual behavior

Chat with AI needs 5 minutes until the result is shown.

Other information

Host OS

Rockstor 5.1.0 built on openSUSE Leap 15.6

Output of sudo docker info

Client:
 Version:    28.5.1-ce
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  0.29.0
    Path:     /usr/lib/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  2.22.0
    Path:     /usr/lib/docker/cli-plugins/docker-compose

Server:
 Containers: 63
  Running: 61
  Paused: 0
  Stopped: 2
 Images: 91
 Server Version: 28.5.1-ce
 Storage Driver: overlay2
  Backing Filesystem: btrfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 CDI spec directories:
  /etc/cdi
  /var/run/cdi
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 oci runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 442cb34bda9a6a0fed82a2ca7cade05c5c749582
 runc version: v1.3.4-0-gd6d73eb8c602
 init version: 
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.4.0-150600.23.81-default
 Operating System: openSUSE Leap 15.6
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 62.17GiB
 Name: Elster
 ID: 09a22986-8132-426e-8e14-f6be898bd0e1
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  ::1/128
  127.0.0.0/8
 Live Restore Enabled: false
 Default Address Pools:
   Base: 172.16.0.0/12, Size: 20
   Base: 192.168.0.0/16, Size: 24

Docker run command or docker-compose file that you used

# ⚠⚠⚠ This file is intentionally named `docker-compose.yaml` ⚠⚠⚠
# PURPOSE: to exclude this container from my update script
# ISSUE: when this container is used with docker compose down / up, there are network issues preventing the backup from being successful
# SOLUTION: exclude this compose file from my backup script - the AIO container will do all updating by itself.

services:
  nextcloud:
    image: nextcloud/all-in-one:latest
    restart: unless-stopped
    container_name: nextcloud-aio-mastercontainer # This line is not allowed to be changed as otherwise AIO will not work correctly
    volumes:
      - nextcloud_aio_mastercontainer:/mnt/docker-aio-config # This line is not allowed to be changed as otherwise the built-in backup solution will not work
      - /var/run/docker.sock:/var/run/docker.sock:ro # May be changed on macOS, Windows or docker rootless. See the applicable documentation. If adjusting, don't forget to also set 'WATCHTOWER_DOCKER_SOCKET_PATH'!
    ports:
      - 8080:8080
    environment:
      - APACHE_PORT=11000 # Is needed when running behind a web server or reverse proxy (like Apache, Nginx and else). See https://github.com/nextcloud/all-in-one/blob/main/reverse-proxy.md
      - APACHE_IP_BINDING=0.0.0.0 # Should be set when running behind a web server or reverse proxy (like Apache, Nginx and else) that is running on the same host. See https://github.com/nextcloud/all-in-one/blob/main/reverse-proxy.md

      # - SKIP_DOMAIN_VALIDATION=true # use for setup behind reverse proxy
      - NEXTCLOUD_DATADIR=/srv/nextcloud/data # Allows to set the host directory for Nextcloud's datadir. See https://github.com/nextcloud/all-in-one#how-to-change-the-default-location-of-nextclouds-datadir

      ### Integrated BorgBackup
      # - AIO_DISABLE_BACKUP_SECTION=false # Setting this to true allows to hide the backup section in the AIO interface.
      - BORG_RETENTION_POLICY=--keep-within=7d --keep-weekly=4 --keep-monthly=6 # Allows to adjust borgs retention policy. See https://github.com/nextcloud/all-in-one#how-to-adjust-borgs-retention-policy

      ### Misc
      # - COLLABORA_SECCOMP_DISABLED=false # Setting this to true allows to disable Collabora's Seccomp feature. See https://github.com/nextcloud/all-in-one#how-to-disable-collaboras-seccomp-feature
      # - TALK_PORT=3478 # This allows to adjust the port that the talk container is using.

      ### Nextcloud limits
      # - NEXTCLOUD_MOUNT=/mnt/ # Allows the Nextcloud container to access the chosen directory on the host. See https://github.com/nextcloud/all-in-one#how-to-allow-the-nextcloud-container-to-access-directories-on-the-host
      # - NEXTCLOUD_UPLOAD_LIMIT=10G # Can be adjusted if you need more. See https://github.com/nextcloud/all-in-one#how-to-adjust-the-upload-limit-for-nextcloud
      # - NEXTCLOUD_MAX_TIME=3600 # Can be adjusted if you need more. See https://github.com/nextcloud/all-in-one#how-to-adjust-the-max-execution-time-for-nextcloud
      # - NEXTCLOUD_MEMORY_LIMIT=512M # Can be adjusted if you need more. See https://github.com/nextcloud/all-in-one#how-to-adjust-the-php-memory-limit-for-nextcloud
      
volumes:
  nextcloud_aio_mastercontainer:
    name: nextcloud_aio_mastercontainer # This line is not allowed to be changed as otherwise the built-in backup solution will not work    

Output of sudo docker logs nextcloud-aio-mastercontainer

[22-Dec-2025 14:19:43] NOTICE: Terminating ...
[22-Dec-2025 14:19:43] NOTICE: exiting, bye-bye!
[Mon Dec 22 14:19:44.961059 2025] [mpm_event:notice] [pid 132:tid 132] AH00491: caught SIGTERM, shutting down
Initial startup of Nextcloud All-in-One complete!
You should be able to open the Nextcloud AIO Interface now on port 8080 of this server!
E.g. https://internal.ip.of.this.server:8080
⚠ Important: do always use an ip-address if you access this port and not a domain as HSTS might block access to it later!

If your server has port 80 and 8443 open and you point a domain to your server, you can get a valid certificate automatically by opening the Nextcloud AIO Interface via:
https://your-domain-that-points-to-this-server.tld:8443
/usr/lib/python3.12/site-packages/supervisor/options.py:13: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  import pkg_resources
++ head -1 /mnt/docker-aio-config/data/daily_backup_time
+ BACKUP_TIME=01:35
+ export BACKUP_TIME
+ export DAILY_BACKUP=1
+ DAILY_BACKUP=1
++ sed -n 2p /mnt/docker-aio-config/data/daily_backup_time
+ '[' '' '!=' automaticUpdatesAreNotEnabled ']'
+ export AUTOMATIC_UPDATES=1
+ AUTOMATIC_UPDATES=1
++ sed -n 3p /mnt/docker-aio-config/data/daily_backup_time
+ '[' successNotificationsAreNotEnabled '!=' successNotificationsAreNotEnabled ']'
+ export SEND_SUCCESS_NOTIFICATIONS=0
+ SEND_SUCCESS_NOTIFICATIONS=0
+ set +x
[22-Dec-2025 14:22:05] NOTICE: fpm is running, pid 145
[22-Dec-2025 14:22:05] NOTICE: ready to handle connections
{"level":"info","ts":1766413326.0711904,"msg":"maxprocs: Leaving GOMAXPROCS=12: CPU quota undefined"}
{"level":"info","ts":1766413326.0717726,"msg":"GOMEMLIMIT is updated","package":"github.com/KimMachineGun/automemlimit/memlimit","GOMEMLIMIT":60077124403,"previous":9223372036854775807}
{"level":"info","ts":1766413326.0720148,"msg":"using config from file","file":"/Caddyfile"}
{"level":"info","ts":1766413326.1001832,"msg":"adapted config to JSON","adapter":"caddyfile"}
{"level":"info","ts":1766413326.127204,"msg":"serving initial configuration"}
[Mon Dec 22 14:22:06.324366 2025] [mpm_event:notice] [pid 139:tid 139] AH00489: Apache/2.4.66 (Unix) OpenSSL/3.5.4 configured -- resuming normal operations
[Mon Dec 22 14:22:06.324678 2025] [core:notice] [pid 139:tid 139] AH00094: Command line: 'httpd -D FOREGROUND'

Other valuable info

After reboot, the systemd services are still running:

$ systemctl list-units --type=service | grep nextcloud-ai-worker
  nextcloud-ai-worker@1.service                                                       loaded active running Nextcloud AI worker 1
  nextcloud-ai-worker@2.service                                                       loaded active running Nextcloud AI worker 2
  nextcloud-ai-worker@3.service                                                       loaded active running Nextcloud AI worker 3
  nextcloud-ai-worker@4.service                                                       loaded active running Nextcloud AI worker 4

$ systemctl status nextcloud-ai-worker@1.service
● nextcloud-ai-worker@1.service - Nextcloud AI worker 1
     Loaded: loaded (/etc/systemd/system/nextcloud-ai-worker@.service; enabled; preset: disabled)
     Active: active (running) since Mon 2025-12-22 15:21:40 CET; 13min ago
   Main PID: 1072 (taskprocessing.)
      Tasks: 13 (limit: 4915)
        CPU: 44ms
     CGroup: /system.slice/system-nextcloud\x2dai\x2dworker.slice/nextcloud-ai-worker@1.service
             ├─1072 /bin/bash /opt/nextcloud-ai-worker/taskprocessing.sh 1
             ├─1076 docker ps --format {{.Names}}
             └─1079 grep -q "^nextcloud-aio-nextcloud\$"

Dec 22 15:21:40 Elster systemd[1]: Started Nextcloud AI worker 1.
Dec 22 15:21:40 Elster taskprocessing.sh[1072]: Starting Nextcloud AI Worker 1

But Chat with AI takes minutes to respond.


Only after the systemd services are restarted is the AI response “instant” again:

$ for i in {1..4}; do sudo systemctl restart nextcloud-ai-worker@$i.service; done

$ systemctl list-units --type=service | grep nextcloud-ai-worker
  nextcloud-ai-worker@1.service                                                       loaded active running Nextcloud AI worker 1
  nextcloud-ai-worker@2.service                                                       loaded active running Nextcloud AI worker 2
  nextcloud-ai-worker@3.service                                                       loaded active running Nextcloud AI worker 3
  nextcloud-ai-worker@4.service                                                       loaded active running Nextcloud AI worker

$ systemctl status nextcloud-ai-worker@1.service
● nextcloud-ai-worker@1.service - Nextcloud AI worker 1
     Loaded: loaded (/etc/systemd/system/nextcloud-ai-worker@.service; enabled; preset: disabled)
     Active: active (running) since Mon 2025-12-22 15:37:41 CET; 19s ago
   Main PID: 26329 (taskprocessing.)
      Tasks: 11 (limit: 4915)
        CPU: 39ms
     CGroup: /system.slice/system-nextcloud\x2dai\x2dworker.slice/nextcloud-ai-worker@1.service
             ├─26329 /bin/bash /opt/nextcloud-ai-worker/taskprocessing.sh 1
             └─26345 docker exec --user www-data nextcloud-aio-nextcloud php occ background-job:worker -t 60 "OC\\TaskProcessing\\SynchronousBackgroundJob"

Dec 22 15:37:41 Elster systemd[1]: Started Nextcloud AI worker 1.
Dec 22 15:37:41 Elster taskprocessing.sh[26329]: Starting Nextcloud AI Worker 1
Dec 22 15:37:41 Elster taskprocessing.sh[26329]: Attempting to start Nextcloud AI Worker ...
Dec 22 15:37:42 Elster taskprocessing.sh[26345]: Background job worker will stop after 60 seconds

Reference

I originally posted this as an issue on the Nextcloud-AIO GitHub page and was advised to post it here.

I seem to have the same problem. I also don’t get the solution. I’m running AIO, so persistently changing systemd services does not seem like an option. But also: is this not just a bug then? I am talking to an assistant; that should not be a background task that runs every few minutes, it should just reply immediately.

The “After reboot” output suggests the script is waiting for the nextcloud-aio-nextcloud container to come up. The script has a loop to keep checking for it.

If this hang happens only after reboots, I suspect it is because the service tries to start before docker.service is ready. The way the script is written, it can hang indefinitely if it runs in that state.
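To make that failure mode concrete, here is a minimal sketch of a bounded wait. It is not the script from the AIO guide: the function name `wait_for_container`, the `DOCKER_BIN` variable, the 10-second timeout and the retry defaults are all assumptions for illustration. The idea is that a `docker ps` call that blocks gets killed by `timeout`, and the whole wait gives up after a fixed number of attempts instead of looping forever.

```shell
#!/bin/bash
# Sketch only: DOCKER_BIN, the 10 s timeout and the retry defaults are
# assumptions, not taken from the original taskprocessing.sh.
DOCKER_BIN="${DOCKER_BIN:-docker}"

# wait_for_container NAME [ATTEMPTS] [DELAY_SECONDS]
# Returns 0 once NAME appears in `docker ps`, 1 if it never does.
wait_for_container() {
  local name="$1" attempts="${2:-60}" delay="${3:-5}" i
  for ((i = 1; i <= attempts; i++)); do
    # `timeout` kills a `docker ps` call that blocks, so a single stuck
    # call cannot stall the loop indefinitely.
    # `grep -Fx` matches the container name as a whole literal line.
    if timeout 10 "$DOCKER_BIN" ps --format '{{.Names}}' 2>/dev/null \
        | grep -Fxq -- "$name"; then
      return 0
    fi
    sleep "$delay"
  done
  echo "Container $name not running after $attempts attempts" >&2
  return 1
}
```

Called as `wait_for_container nextcloud-aio-nextcloud || exit 1` near the top of a worker script, a failed wait would exit non-zero, and a unit with `Restart=always` would then start the worker again rather than leaving it stuck.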

In the service file, add this to the [Unit] section, then reboot and see whether the problem goes away:

[Unit]
# Ensures the service waits until Docker is fully functional
After=docker.service
Requires=docker.service

In addition, a delay may need to be introduced in [Service]:

[Service]
# Adds a delay to let the daemon stabilize if needed
ExecStartPre=/bin/sleep 5
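
If you would rather not edit the unit file itself, the same [Unit] directives can go into a systemd drop-in, which applies to every `nextcloud-ai-worker@` instance. A rough sketch follows; the function name `write_docker_dropin` and the file name `docker.conf` are my own choices, and the directory argument exists only so the function can be tried against a scratch directory first.

```shell
#!/bin/bash
# Sketch: write a drop-in that orders all nextcloud-ai-worker@ instances
# after Docker. Function name and docker.conf are hypothetical choices.
# write_docker_dropin [UNIT_DIR]   (defaults to /etc/systemd/system)
write_docker_dropin() {
  local unit_dir="${1:-/etc/systemd/system}"
  # A "<template>@.service.d" directory applies to every instance of the template.
  local dropin_dir="$unit_dir/nextcloud-ai-worker@.service.d"
  mkdir -p "$dropin_dir" || return 1
  cat > "$dropin_dir/docker.conf" <<'EOF'
[Unit]
After=docker.service
Requires=docker.service
EOF
}
```

Run as root, `write_docker_dropin && systemctl daemon-reload` would install it; the workers then need a restart or a reboot for the new ordering to matter.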

Thank you very much @jtr for finding the issue and explaining it 🙏 😊

I can confirm that waiting for Docker in the systemd unit file resolved the issue.

Also, @szaimen has already updated the description for Nextcloud-AIO.

Just to be explicit, here is the full systemd unit file:

[Unit]
Description=Nextcloud AI worker %i
After=network.target
After=docker.service
Requires=docker.service

[Service]
ExecStart=/opt/nextcloud-ai-worker/taskprocessing.sh %i
Restart=always
StartLimitInterval=60
StartLimitBurst=10

[Install]
WantedBy=multi-user.target
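
For anyone who wants to double-check that systemd picked up the ordering after `systemctl daemon-reload`, here is a small sketch. The helper name `lists_docker_service` is made up for this post; it only tests whether a space-separated unit list, such as the one printed by `systemctl show -p After --value`, contains `docker.service`.

```shell
#!/bin/bash
# Sketch: succeed if a space-separated list of unit names contains
# docker.service as a whole word. The helper name is hypothetical.
lists_docker_service() {
  # Pad with spaces so that only a complete unit name matches.
  case " $1 " in
    *" docker.service "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Example: `lists_docker_service "$(systemctl show -p After --value nextcloud-ai-worker@1.service)" && echo "ordered after docker"`.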
