Upgrade locking, nextcloud-init-sync.lock, "Another process is initializing..." - ephemeral vs persistent lock

I run Nextcloud in a Docker container and use watchtower to monitor for new images and auto-upgrade the container. Today at midnight it found 24.0.1.1 and attempted to upgrade from 24.0.0.12. The upgrade got stuck, so Nextcloud was down. Here’s a log snippet:

app_1            | Conf remoteip disabled.
app_1            | To activate the new configuration, you need to run:
app_1            |   service apache2 reload
app_1            | Configuring Redis as session handler
app_1            | Initializing nextcloud 24.0.1.1 ...
app_1            | Upgrading nextcloud from 24.0.0.12 ...
app_1            | Another process is initializing Nextcloud. Waiting 10 seconds...
app_1            | Another process is initializing Nextcloud. Waiting 20 seconds...
app_1            | Another process is initializing Nextcloud. Waiting 30 seconds...
app_1            | Another process is initializing Nextcloud. Waiting 40 seconds...
app_1            | Another process is initializing Nextcloud. Waiting 50 seconds...
app_1            | Another process is initializing Nextcloud. Waiting 60 seconds...

I found a workaround here: [Bug]: Upgrade 23.0.3 to 23.0.4 docker Server does not migrate · Issue #1742 · nextcloud/docker · GitHub

So, yay! Manually deleting nextcloud-init-sync.lock and restarting the container worked: it was able to resume and complete the upgrade.

My question is about the lock itself. nextcloud-init-sync.lock seems to be a “persistent lock”: the mere presence of the file is taken to mean that another process is still initializing, even if that process is long gone. I’m guessing an earlier upgrade attempt stopped halfway (maybe a download timed out? I don’t know) and left the file behind.

Is it possible to use an “ephemeral lock” instead, such as a call to PHP’s flock() function? Maybe that would be a more robust way to lock during an upgrade.
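
To make the distinction concrete, here is a minimal Python sketch of what I mean by an ephemeral lock. It is not the actual Nextcloud code; the lock path and the helper names `try_lock` and `demo` are made up for illustration. It uses `fcntl.flock`, which wraps the same system call as PHP’s flock() on Linux. The point is that the lock belongs to the open file handle, not to the file’s existence, so it goes away when the holder closes the file or its process dies.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive, non-blocking flock on path.

    Returns the open file object if the lock was acquired, or None if
    another open handle already holds it.
    """
    f = open(path, "a")  # creates the file if missing, never truncates
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        f.close()
        return None
    return f


def demo():
    # Hypothetical lock path, for illustration only.
    path = os.path.join(tempfile.mkdtemp(), "init-sync.lock")

    holder = try_lock(path)   # first initializer takes the lock
    blocked = try_lock(path)  # second attempt is refused while it is held
    holder.close()            # closing the handle (or process exit) releases it
    retry = try_lock(path)    # the file still exists, but the lock is free again

    result = (
        holder is not None,
        blocked is None,
        retry is not None,
        os.path.exists(path),
    )
    retry.close()
    return result


if __name__ == "__main__":
    print(demo())
```

With this approach a leftover lock file is harmless, because only a live handle holding the lock blocks other processes. One caveat I am not sure about: if the data volume sits on a network filesystem, flock semantics can differ, so that would need checking.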

I searched around the nextcloud/server code a bit and couldn’t find the code responsible for creating and checking this file.


Ah, no wonder I couldn’t find it in the server code. This is done in the Docker-related code:
https://github.com/nextcloud/docker/blob/master/docker-entrypoint.sh

Maybe this will prevent the same failure in the future: use ephemeral instead of manual/persistent lock during upgrade · Issue #1756 · nextcloud/docker · GitHub
