NC docker image in GitHub Actions not directly reachable

stefan-niedermann · October 28, 2021, 7:50pm

Hello there

I don't have much experience with Docker, but I recently came across a use case that the Nextcloud Docker image would solve nicely.

As a proof of concept, I tried to set up the nextcloud:latest image as a service in GitHub Actions and then run a curl request to verify that the service is up and running.

Unfortunately, the service does not seem to be reachable right away, even though some kind of “health check” happens according to the logs.

Initial draft (full log):

...
2021-10-28T19:24:31.2060340Z ##[group]Waiting for all services to be ready 
2021-10-28T19:24:31.2070399Z ##[command]/usr/bin/docker inspect --format="{{if .Config.Healthcheck}}{{print .State.Health.Status}}{{end}}" 
...
2021-10-28T19:24:31.3793056Z ##[group]Run curl -v -X GET '***localhost:8080/ocs/v2.php/cloud/capabilities?format=json' -H 'OCS-APIRequest: true' | jq
...
2021-10-28T19:24:31.4868404Z * Recv failure: Connection reset by peer

If I add a sleep 10s before performing the curl request, everything works as expected.

After adding a 10-second sleep (full log):

...
2021-10-28T19:25:00.2658709Z ##[group]Waiting for all services to be ready 
2021-10-28T19:25:00.2667342Z ##[command]/usr/bin/docker inspect --format="{{if .Config.Healthcheck}}{{print .State.Health.Status}}{{end}}" 
...
2021-10-28T19:25:00.4272714Z ##[group]Run sleep 10s
...
2021-10-28T19:25:10.4842066Z ##[group]Run curl -v -X GET '***localhost:8080/ocs/v2.php/cloud/capabilities?format=json' -H 'OCS-APIRequest: true' | jq
...
2021-10-28T19:25:11.3002955Z < HTTP/1.1 200 OK
...
2021-10-28T19:25:11.3068824Z { 
2021-10-28T19:25:11.3069250Z "ocs": { 
2021-10-28T19:25:11.3069695Z "meta": { 
2021-10-28T19:25:11.3070182Z "status": "ok", 
2021-10-28T19:25:11.3070717Z "statuscode": 200, 
2021-10-28T19:25:11.3071247Z "message": "OK" 
2021-10-28T19:25:11.3071726Z },
...

This might be related to this issue, but I am not sure.

The problem is that a fixed 10-second sleep is not deterministic: depending on the free resources of the runner, the instance may or may not be ready within that time. That’s why I am currently thinking about implementing a polling mechanism that checks every second whether the instance is running and only then continues with the other steps.

I am wondering whether I have misunderstood something here: does the health check not verify that the Nextcloud instance is actually running?

Any help is appreciated. I can offer free vouchers for Nextcloud Deck Android and Nextcloud Notes Android on the Play Store as a bounty.

Update: This is the complete workflow file I used to generate the logs above:
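
A minimal sketch of such a polling step, assuming curl is available on the runner. The function name wait_for_http and its default values are made up for illustration, and the endpoint is the capabilities URL from the workflow in this post:

```shell
# Hypothetical helper: poll a URL until curl gets a successful response
# or the maximum number of attempts is used up.
#   $1 = URL to poll
#   $2 = maximum number of attempts (assumed default: 30)
#   $3 = delay between attempts in seconds (assumed default: 1)
wait_for_http() {
  local url="$1"
  local max_attempts="${2:-30}"
  local delay="${3:-1}"
  local attempt=1

  while [ "$attempt" -le "$max_attempts" ]; do
    # --fail makes curl exit non-zero on HTTP errors (4xx/5xx);
    # connection resets also produce a non-zero exit code.
    if curl --silent --fail --output /dev/null -H 'OCS-APIRequest: true' "$url"; then
      echo "Ready after ${attempt} attempt(s)"
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done

  echo "Not ready after ${max_attempts} attempt(s)" >&2
  return 1
}
```

Used in a run step before the real request, it could look like this: wait_for_http 'http://Test:Test@localhost:8080/ocs/v2.php/cloud/capabilities?format=json' 30 1. The step then fails with a clear message if the instance never comes up, instead of depending on a fixed sleep.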

jobs:
  setup_nextcloud:
    runs-on: ubuntu-latest
    name: Setup Nextcloud
    services:
      nextcloud:
        image: nextcloud:latest
        env:
          SQLITE_DATABASE: db.sqlite
          NEXTCLOUD_ADMIN_USER: Test
          NEXTCLOUD_ADMIN_PASSWORD: Test
        ports:
          - 8080:80
    steps:
#      - name: Wait for Nextcloud instance
#        run: sleep 10s
      - name: Fetch capabilities
        continue-on-error: true
        run: |
          curl -v -X GET 'http://Test:Test@localhost:8080/ocs/v2.php/cloud/capabilities?format=json' -H 'OCS-APIRequest: true' | jq

stefan-niedermann · October 30, 2021, 3:14pm

My workaround for now is to use the docker health-cmd option. I am deliberately not marking this as the accepted solution, because it is a workaround and not really the solution I would imagine… So further ideas are welcome.

jobs:
  setup_nextcloud:
    runs-on: ubuntu-latest
    name: Run e2e test
    services:
      nextcloud:
        image: nextcloud:latest
        env:
          SQLITE_DATABASE: db.sqlite
          NEXTCLOUD_ADMIN_USER: Test
          NEXTCLOUD_ADMIN_PASSWORD: Test
        ports:
          - 8080:80
        options: >-
          --health-cmd "curl -f -H 'OCS-APIRequest: true' 'http://Test:Test@localhost:80/ocs/v2.php/apps/serverinfo/api/v1/info' || exit 1"
          --health-interval 1s
          --health-timeout 2s
          --health-retries 10
          --health-start-period 3s