Bad Gateway for Collabora with Nextcloud deployed via docker-compose

I am trying to deploy Nextcloud using docker-compose, with Traefik acting as the reverse proxy and automatic LetsEncrypt TLS certificate installer. So far, the Nextcloud server itself and the certificate generation are all working smoothly. However, I am unable to get a Collabora server online. The docker-compose config file below brings up both Nextcloud and Collabora servers, and I can even visit the Collabora server URL at https://collabora.ups.example.com and see a valid TLS cert in my browser. Unfortunately the body of the page is “Bad Gateway”, which usually means something is not working at the reverse proxy level for connecting ports, but I cannot figure out how to correct the configuration. Any help would be appreciated.

version: '3'

networks:
  web:
    external: true
  internal:
    external: false

services:
  reverse-proxy:
    image: traefik:v2.0
    container_name: traefik
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /opt/traefik/traefik.yaml:/etc/traefik/traefik.yaml
      - /opt/traefik/config.yaml:/etc/traefik/config.yaml
      - /opt/traefik/acme.json:/etc/traefik/acme.json
    networks:
      - web
  cloud_ups_example_com_db:
    image: mariadb
    container_name: cloud_ups_example_com_db 
    command: --transaction-isolation=READ-COMMITTED --binlog-format=ROW
    restart: always
    volumes:
      - /opt/upscloud.example.com/db:/var/lib/mysql
    environment:
      - MYSQL_ROOT_PASSWORD=57gCFuR4fTNCsQps
      - MYSQL_PASSWORD=J34NnXeUNdoXWgLp
      - MYSQL_DATABASE=nextcloud
      - MYSQL_USER=nextcloud
    networks:
      - internal 
  cloud_ups_example_com_app:
    image: nextcloud
    container_name: cloud_ups_example_com_app
    restart: always
    environment:
      - VIRTUAL_HOST=upscloud.example.com
      - MYSQL_HOST=cloud_ups_example_com_db
    depends_on:
      - cloud_ups_example_com_db
    volumes:
      - /opt/upscloud.example.com/www:/var/www/html
    ports:
      - "80"
    networks:
      - web
      - internal 
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=web"
      - "traefik.http.routers.cloud_ups_example_com_app-http.rule=Host(`upscloud.example.com`)"
      - "traefik.http.routers.cloud_ups_example_com_app-http.entrypoints=http"
      - "traefik.http.routers.cloud_ups_example_com_app-http.middlewares=default-http@file"
      - "traefik.http.routers.cloud_ups_example_com_app.entrypoints=https"
      - "traefik.http.routers.cloud_ups_example_com_app.rule=Host(`upscloud.example.com`)"
      - "traefik.http.routers.cloud_ups_example_com_app.tls=true"
      - "traefik.http.routers.cloud_ups_example_com_app.tls.certresolver=letsencrypt"
  ups_collabora:
    image: collabora/code
    container_name: ups_collabora
    networks:
      - web
    cap_add:
      - MKNOD
    expose:
      - "9980"
    environment:
      - domain=upscloud.example.com
      - username=admin
      - password=password
      - VIRTUAL_PROTO=http
      - VIRTUAL_PORT=9980
      - VIRTUAL_HOST=collabora.ups.example.com
    restart: unless-stopped 
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=web"
      - "traefik.http.routers.ups_collabora-http.rule=Host(`collabora.ups.example.com`)"
      - "traefik.http.routers.ups_collabora-http.entrypoints=http"
      - "traefik.http.routers.ups_collabora-http.middlewares=default-http@file"
      - "traefik.http.routers.ups_collabora.entrypoints=https"
      - "traefik.http.routers.ups_collabora.rule=Host(`collabora.ups.example.com`)"
      - "traefik.http.routers.ups_collabora.tls=true"
      - "traefik.http.routers.ups_collabora.tls.certresolver=letsencrypt"


I hope this Ansible playbook is helpful. Ansible's Docker tasks are similar to docker-compose definitions, so you should be able to read them.

I see that you expose port 9980. That’s not necessary. You configure your Nextcloud web server (nginx in my case) as the reverse proxy for Collabora.

Thanks for the ideas @Reiner_Nippes but I think this is more of a Traefik-specific problem. In my case, Traefik acts as the reverse proxy for containerized services like Nextcloud. For example, port 80 on the Nextcloud instance in my docker-compose file shown is not directly accessed from the outside; instead, Traefik listens on 80 and 443 and forwards to the Nextcloud container port 80. The same should be happening for the Collabora service, but tunneling 443 to 9980. I can’t tell where the disconnect is happening.

@andrew Where do you define that Traefik should route traffic to port 9980 of the Collabora container, and why do you expose port 9980 to the world?

The expose key in the docker-compose.yml file only exposes the port to linked services. It is not exposed to the world. This is the kind of thing that makes this stuff so confusing, right? In this case the linked service is Traefik via network web. By default, Traefik uses the first exposed port of a container to make a route from port 80/443 to the service using those “router” definitions in the labels.

I am suspicious that for some reason the Collabora instance is not listening on port 9980. First I should start this with docker run explicitly mapping 9980 to host port 9980 and make sure the container is running properly without any complications with docker-compose and Traefik.

I verified that the Collabora server is online and working when Traefik is not involved (docker-compose config file labels commented out), exposing container port 9980 to host port 9980 and specifying the :9980 suffix in the server URL in the settings page /settings/admin/richdocuments. A self-signed certificate was used for this test. So the bad gateway is very likely due to a misconfigured Traefik v2.0 route.
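
A minimal sketch of how the two paths could be compared from the command line. The helper name probe_discovery is hypothetical, and it assumes curl is installed. A 200 when talking to the container directly combined with a 502 through Traefik would point at the proxy route rather than at Collabora itself.

```shell
# probe_discovery BASE_URL
# Prints the HTTP status code returned by the Collabora discovery endpoint.
# -k skips certificate verification, which a self-signed test setup needs.
probe_discovery() {
  curl -sk -o /dev/null -w '%{http_code}' "${1%/}/hosting/discovery"
}

# Usage (hostnames and ports are the ones used in this thread):
#   probe_discovery https://localhost:9980              # container directly
#   probe_discovery https://collabora.ups.example.com   # through Traefik
```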

Slowly circling in on the problem here. My current configuration (below) should be correct for Collabora itself. I can see the XML output at https://collabora.example.com/hosting/discovery. In the Nextcloud settings the URL is set to https://collabora.example.com. The problem seems to be that the Nextcloud instance thinks it is http instead of https, and so I get an error when I save the Collabora URL due to this “mismatch”. Traefik is ensuring that all traffic to the Nextcloud and Collabora instances is TLS encrypted, but those servers themselves are operating in http mode. When I try to load a document, the Nextcloud page is attempting to load http://collabora.example.com/ resources (like javascript files). I need to find a way to get these URLs to start with https.

collabora:
    image: collabora/code
    container_name: collabora_app
    networks:
      - web
      - internal
    cap_add:
      - MKNOD
    expose:
      - "9980"
    environment:
      - domain=cloud.example.com
      - username=admin
      - password=pass
      - extra_params="--o:ssl.enable=false"
    restart: unless-stopped
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=web"
      - "traefik.http.routers.collabora-http.rule=Host(`collabora.example.com`)"
      - "traefik.http.routers.collabora-http.entrypoints=http"
      - "traefik.http.routers.collabora-http.middlewares=default-http@file"
      - "traefik.http.routers.collabora.entrypoints=https"
      - "traefik.http.routers.collabora.rule=Host(`collabora.example.com`)"
      - "traefik.http.routers.collabora.tls=true"
      - "traefik.http.routers.collabora.tls.certresolver=letsencrypt"

Hello Andrew,

I have the same issue with the same configuration and I have not found a solution.
Why do you say that the protocol used is http? In my case, the Nextcloud log shows:

GuzzleHttp\Exception\ServerException: Server error: GET https://collabora.example.com/hosting/discovery resulted in a 502 Bad Gateway response: Bad Gateway

When I use http and port 9980, I get Collabora’s XML page. So I think it’s Traefik that returns the 502, but I don’t know why…
Do you have any logs from Traefik? I don’t see any errors in Traefik…

Can you share your configuration here? It doesn’t sound like yours is exactly the same as mine, because otherwise you would not get the “bad gateway” error. I think that problem is due to TLS termination. When you let Traefik handle all the TLS certificate generation and termination with those magical labels, the container never sees encrypted traffic or any TLS handshake offers. If you don’t disable TLS in the containerized service, it will expect a TLS handshake on connections that Traefik has already decrypted, and you get “bad gateway” errors. This is why my updated config uses extra_params="--o:ssl.enable=false".

I am more confident that the problem is in the Nextcloud configuration as I described earlier. When I open an Open Office document in the web interface, it tries to load the Collabora editor but the browser blocks the loading of the http://collabora.example.com javascript. Somehow I need to get Nextcloud to make that URL begin with https.

Until I learn how to do this properly (certainly someone else is running Collabora in Nextcloud behind Traefik???), I’m using the following hack to get my Collabora instance working with Nextcloud. Since Traefik already obtained the LetsEncrypt TLS certificate, I am using that cert/key pair and letting Collabora handle the TLS termination. Here’s my “working” config:

  collabora:
    image: collabora/code
    container_name: collabora_app
    networks:
      - web
      - internal
    cap_add:
      - MKNOD
    ports:
      - "9980:9980"
    environment:
      - domain=cloud.example.com
      - username=admin
      - password=pass
    volumes:
      - /opt/cloud.example.com/cert.crt:/etc/loolwsd/cert.pem
      - /opt/cloud.example.com/cert.key:/etc/loolwsd/key.pem
    restart: always

Traefik is no longer handling routing for this container. The cert/key files are mounted where Collabora expects them to be. To generate the cert.crt and cert.key files, I manually ran base64 --decode on the relevant strings in the Traefik acme.json file:

      ...
      {
        "domain": {
          "main": "collabora.example.com"
        },
        "certificate": "LS...tCg==",
        "key": "LS0tLS...0K",
        "Store": "default"
      }
      ...
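
The manual decoding step could be wrapped in a small helper along these lines. This is only a sketch: the function name decode_b64_to_file is hypothetical, and the strings still have to be copied out of acme.json by hand, as described above.

```shell
# decode_b64_to_file BASE64_STRING OUTPUT_FILE
# Decodes one base64 string (as stored in acme.json) into a PEM file.
decode_b64_to_file() {
  printf '%s' "$1" | base64 --decode > "$2"
}

# Usage: pass the full "certificate" and "key" strings from acme.json.
# The values here are the elided placeholders from the excerpt above.
#   decode_b64_to_file "LS...tCg==" /opt/cloud.example.com/cert.crt
#   decode_b64_to_file "LS0tLS...0K" /opt/cloud.example.com/cert.key
```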

Then I set the Collabora server URL in the Nextcloud admin setting page to be https://collabora.example.com:9980.

The flaws here are that (1) I will have to periodically switch back to Traefik control to renew the LetsEncrypt certificate and manually recreate those certificate files, and (2) I have to expose host port 9980 unnecessarily.

There are cert dumper containers for Traefik.

Andrew: I have this configuration.

version: '3'

services:
  collabora:
    image: collabora/code
    restart: always
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=traefik_traefik"
      - "traefik.http.routers.collabora.rule=Host(`collabora.example.com`)"
      - "traefik.http.routers.collabora.entrypoints=http"
      - "traefik.http.middlewares.collabora-https-redirect.redirectscheme.scheme=https"
      - "traefik.http.routers.collabora.middlewares=collabora-https-redirect"
      - "traefik.http.routers.collabora-secure.entrypoints=https"
      - "traefik.http.routers.collabora-secure.rule=Host(`collabora.example.com`)"
      - "traefik.http.routers.collabora-secure.tls=true"
      - "traefik.http.routers.collabora-secure.tls.certresolver=default"
      - "traefik.http.services.collabora.loadbalancer.server.port=9980"
    expose:
      - 9980
    cap_add:
      - MKNOD
    ports:
      - 9980:9980
    environment:
      - "extra_params=--o:ssl.enable=false"
      - "domain=nextcloud\\.example\\.com"
      - username=admin
      - password=ilovemypassword
      - VIRTUAL_PROTO=http
      - VIRTUAL_PORT=9980
      - VIRTUAL_HOST=collabora.example.com
    networks:
      - traefik_traefik

@Typo_13 I think what is happening here is that you are mapping the Collabora container port 9980 to host port 9980. That bypasses Traefik, which explains why you can access the hosting/discovery endpoint over plain http (because of extra_params=--o:ssl.enable=false) on port 9980. When you do not specify 9980, Traefik redirects port 80 requests to port 443 and handles the TLS termination. What I don’t understand is why Traefik does not then correctly direct the traffic to Collabora container port 9980.

Your config looks the same as mine except for that loadbalancer label. If you comment that out and rebuild, are you able to access https://collabora.example.com/hosting/discovery like I was?

@andrew I finally found a valid configuration. I deleted the loadbalancer label and added --o:ssl.termination=true:

collabora:
  image: collabora/code
  restart: always
  labels:
    - "traefik.enable=true"
    - "traefik.docker.network=traefik_traefik"
    - "traefik.http.routers.collabora.rule=Host(`collabora.example.com`)"
    - "traefik.http.routers.collabora.entrypoints=http"
    - "traefik.http.middlewares.collabora-https-redirect.redirectscheme.scheme=https"
    - "traefik.http.routers.collabora.middlewares=collabora-https-redirect"
    - "traefik.http.routers.collabora-secure.entrypoints=https"
    - "traefik.http.routers.collabora-secure.rule=Host(`collabora.example.com`)"
    - "traefik.http.routers.collabora-secure.tls=true"
    - "traefik.http.routers.collabora-secure.tls.certresolver=default"
  expose:
    - 9980
  cap_add:
    - MKNOD
  environment:
    - "extra_params=--o:ssl.enable=false  --o:ssl.termination=true"
    - domain=nextcloud.example.com
    - username=admin
    - password=pass
  networks:
    - traefik_traefik

Can you try this configuration?

I’m glad you got yours working. In fact I tried that same thing (without mentioning it earlier), but what I get in my Collabora logs is

$ docker logs collabora_app -f
...
wsd-00034-00040 2020-01-16 20:23:51.837183 [ websrv_poll ] ERR  Socket #20 SSL BIO error: error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request (0: Success)| ./net/SslSocket.hpp:291
wsd-00034-00040 2020-01-16 20:23:51.837638 [ websrv_poll ] ERR  Error while handling poll for socket #20 in websrv_poll: error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request| ./net/Socket.hpp:594

By searching for this error I found this discussion: Socket Error when accessing Collabora. For some reason I do not understand, the extra_params environment variable is ignored by my container! By using a volume mount for loolwsd.xml, as described in that post, instead of the environment variable, I finally have a working configuration, where Traefik handles TLS certificate renewals and HTTPS and makes the Collabora server accessible at a standard URL https://collabora.example.com.

Thanks for your help @Typo_13 !

For reference for others frustrated by this configuration, here is my collabora docker-compose config:

collabora:
  image: collabora/code
  container_name: collabora_app
  networks:
    - web
  cap_add:
    - MKNOD
  expose:
    - 9980
  environment:
    - domain=cloud.example.com
    - username=admin
    - password=pass
  volumes:
    - /opt/cloud.example.com/loolwsd.xml:/etc/loolwsd/loolwsd.xml
  restart: always
  labels:
    - "traefik.enable=true"
    - "traefik.docker.network=web"
    - "traefik.http.routers.ups_collabora-http.rule=Host(`collabora.ups.example.com`)"
    - "traefik.http.routers.ups_collabora-http.entrypoints=http"
    - "traefik.http.routers.ups_collabora-http.middlewares=default-http@file"
    - "traefik.http.routers.ups_collabora.entrypoints=https"
    - "traefik.http.routers.ups_collabora.rule=Host(`collabora.ups.example.com`)"
    - "traefik.http.routers.ups_collabora.tls=true"
    - "traefik.http.routers.ups_collabora.tls.certresolver=letsencrypt"

and here are the modified lines in loolwsd.xml (also I had to chown 105:106 loolwsd.xml):

...
    <ssl desc="SSL settings">
        <enable type="bool" desc="Controls whether SSL encryption between browser and loolwsd is enabled (do not disable for production deployment). If default is false, must first be compiled with SSL support to enable." default="true">false</enable>
        <termination desc="Connection via proxy where loolwsd acts as working via https, but actually uses http." type="bool" default="true">true</termination>
...

Note: To get loolwsd.xml I copied the default from the running container before including the volume in the docker-compose config:

docker cp collabora_app:/etc/loolwsd/loolwsd.xml loolwsd.xml
chown 105:106 loolwsd.xml
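
The two edits to loolwsd.xml could also be applied with sed instead of by hand. This is a rough sketch, assuming the copied file has the same layout as the excerpt above; the function name patch_loolwsd is hypothetical. It rewrites every enable and termination element inside any ssl block, so if your version of the file contains more than one ssl block, check the result with diff before mounting it.

```shell
# patch_loolwsd INPUT_XML OUTPUT_XML
# Inside <ssl ...> ... </ssl> blocks only: set <enable> to false and
# <termination> to true. Everything else is copied through unchanged.
patch_loolwsd() {
  sed \
    -e '/<ssl /,/<\/ssl>/ s|\(<enable [^>]*>\)[^<]*\(</enable>\)|\1false\2|' \
    -e '/<ssl /,/<\/ssl>/ s|\(<termination [^>]*>\)[^<]*\(</termination>\)|\1true\2|' \
    "$1" > "$2"
}

# Usage:
#   patch_loolwsd loolwsd.xml loolwsd.patched.xml
#   diff loolwsd.xml loolwsd.patched.xml
```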

@Typo_13 I can’t believe the timing of this announcement after going through all of this :man_facepalming:

A bit late to the thread, but @andrew there are at least three people trying to do this, so thanks to you and @Typo_13 as well. I also found that I couldn’t get the extra_params environment variables to pass correctly through the docker compose file, and followed @andrew’s steps to resolve (there is some more detail here: https://www.collaboraoffice.com/code/docker/).

One thing I had to poke around at for a while was getting the loolwsd.xml to have the correct ownership on the host (it wasn’t 105:106 for me). To do that, start the container (with no volumes specified – comment that out), then

sudo docker cp collabora:/etc/loolwsd/loolwsd.xml ./collabora/loolwsd.xml
sudo docker exec -it collabora /bin/bash
ls -al /etc/loolwsd
id -u lool
id -g lool
exit

The last four lines will be executed in a bash shell on the container: check the path where the loolwsd.xml file is supposed to live, check the user and group IDs of the user lool (assuming that’s who owns that file – same user and group for me) and exit. For me, the IDs were 101:101 so next I

$ sudo chown 101:101 collabora/loolwsd.xml
$ sudo docker stop collabora
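
Since the IDs differ between image versions (105:106 earlier in this thread, 101:101 here), the lookup and the chown could be combined so nothing is hard-coded. A sketch only: the function name fix_config_owner is hypothetical, it assumes the container is still running and that the lool user owns the file inside it, and it may need to be run as root.

```shell
# fix_config_owner CONTAINER HOST_FILE
# Looks up the uid and gid of the lool user inside the running container
# and applies them to the copy of loolwsd.xml on the host.
fix_config_owner() {
  uid="$(docker exec "$1" id -u lool)" || return 1
  gid="$(docker exec "$1" id -g lool)" || return 1
  chown "$uid:$gid" "$2"
}

# Usage (run before stopping the container):
#   fix_config_owner collabora ./collabora/loolwsd.xml
```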

In my case, I made a folder collabora to put the loolwsd.xml file in, so my docker compose file has the line

volumes:
    - "./collabora/loolwsd.xml:/etc/loolwsd/loolwsd.xml"

My full docker compose is

version: "3.7"

services:
    collabora:
        image: "collabora/code"
        container_name: "collabora"
        restart: unless-stopped
        env_file:
            - .env
        environment:
            - "domain=cloud\\.somedomain\\.com"
            - "username=administrator"
            - "password=$COLLABORA_PASSWORD"
        volumes:
            - "./collabora/loolwsd.xml:/etc/loolwsd/loolwsd.xml"
        cap_add:        # https://nextcloud.com/collaboraonline/
            - MKNOD
        expose:
            - "9980"
        labels:
            - "traefik.enable=true"
            - "traefik.http.routers.collabora.entrypoints=websecure"
            - "traefik.http.routers.collabora.rule=Host(`office.$MY_DOMAIN`)"
            - "traefik.http.routers.collabora.tls.certresolver=lets-encr"

networks:
    default:
        external:
            name: $DEFAULT_NETWORK

When I restarted the container, I could check https://office.somedomain.com/hosting/discovery and see that all the URLs were https. Also, https://office.somedomain.com/loleaflet/dist/admin/admin.html should take you to the admin portal where you can put in your password.
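
The check that all discovery URLs are https could be scripted roughly as follows. This is a sketch that assumes the discovery XML carries its URLs in urlsrc attributes; the function name count_insecure_urls is hypothetical. A result of 0 means no plain http URLs are being advertised to Nextcloud.

```shell
# count_insecure_urls
# Reads discovery XML on stdin and prints how many urlsrc attributes
# still start with http:// instead of https://.
count_insecure_urls() {
  { grep -o 'urlsrc="http://[^"]*"' || true; } | wc -l | tr -d ' '
}

# Usage:
#   curl -s https://office.somedomain.com/hosting/discovery | count_insecure_urls
```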

Configuring it in NC18, I just put the URL in for the Collabora Online Server (admin > Settings > Collabora Online) as https://office.somedomain.com with no port. Everything else is unchecked. Clicking apply updated the label in the settings to Collabora Online Development Edition after a while. When I signed out and signed in as a user, documents were there!
