AIO installation adventures

hello,

i am trying to install a new nextcloud instance on a debian vm in kvm behind an opnsense firewall.

the vm is reachable via one-to-one NAT in opnsense (so it has its own external ip), and ports 80 & 443 are open to the public. the vm itself can make outgoing http/https connections.
docker is installed via apt & the docker repos.

installation command:

docker run \
--sig-proxy=false \
--name nextcloud-aio-mastercontainer \
--restart always \
--publish 80:80 \
--publish 8080:8080 \
--publish 8443:8443 \
--volume nextcloud_aio_mastercontainer:/mnt/docker-aio-config \
--volume /var/run/docker.sock:/var/run/docker.sock:ro \
--env NEXTCLOUD_DATADIR="/data/nextcloud" \
--env AIO_DISABLE_BACKUP_SECTION=true \
--env NEXTCLOUD_MEMORY_LIMIT=2048M \
--add-host $FQDN:$EXTERNALIP \
nextcloud/all-in-one:latest

the first thing i stumbled upon is that port 8080 - right after installation - is not http at all, it's https:

Bad Request

Your browser sent a request that this server could not understand.
Reason: You're speaking plain HTTP to an SSL-enabled server port.
Instead use the HTTPS scheme to access this URL, please.
Apache/2.4.57 (Unix) Server at localhost Port 8080

when i try port 8443 i get the aio password (or rather “got”, because all my attempts ended with me running into LE's “Duplicate Certificate Limit”; i will try again tomorrow).
i then can proceed to install the instance with $FQDN as domain and it seems to start up but nextcloud and (depending on that) apache will stay “Starting” forever (>1h).

docker logs of nextcloud:

+ '[' -f /dev-dri-group-was-added ']'
++ find /dev -maxdepth 1 -mindepth 1 -name dri
+ '[' -n '' ']'
+ set +x
Installing imagemagick via apk...

while trying to debug the issue i saw on the firewall that the containers were trying to communicate using their own ips but obviously couldn't get anywhere:

	BRproxy		2023-05-16T13:37:40	172.18.0.4:44190	104.154.207.153:443	tcp	reject anything else	
	BRproxy		2023-05-16T13:37:13	172.18.0.4:44190	104.154.207.153:443	tcp	reject anything else	
	BRproxy		2023-05-16T13:37:00	172.18.0.4:44190	104.154.207.153:443	tcp	reject anything else	
	BRproxy		2023-05-16T13:36:54	172.18.0.4:44190	104.154.207.153:443	tcp	reject anything else	
	BRproxy		2023-05-16T13:36:51	172.18.0.4:44190	104.154.207.153:443	tcp	reject anything else	
	BRproxy		2023-05-16T13:36:49	172.18.0.4:44190	104.154.207.153:443	tcp	reject anything else	
	BRproxy		2023-05-16T13:36:48	172.18.0.4:44190	104.154.207.153:443	tcp	reject anything else	
	BRproxy		2023-05-16T13:36:48	172.18.0.4:44190	104.154.207.153:443	tcp	reject anything else	
	BRproxy		2023-05-16T13:36:47	172.18.0.4:44190	104.154.207.153:443	tcp	reject anything else	

is this by design ? it seems very strange to me …

does anyone have an idea what i might be doing wrong?

tia,tja…

Indeed, this is documented, e.g. in the nextcloud/all-in-one repository on GitHub.

The container must be able to talk to itself. So it is normal that it tries to do so. Possibly allowing these requests in your firewall will speed up the initial startup.

hi,

thx for your fast response.

indeed, i misunderstood and overlooked that 8080 is ssl with a self-signed cert. thx for pointing that out. i can reach the aio UI now even without a valid LE-cert.

the second part (the containers trying to reach the net with their own ips) i should have worded more carefully.
this line from the firewall log (without all that fancy html etc) …

BRproxy		2023-05-16T13:37:40	172.18.0.4:44190	104.154.207.153:443

tells me that the container wants to reach 104.154.207.153 which is not the external ip of the nextcloud VM but some address of google (153.207.154.104.bc.googleusercontent.com.)
the firewall (this is another machine) blocks this as all of 172.18… is unknown to it.
i guess the traffic should be masqueraded over the nextcloud VMs ip.

just to be clear: the VM itself has no problem at all reaching this and the other addresses that 172.18… wanted to reach. also, the vm can reach itself via its internal and external ip.

so i’m not sure what you mean by:

The container must be able to talk to itself. So it is normal that it tries to do so. Possibly allowing these requests in your firewall will speed up the initial startup.

do you mean the container must be able to talk to itself inside the containers host (the VM) or via the outside i.e. the internet ?

tia,tja…

Via the outside internet or, if you run a local DNS server and configured it for your domain, via the internal network.

I am honestly not sure where this ip-address comes from.

hmm i still think we are talking past each other - pls bear with me.

1st:
172.18.xxx is a private lan address, so the container with its 172.18.xxx address could not possibly contact itself via the “internet” - at least not without masquerading on the outbound side of the host (=where the containers run) and port-forwarding/DNAT on the inbound side of the host.
surely we could agree on that ?

2nd:
there should never be any un-masqueraded traffic from the containers going outbound from the host - am i right or did i misunderstand the usual workings of docker ?
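
The first point can be checked with a minimal sketch using Python's standard `ipaddress` module. The two addresses are taken from the firewall log earlier in this thread; the helper name `is_publicly_routable` is hypothetical and only used for illustration.

```python
import ipaddress

def is_publicly_routable(addr: str) -> bool:
    """Return True if addr is a globally routable address (illustrative helper)."""
    return ipaddress.ip_address(addr).is_global

# Addresses taken from the firewall log in this thread.
container_ip = "172.18.0.4"          # source address seen by the firewall
destination_ip = "104.154.207.153"   # destination the container tried to reach

print(container_ip, "publicly routable:", is_publicly_routable(container_ip))
print(destination_ip, "publicly routable:", is_publicly_routable(destination_ip))

# 172.16.0.0/12 is one of the RFC 1918 private ranges.
private_range = ipaddress.ip_network("172.16.0.0/12")
print("container ip in 172.16.0.0/12:", ipaddress.ip_address(container_ip) in private_range)
```

The container address falls inside the private range 172.16.0.0/12, so a packet carrying it as its source address cannot get a reply from the internet unless something rewrites that address (masquerading) on the way out.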

tia,tja…

another puzzle:
an unconfigured aio - only started - will contact these 2 addresses, again via their container-ip:

172.18.0.10:53992 -> 151.101.246.133:443
172.18.0.4:54060 -> 153.207.154.104.bc.googleusercontent.com:443

it's hard to know what's behind the googleusercontent thing.
the other address belongs to fastly, inc.

tia,tja…

If it cannot contact itself, something is configured incorrectly in your network, as this must work in order for Nextcloud Office and Nextcloud Talk to work correctly.

hi szaimen,

its your statement that the container has to talk to itself:

The container must be able to talk to itself. So it is normal that it tries to do so. Possibly allowing these requests in your firewall will speed up the initial startup.

my reasoning is that - if that were the problem - it makes no sense that it tries to talk to itself via packets with private lan addresses routed via the external interface.

from my point of view these messages are obviously in error and should be debugged - but this is a completely different issue and only came up as i debugged “that the container had to be able to talk to itself”.

let's take a step back:

  • aio is installed on a plain debian vm with nothing else on it
  • docker is installed via dockers repo
  • the vm is on a private lan subnet behind a firewall and is reachable via 1-to-1 NAT (network address translation) on ports 80,443,8080,8443
  • all the network ports are tested before installing aio (via tcpdump etc)

i think it can't be much more basic than that.

i install with …

docker run \
--sig-proxy=false \
--name nextcloud-aio-mastercontainer \
--restart always \
--publish 80:80 \
--publish 8080:8080 \
--publish 8443:8443 \
--volume nextcloud_aio_mastercontainer:/mnt/docker-aio-config \
--volume /var/run/docker.sock:/var/run/docker.sock:ro \
--env NEXTCLOUD_DATADIR="/data/nextcloud" \
--env AIO_DISABLE_BACKUP_SECTION=true \
--env NEXTCLOUD_MEMORY_LIMIT=2048M \
--env SKIP_DOMAIN_VALIDATION=true \
--add-host $FQDN:$EXTERNALIP \
nextcloud/all-in-one:latest

… enter my domain, check all boxes except “Nextcloud Talk”, hit start and after some time the web page shows that all is running except apache and nextcloud which stay “Starting”.

viewing the logs of nextcloud-aio-nextcloud gives …

+ '[' -f /dev-dri-group-was-added ']'
++ find /dev -maxdepth 1 -mindepth 1 -name dri
+ '[' -n '' ']'
+ set +x
Installing imagemagick via apk...

… and that and the aio page will never change.

tia,tja…

I can only say that I cannot reproduce your problem. Everything works perfectly on my test instance. So I don't know what is causing this on your side.

I’m getting this as well. Nextcloud AIO v5.1.0 running on Synology DSM 7.2 via a “Project” docker compose file. Hangs on Installing imagemagick via apk... just like tja’s example. All the other containers are up, and waiting for the nextcloud-aio-nextcloud container to do something. I’m wondering if maybe the imagemagick repository is down or such? Would it not time out? Can somebody tell me this repository so I can check connectivity?

After thinking about this a bit, I think I had this once in the past as well when ipv6-support was partially enabled in the docker daemon. Following docker-ipv6-support.md in the nextcloud/all-in-one repository on GitHub could possibly resolve this.

I have all the IP v6 stuff commented out in the docker compose file.

I am using SKIP_DOMAIN_VALIDATION=true the same as @tja as I am expecting the reverse proxy will do all the certificate stuff. Not sure this would have any impact on it installing imagemagick though. Only commonality I can think of as we are on different platforms etc.

So "ipv6": false is set in your docker daemon?

idk. It’s disabled in the Synology GUI for the two Nextcloud networks.

I went in and deleted all the Imagemagick environment variable references and now it is running through the install, so I think there is definitely an issue with the installer connecting to the Imagemagick repository. Not sure how it’s going to run without it, but I’m treating this as a test system for now.

But did you also make sure that the docker daemon has ipv6 disabled?
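
One way to answer that question is to read the daemon configuration directly. The following is a minimal sketch, assuming the configuration lives in a JSON file; `/etc/docker/daemon.json` is the usual location on Linux, but the path on other platforms (a Synology NAS, for example) may differ, and the file may not exist at all. The function names are hypothetical.

```python
import json

def ipv6_setting(daemon_json_text: str):
    """Return the value of the "ipv6" key, or None if the key is not set."""
    if not daemon_json_text.strip():
        return None  # empty file: nothing configured
    return json.loads(daemon_json_text).get("ipv6")

def read_ipv6_setting(path: str = "/etc/docker/daemon.json"):
    """Read the file at path (assumed location, adjust as needed) and return its "ipv6" value."""
    with open(path, encoding="utf-8") as fh:
        return ipv6_setting(fh.read())

# Demonstration on inline samples instead of a real file.
print(ipv6_setting('{"ipv6": false}'))          # key explicitly disabled
print(ipv6_setting('{"log-driver": "json-file"}'))  # key absent
```

A result of `None` means the key is not present in the file, so the daemon falls back to its default for that option. Changes to the file only take effect after the docker daemon is restarted.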

Can you post the output of sudo netstat -tulpn | grep docker ?

Bit of a Docker newbie tbh and I don’t actually know where to find the docker daemon file on Synology. Here is the netstat output though:

tcp 0 0 0.0.0.0:9010 0.0.0.0:* LISTEN 2629/docker-proxy
tcp 0 0 0.0.0.0:8083 0.0.0.0:* LISTEN 2794/docker-proxy
tcp 0 0 0.0.0.0:5556 0.0.0.0:* LISTEN 21126/docker-proxy
tcp 0 0 0.0.0.0:11000 0.0.0.0:* LISTEN 7295/docker-proxy
tcp 0 0 0.0.0.0:8090 0.0.0.0:* LISTEN 24776/docker-proxy
tcp 0 0 0.0.0.0:8000 0.0.0.0:* LISTEN 2655/docker-proxy
tcp 0 0 0.0.0.0:3012 0.0.0.0:* LISTEN 2583/docker-proxy
tcp 0 0 0.0.0.0:3141 0.0.0.0:* LISTEN 2414/docker-proxy
tcp 0 0 0.0.0.0:3080 0.0.0.0:* LISTEN 2523/docker-proxy
tcp6 0 0 :::9010 :::* LISTEN 2635/docker-proxy
tcp6 0 0 :::8083 :::* LISTEN 2801/docker-proxy
tcp6 0 0 :::5556 :::* LISTEN 21133/docker-proxy
tcp6 0 0 :::11000 :::* LISTEN 7302/docker-proxy
tcp6 0 0 :::8090 :::* LISTEN 24783/docker-proxy
tcp6 0 0 :::8000 :::* LISTEN 2663/docker-proxy
tcp6 0 0 :::3012 :::* LISTEN 2591/docker-proxy
tcp6 0 0 :::3141 :::* LISTEN 2422/docker-proxy
tcp6 0 0 :::3080 :::* LISTEN 2530/docker-proxy

I did notice that both Apache and the domain validation container were trying to use port 11000 for some reason. The domain one died but I have it bypassed anyway.

Nextcloud made it a lot further, and is now telling me this in the log:

Current version is 25.0.6.

Update to Nextcloud 25.0.7 available. (channel: “beta”)
Following file will be downloaded automatically: https://download.nextcloud.com/server/releases/nextcloud-25.0.7.zip
Open changelog ↗

Updater run in non-interactive mode.

date

Info: Pressing Ctrl-C will finish the currently running step and then stops the updater.

Check for expected files …
] Check for expected files
Check for write permissions …
] Check for write permissions
Create backup …
] Create backup

tcp6 0 0 :::9010 :::* LISTEN 2635/docker-proxy

So apparently docker is indeed using ipv6. Can you disable ipv6 by setting "ipv6": false in the docker daemon.json ?
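
The same reading of the netstat output can be done mechanically. This is a minimal sketch, assuming the column layout shown in the output posted above (protocol, two queue counters, local address, foreign address, state, pid/program); the function name `ipv6_listeners` is hypothetical.

```python
def ipv6_listeners(netstat_output: str):
    """Return (port, process) pairs for tcp6 lines owned by docker-proxy."""
    found = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected layout: proto recv-q send-q local foreign state pid/program
        if len(fields) < 7 or fields[0] != "tcp6":
            continue
        if "docker-proxy" not in fields[-1]:
            continue
        # The local address looks like ":::9010"; the port follows the last colon.
        port = fields[3].rsplit(":", 1)[-1]
        found.append((int(port), fields[-1]))
    return found

# A few lines copied from the netstat output in this thread.
sample = """\
tcp 0 0 0.0.0.0:9010 0.0.0.0:* LISTEN 2629/docker-proxy
tcp6 0 0 :::9010 :::* LISTEN 2635/docker-proxy
tcp6 0 0 :::11000 :::* LISTEN 7302/docker-proxy
"""
print(ipv6_listeners(sample))
```

Note that this only shows that docker-proxy holds IPv6 wildcard sockets on the host. Whether that means ipv6 is partially enabled inside the docker networks is the inference made in the post above, not something the check proves by itself.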

If you don't want to disable ipv6 completely, please enable ipv6 correctly by following docker-ipv6-support.md in the nextcloud/all-in-one repository on GitHub.