I have recently got a new Nextcloud server up and running using docker compose (and Portainer).
It seems to work fine, with no real problems. But one error repeats multiple times every day in the Nextcloud log. It is worded in a few different ways, and the Application column in the log varies between index, PHP and core. But the main message is the same:
Failed to connect to the database: An exception occurred in the driver: SQLSTATE[08006] [7] could not translate host name "postgres" to address: Temporary failure in name resolution
The “postgres” in the error is (I assume) the name of my postgres container.
I'm not sure how to go about solving this. Is this a DNS issue? Could it be something else? What can I do to find out what is causing this?
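One way to start narrowing this down is to count how often the error appears per hour, to see whether it clusters around a recurring event such as a backup run or a container restart. This is only a rough sketch: it assumes the log is the default nextcloud.log in the data directory, that each entry is a single JSON object on one line with an ISO 8601 "time" field, and dns_errors_per_hour is just an illustrative helper name.

```shell
# Count "could not translate host name" log entries per hour.
# Assumption: one JSON object per line, each with a field like
# "time":"2024-07-30T12:39:00+00:00".
dns_errors_per_hour() {
    logfile=$1
    grep -F 'could not translate host name' "$logfile" |
        # keep only date and hour, e.g. 2024-07-30T12
        sed -n 's/.*"time":"\([0-9-]*T[0-9][0-9]\).*/\1/p' |
        sort | uniq -c |
        # print "<date>T<hour> <count>"
        awk '{print $2, $1}'
}
```

It could be called as dns_errors_per_hour /mnt/nextcloud/data/nextcloud.log on the Docker host, if that is where the data directory is mounted.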
The whole Nextcloud instance, and all of its containers, are in the same Portainer stack. Two networks are used. nextcloud-network and traefik-network. I created them using this command:
docker network create network-name
My containers : network(s):
collabora : nextcloud-network, traefik-network
nextcloud: nextcloud-network, traefik-network
nextcloud-backups: nextcloud-network
postgres: nextcloud-network
redis: nextcloud-network
traefik: traefik-network
Any obvious error or mistake there? If someone can help, I'd be grateful.
Please let me know if more info is needed.
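The container-to-network mapping listed above can be cross-checked against what Docker itself reports. A minimal sketch, assuming the container names match the list (container_networks is only an illustrative helper name, not an existing command):

```shell
# Print the networks each named container is attached to,
# one "container: network network ..." line per container.
container_networks() {
    for name in "$@"; do
        nets=$(docker inspect \
            --format '{{range $net, $cfg := .NetworkSettings.Networks}}{{$net}} {{end}}' \
            "$name")
        # drop the trailing space left by the template
        nets=${nets% }
        printf '%s: %s\n' "$name" "$nets"
    done
}
```

For example: container_networks collabora nextcloud nextcloud-backups postgres redis traefik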
jtr
July 30, 2024, 12:39pm
2
What Docker Engine version?
Please also post your Compose file.
Sure, sudo docker version returns this:
docker@docker:~$ sudo docker version
[sudo] password for docker:
Client: Docker Engine - Community
 Version:           27.0.3
 API version:       1.46
 Go version:        go1.21.11
 Git commit:        7d4bcd8
 Built:             Sat Jun 29 00:02:33 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          27.0.3
  API version:      1.46 (minimum version 1.24)
  Go version:       go1.21.11
  Git commit:       662f78c
  Built:            Sat Jun 29 00:02:33 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.18
  GitCommit:        ae71819c4f5e67bb4d5ae76a6b735f29cc25774e
 runc:
  Version:          1.1.13
  GitCommit:        v1.1.13-0-g58aa920
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
My compose file is based on the one from here: https://www.heyvaldemar.com/install-nextcloud-using-docker-compose.
I have edited the above, so my whole compose file, or Portainer stack as I use it, looks like this now (I have removed comments and tried to obscure values. I have variables loaded in Portainer stack, originally imported from .env file):
networks:
  nextcloud-network:
    external: true
  traefik-network:
    external: true
volumes:
  nextcloud-data:
  nextcloud-config:
  redis-data:
  nextcloud-postgres:
  nextcloud-postgres-backup:
  nextcloud-data-backups:
  nextcloud-database-backups:
  traefik-certificates:
services:
  postgres:
    container_name: postgres
    image: ${NEXTCLOUD_POSTGRES_IMAGE_TAG}
    volumes:
      - nextcloud-postgres:/var/lib/postgresql/data
    environment:
      TZ: ${NEXTCLOUD_TIMEZONE}
      POSTGRES_DB: ${NEXTCLOUD_DB_NAME}
      POSTGRES_USER: ${NEXTCLOUD_DB_USER}
      POSTGRES_PASSWORD: ${NEXTCLOUD_DB_PASSWORD}
    networks:
      - nextcloud-network
    healthcheck:
      test: [ "CMD", "pg_isready", "-q", "-d", "${NEXTCLOUD_DB_NAME}", "-U", "${NEXTCLOUD_DB_USER}" ]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 60s
    restart: unless-stopped
  redis:
    environment:
      TZ: ${NEXTCLOUD_TIMEZONE}
    image: ${NEXTCLOUD_REDIS_IMAGE_TAG}
    container_name: redis
    command: ["redis-server", "--requirepass", "$NEXTCLOUD_REDIS_PASSWORD"]
    volumes:
      - redis-data:/data
    networks:
      - nextcloud-network
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 60s
    restart: unless-stopped
  nextcloud:
    # hostname: ${NEXTCLOUD_HOSTNAME}
    image: ${NEXTCLOUD_IMAGE_TAG}
    container_name: nextcloud
    volumes:
      - nextcloud-data:${DATA_PATH}
      - nextcloud-config:${DATA_PATH}
      - /mnt/nextcloud/data:/var/www/html/data
      - /mnt/nextcloud/config:/var/www/html/config
    environment:
      TZ: ${NEXTCLOUD_TIMEZONE}
      POSTGRES_HOST: postgres
      DB_PORT: 5432
      POSTGRES_DB: ${NEXTCLOUD_DB_NAME}
      POSTGRES_USER: ${NEXTCLOUD_DB_USER}
      POSTGRES_PASSWORD: ${NEXTCLOUD_DB_PASSWORD}
      REDIS_HOST: redis
      REDIS_HOST_PORT: 6379
      REDIS_HOST_PASSWORD: ${NEXTCLOUD_REDIS_PASSWORD}
      NEXTCLOUD_ADMIN_USER: ${NEXTCLOUD_ADMIN_USERNAME}
      NEXTCLOUD_ADMIN_PASSWORD: ${NEXTCLOUD_ADMIN_PASSWORD}
      NEXTCLOUD_TRUSTED_DOMAINS: ${NEXTCLOUD_HOSTNAME}
      TRUSTED_PROXIES: ${TRAEFIK_IP}
      OVERWRITECLIURL: ${NEXTCLOUD_URL}
      OVERWRITEPROTOCOL: https
      OVERWRITEHOST: ${NEXTCLOUD_HOSTNAME}
    networks:
      - nextcloud-network
      - traefik-network
    expose:
      - "443"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:80/"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 90s
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.nextcloud.rule=Host(`${NEXTCLOUD_HOSTNAME}`)"
      - "traefik.http.routers.nextcloud.service=nextcloud"
      - "traefik.http.routers.nextcloud.entrypoints=websecure"
      - "traefik.http.services.nextcloud.loadbalancer.server.port=80"
      - "traefik.http.routers.nextcloud.tls=true"
      - "traefik.http.routers.nextcloud.tls.certresolver=letsencrypt"
      - "traefik.http.services.nextcloud.loadbalancer.passhostheader=true"
      - "traefik.http.routers.nextcloud.middlewares=compresstraefik"
      - "traefik.http.middlewares.compresstraefik.compress=true"
      - "traefik.docker.network=traefik-network"
      #HSTS
      - "traefik.http.routers.nextcloud.middlewares=nextcloudHeader"
      - "traefik.http.middlewares.nextcloudHeader.headers.stsSeconds=15552000"
      - "traefik.http.middlewares.nextcloudHeader.headers.stsIncludeSubdomains=true"
      - "traefik.http.middlewares.nextcloudHeader.headers.stsPreload=true"
      - "traefik.http.middlewares.nextcloudHeader.headers.forceSTSHeader=true"
    restart: unless-stopped
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
      traefik:
        condition: service_healthy
  nextcloud-collabora:
    image: collabora/code:latest
    container_name: collabora
    restart: unless-stopped
    ports:
      - 127.0.0.1:9980:9980
    expose:
      - "9980"
    environment:
      #should work as "domain=cloud1\.nextcloud\.com|cloud2\.nextcloud\.com"
      - TZ=${NEXTCLOUD_TIMEZONE}
      - domain=${COLLABORA_DOMAIN}
      - aliasgroup1=cloud.example.com
      - 'dictionaries=en_US,se_SE'
      - VIRTUAL_PROTO=http
      - VIRTUAL_PORT=9980
      - VIRTUAL_HOST=${COLLABORA_HOSTNAME}
      - username=${COLLABORA_USERNAME}
      - password=${COLLABORA_PASSWORD}
      - "extra_params=--o:ssl.enable=false --o:ssl.termination=true"
    networks:
      - nextcloud-network
      - traefik-network
    cap_add:
      - MKNOD
    tty: true
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=traefik-network"
      - "traefik.http.routers.collabora.rule=Host(`office.example.com`)"
      - "traefik.http.routers.collabora.entrypoints=web"
      - "traefik.http.middlewares.collabora-https-redirect.redirectscheme.scheme=https"
      - "traefik.http.routers.collabora.middlewares=collabora-https-redirect"
      - "traefik.http.routers.collabora-secure.entrypoints=websecure"
      - "traefik.http.routers.collabora-secure.rule=Host(`office.example.com`)"
      - "traefik.http.routers.collabora-secure.tls=true"
      - "traefik.http.routers.collabora-secure.tls.certresolver=letsencrypt"
  traefik:
    image: ${TRAEFIK_IMAGE_TAG}
    container_name: traefik
    environment:
      TZ: ${NEXTCLOUD_TIMEZONE}
    command:
      - "--log.level=${TRAEFIK_LOG_LEVEL}"
      - "--accesslog=true"
      - "--api.dashboard=true"
      - "--api.insecure=true"
      - "--ping=true"
      - "--ping.entrypoint=ping"
      - "--entryPoints.ping.address=:8082"
      - "--entryPoints.web.address=:80"
      - "--entryPoints.websecure.address=:443"
      - "--providers.docker=true"
      - "--providers.docker.endpoint=unix:///var/run/docker.sock"
      - "--providers.docker.exposedByDefault=false"
      - "--certificatesresolvers.letsencrypt.acme.tlschallenge=true"
      # - "--certificatesresolvers.letsencrypt.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory"
      - "--certificatesresolvers.letsencrypt.acme.email=${TRAEFIK_ACME_EMAIL}"
      - "--certificatesresolvers.letsencrypt.acme.storage=/etc/traefik/acme/acme.json"
      - "--metrics.prometheus=true"
      - "--metrics.prometheus.buckets=0.1,0.3,1.2,5.0"
      - "--global.checkNewVersion=true"
      - "--global.sendAnonymousUsage=false"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - traefik-certificates:/etc/traefik/acme
    networks:
      - traefik-network
    ports:
      - "80:80"
      - "8081:8080"
      - "443:443"
    healthcheck:
      test: ["CMD", "wget", "http://localhost:8082/ping","--spider"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 5s
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.dashboard.rule=Host(`${TRAEFIK_HOSTNAME}`)"
      - "traefik.http.routers.dashboard.service=api@internal"
      - "traefik.http.routers.dashboard.entrypoints=websecure"
      - "traefik.http.services.dashboard.loadbalancer.server.port=8080"
      - "traefik.http.routers.dashboard.tls=true"
      - "traefik.http.routers.dashboard.tls.certresolver=letsencrypt"
      - "traefik.http.services.dashboard.loadbalancer.passhostheader=true"
      - "traefik.http.routers.dashboard.middlewares=authtraefik"
      - "traefik.http.middlewares.authtraefik.basicauth.users=${TRAEFIK_BASIC_AUTH}"
      - "traefik.http.routers.http-catchall.rule=HostRegexp(`{host:.+}`)"
      - "traefik.http.routers.http-catchall.entrypoints=web"
      - "traefik.http.routers.http-catchall.middlewares=redirect-to-https"
      - "traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https"
    restart: unless-stopped
  backups:
    image: ${NEXTCLOUD_POSTGRES_IMAGE_TAG}
    container_name: nextcloud-backups
    command: >-
      sh -c 'sleep $BACKUP_INIT_SLEEP &&
      while true; do
      pg_dump -h postgres -p 5432 -d $NEXTCLOUD_DB_NAME -U $NEXTCLOUD_DB_USER | gzip > $POSTGRES_BACKUPS_PATH/$POSTGRES_BACKUP_NAME-$(date "+%Y-%m-%d_%H-%M").gz &&
      tar -zcpf $DATA_BACKUPS_PATH/$DATA_BACKUP_NAME-$(date "+%Y-%m-%d_%H-%M").tar.gz $DATA_PATH &&
      find $POSTGRES_BACKUPS_PATH -type f -mtime +$POSTGRES_BACKUP_PRUNE_DAYS | xargs rm -f &&
      find $DATA_BACKUPS_PATH -type f -mtime +$DATA_BACKUP_PRUNE_DAYS | xargs rm -f;
      sleep $BACKUP_INTERVAL; done'
    volumes:
      - nextcloud-postgres-backup:/var/lib/postgresql/data
      - nextcloud-data:${DATA_PATH}
      - nextcloud-data-backups:${DATA_BACKUPS_PATH}
      - nextcloud-database-backups:${POSTGRES_BACKUPS_PATH}
    environment:
      TZ: ${NEXTCLOUD_TIMEZONE}
      NEXTCLOUD_DB_NAME: ${NEXTCLOUD_DB_NAME}
      NEXTCLOUD_DB_USER: ${NEXTCLOUD_DB_USER}
      PGPASSWORD: ${NEXTCLOUD_DB_PASSWORD}
      BACKUP_INIT_SLEEP: ${BACKUP_INIT_SLEEP}
      BACKUP_INTERVAL: ${BACKUP_INTERVAL}
      POSTGRES_BACKUP_PRUNE_DAYS: ${POSTGRES_BACKUP_PRUNE_DAYS}
      DATA_BACKUP_PRUNE_DAYS: ${DATA_BACKUP_PRUNE_DAYS}
      POSTGRES_BACKUPS_PATH: ${POSTGRES_BACKUPS_PATH}
      DATA_BACKUPS_PATH: ${DATA_BACKUPS_PATH}
      DATA_PATH: ${DATA_PATH}
      POSTGRES_BACKUP_NAME: ${POSTGRES_BACKUP_NAME}
      DATA_BACKUP_NAME: ${DATA_BACKUP_NAME}
    networks:
      - nextcloud-network
    restart: unless-stopped
    depends_on:
      postgres:
        condition: service_healthy
wwe
July 31, 2024, 11:32am
4
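I don’t see the problem in your compose file. A few recommendations, but nothing serious:
you don’t really need an external network nextcloud-network, as Compose itself creates an internal network {project}_default. But this is only a small improvement and not the cause of your problem…
this looks like you mount data and config twice? (The named volumes nextcloud-data and nextcloud-config both target ${DATA_PATH}, in addition to the bind mounts for /var/www/html/data and /var/www/html/config.)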
I think you have to troubleshoot your Docker installation:
first check whether the postgres container maybe restarts from time to time for some reason: use docker compose ps to see the uptime of the containers, and docker logs postgres to see the postgres container logs
review the containers using docker inspect {container name} (double-check whether something weird is in "Dns": [], "DnsOptions": [], "DnsSearch": [], "ExtraHosts": [])
review the networks with docker network inspect nextcloud-network to verify the container is always connected to the network…
continuously run docker compose exec nextcloud getent hosts postgres (or docker-compose for old Compose versions) to verify DNS resolution for “postgres” in the “nextcloud” container. If it fails or changes, this could provide some hints…
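The restart check and the continuous DNS check above could be scripted roughly like this. It is only a sketch under a few assumptions: container_uptime and watch_dns are made-up helper names, the containers are addressed by their container_name values from the compose file, and plain docker exec is used in place of docker compose exec so that it does not depend on being run from the Compose project directory.

```shell
# Print restart count and last start time for each named container.
container_uptime() {
    for name in "$@"; do
        docker inspect \
            --format '{{.Name}} restarts={{.RestartCount}} started={{.State.StartedAt}}' \
            "$name"
    done
}

# Resolve HOST from inside CONTAINER every INTERVAL seconds, COUNT times,
# printing one timestamped OK/FAIL line per attempt.
# Defaults: every 5 seconds, 720 attempts (about one hour).
watch_dns() {
    container=$1
    host=$2
    interval=${3:-5}
    count=${4:-720}
    i=0
    while [ "$i" -lt "$count" ]; do
        # getent exits non-zero when the name cannot be resolved
        if addr=$(docker exec "$container" getent hosts "$host" 2>/dev/null); then
            status="OK $addr"
        else
            status="FAIL"
        fi
        printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$status"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, container_uptime postgres nextcloud followed by watch_dns nextcloud postgres 5 720 >> dns-watch.log. The timestamps of any FAIL lines can then be compared with the times of the errors in the Nextcloud log.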
This really looks like some great pointers for me to dive into. I am away from this stuff for a while now, but will look into this when I get back next week. I will get back with results and probably more questions. Thank you so much for now!