Apache Service Won't Start after Running Certbot

mark3 · March 15, 2020, 3:32pm

Nextcloud version: 18.0.1
Operating system and version: Ubuntu 18.04
Apache or nginx version: Apache 2.4.29
PHP version: 7.2.7

Hi, when I first set up NextCloud, I downloaded the VirtualBox appliance file. For a long time I just used a self-signed cert for SSL, but I got annoyed with always having to bypass the security screen whenever the cert wasn’t installed locally. I finally decided to get a cert from Let’s Encrypt. To do this, I followed these steps using Certbot: https://certbot.eff.org/lets-encrypt/ubuntubionic-nginx (Before starting, I put NextCloud into maintenance mode.)

After running sudo certbot --nginx, I got exception messages saying that nginx could not be restarted.

Certbot Exceptions

2020-03-15 12:49:51,673:DEBUG:certbot.error_handler:Encountered exception:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/certbot/auth_handler.py", line 75, in handle_authorizations
    resp = self._solve_challenges(aauthzrs)
  File "/usr/lib/python3/dist-packages/certbot/auth_handler.py", line 139, in _solve_challenges
    resp = self.auth.perform(all_achalls)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1071, in perform
    self.restart()
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 881, in restart
    nginx_restart(self.conf('ctl'), self.nginx_conf)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1141, in nginx_restart
    "nginx restart failed:\n%s\n%s" % (out.read(), err.read()))
certbot.errors.MisconfigurationError: nginx restart failed:
b''
b''

2020-03-15 12:49:51,673:DEBUG:certbot.error_handler:Calling registered functions
2020-03-15 12:49:51,673:INFO:certbot.auth_handler:Cleaning up challenges
2020-03-15 12:49:54,387:ERROR:certbot.error_handler:Encountered exception during recovery: 
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/certbot/auth_handler.py", line 75, in handle_authorizations
    resp = self._solve_challenges(aauthzrs)
  File "/usr/lib/python3/dist-packages/certbot/auth_handler.py", line 139, in _solve_challenges
    resp = self.auth.perform(all_achalls)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1071, in perform
    self.restart()
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 881, in restart
    nginx_restart(self.conf('ctl'), self.nginx_conf)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1141, in nginx_restart
    "nginx restart failed:\n%s\n%s" % (out.read(), err.read()))
certbot.errors.MisconfigurationError: nginx restart failed:
b''
b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/certbot/error_handler.py", line 108, in _call_registered
    self.funcs[-1]()
  File "/usr/lib/python3/dist-packages/certbot/auth_handler.py", line 323, in _cleanup_challenges
    self.auth.cleanup(achalls)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1090, in cleanup
    self.restart()
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 881, in restart
    nginx_restart(self.conf('ctl'), self.nginx_conf)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1141, in nginx_restart
    "nginx restart failed:\n%s\n%s" % (out.read(), err.read()))
certbot.errors.MisconfigurationError: nginx restart failed:
b''
b''
2020-03-15 12:49:54,408:DEBUG:certbot.log:Exiting abnormally:
Traceback (most recent call last):
  File "/usr/bin/certbot", line 11, in <module>
    load_entry_point('certbot==0.31.0', 'console_scripts', 'certbot')()
  File "/usr/lib/python3/dist-packages/certbot/main.py", line 1365, in main
    return config.func(config, plugins)
  File "/usr/lib/python3/dist-packages/certbot/main.py", line 1119, in run
    certname, lineage)
  File "/usr/lib/python3/dist-packages/certbot/main.py", line 121, in _get_and_save_cert
    lineage = le_client.obtain_and_enroll_certificate(domains, certname)
  File "/usr/lib/python3/dist-packages/certbot/client.py", line 410, in obtain_and_enroll_certificate
    cert, chain, key, _ = self.obtain_certificate(domains)
  File "/usr/lib/python3/dist-packages/certbot/client.py", line 353, in obtain_certificate
    orderr = self._get_order_and_authorizations(csr.data, self.config.allow_subset_of_names)
  File "/usr/lib/python3/dist-packages/certbot/client.py", line 389, in _get_order_and_authorizations
    authzr = self.auth_handler.handle_authorizations(orderr, best_effort)
  File "/usr/lib/python3/dist-packages/certbot/auth_handler.py", line 75, in handle_authorizations
    resp = self._solve_challenges(aauthzrs)
  File "/usr/lib/python3/dist-packages/certbot/auth_handler.py", line 139, in _solve_challenges
    resp = self.auth.perform(all_achalls)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1071, in perform
    self.restart()
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 881, in restart
    nginx_restart(self.conf('ctl'), self.nginx_conf)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1141, in nginx_restart
    "nginx restart failed:\n%s\n%s" % (out.read(), err.read()))
certbot.errors.MisconfigurationError: nginx restart failed:
b''
b''
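
The traceback above comes from certbot_nginx, Certbot's nginx plugin, while the version details at the top of this post list Apache 2.4.29. A minimal sketch of picking the Certbot flag that matches the web server actually in use is shown here. The helper name `certbot_flag_for` is made up for illustration, and running the result with `--apache` assumes Certbot's Apache plugin is installed.

```shell
#!/bin/sh
# Map a web server service name to the matching Certbot plugin flag.
# `certbot_flag_for` is a hypothetical helper, not part of Certbot.
certbot_flag_for() {
  case "$1" in
    apache2|httpd) printf '%s\n' "--apache" ;;
    nginx)         printf '%s\n' "--nginx" ;;
    *)
      printf 'unknown web server: %s\n' "$1" >&2
      return 1
      ;;
  esac
}

# Example on a live system (not run here):
#   sudo certbot "$(certbot_flag_for apache2)"
```

On a server like the one described here, that would mean `sudo certbot --apache` rather than `sudo certbot --nginx`.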

I don’t really know much about the inner workings of NextCloud or nginx, so I took NextCloud out of maintenance mode and rebooted the server, hoping that would help. It didn’t. Now I can’t even get Apache running. I don’t know how, but I feel like I royally screwed something up and may need to do a fresh install of NextCloud (honestly, I’d be okay with this as long as I can transfer my settings and files).

Logs

The config.php file in /path/to/nextcloud:

<?php
$CONFIG = array (
  'passwordsalt' => '[salt]',
  'secret' => '[secret]',
  'trusted_domains' => 
  array (
    0 => '192.168.1.2',
    1 => 'mysite.com'
  ),
  'datadirectory' => '/mnt/ncdata',
  'overwrite.cli.url' => 'https://nextcloud/',
  'dbtype' => 'pgsql',
  'version' => '18.0.1.3',
  'dbname' => 'nextcloud_db',
  'dbhost' => 'localhost',
  'dbport' => '',
  'dbtableprefix' => 'oc_',
  'dbuser' => 'ncadmin',
  'dbpassword' => '[password]',
  'installed' => true,
  'instanceid' => 'oc25e565hl6o',
  'mail_smtpmode' => 'smtp',
  'log_rotate_size' => '10485760',
  'memcache.local' => '\\OC\\Memcache\\Redis',
  'filelocking.enabled' => true,
  'memcache.distributed' => '\\OC\\Memcache\\Redis',
  'memcache.locking' => '\\OC\\Memcache\\Redis',
  'redis' => 
  array (
    'host' => '/var/run/redis/redis-server.sock',
    'port' => 0,
    'timeout' => 0.5,
    'dbindex' => 0,
    'password' => '[password]',
  ),
  'htaccess.RewriteBase' => '/',
  'loglevel' => 0,
  'log_type' => 'file',
  'logfile' => '/mnt/ncdata/nextcloud.log',
  'logtimezone' => 'Europe/Stockholm',
  'enable_previews' => true,
  'preview_libreoffice_path' => '/usr/bin/libreoffice',
  'maintenance' => false,
  'theme' => '',
);

The output of /var/log/apache2/error.log:

[Sun Mar 15 06:25:18.936193 2020] [ssl:warn] [pid 2139] AH01909: 127.0.1.1:443:0 server certificate does NOT include an ID which matches the server name
[Sun Mar 15 06:25:18.936245 2020] [http2:warn] [pid 2139] AH10034: The mpm module (prefork.c) is not supported by mod_http2. The mpm determines how things are processed in your server. HTTP/2 has more demands in this regard and the currently selected mpm will just not do. This is an advisory warning. Your server will continue to work, but the HTTP/2 protocol will be inactive.
[Sun Mar 15 06:25:18.980557 2020] [mpm_prefork:notice] [pid 2139] AH00163: Apache/2.4.29 (Ubuntu) OpenSSL/1.1.0g configured -- resuming normal operations
[Sun Mar 15 06:25:18.980566 2020] [core:notice] [pid 2139] AH00094: Command line: '/usr/sbin/apache2'
[Sun Mar 15 12:59:46.768864 2020] [mpm_prefork:notice] [pid 2139] AH00169: caught SIGTERM, shutting down

The output of /var/log/nginx/error.log:

2020/03/15 12:49:49 [notice] 10676#10676: signal process started
2020/03/15 12:49:49 [error] 10676#10676: invalid PID number "" in "/run/nginx.pid"
2020/03/15 12:49:49 [emerg] 10677#10677: bind() to 0.0.0.0:80 failed (98: Address already in use)
2020/03/15 12:49:49 [emerg] 10677#10677: bind() to [::]:80 failed (98: Address already in use)
2020/03/15 12:49:49 [emerg] 10677#10677: bind() to 0.0.0.0:80 failed (98: Address already in use)
2020/03/15 12:49:49 [emerg] 10677#10677: bind() to [::]:80 failed (98: Address already in use)
2020/03/15 12:49:49 [emerg] 10677#10677: bind() to 0.0.0.0:80 failed (98: Address already in use)
2020/03/15 12:49:49 [emerg] 10677#10677: bind() to [::]:80 failed (98: Address already in use)
2020/03/15 12:49:49 [emerg] 10677#10677: bind() to 0.0.0.0:80 failed (98: Address already in use)
2020/03/15 12:49:49 [emerg] 10677#10677: bind() to [::]:80 failed (98: Address already in use)
2020/03/15 12:49:49 [emerg] 10677#10677: bind() to 0.0.0.0:80 failed (98: Address already in use)
2020/03/15 12:49:49 [emerg] 10677#10677: bind() to [::]:80 failed (98: Address already in use)
2020/03/15 12:49:49 [emerg] 10677#10677: still could not bind()
2020/03/15 12:49:51 [notice] 10678#10678: signal process started
2020/03/15 12:49:51 [error] 10678#10678: invalid PID number "" in "/run/nginx.pid"
2020/03/15 12:49:51 [emerg] 10679#10679: bind() to 0.0.0.0:80 failed (98: Address already in use)
2020/03/15 12:49:51 [emerg] 10679#10679: bind() to [::]:80 failed (98: Address already in use)
2020/03/15 12:49:51 [emerg] 10679#10679: bind() to 0.0.0.0:80 failed (98: Address already in use)
2020/03/15 12:49:51 [emerg] 10679#10679: bind() to [::]:80 failed (98: Address already in use)
2020/03/15 12:49:51 [emerg] 10679#10679: bind() to 0.0.0.0:80 failed (98: Address already in use)
2020/03/15 12:49:51 [emerg] 10679#10679: bind() to [::]:80 failed (98: Address already in use)
2020/03/15 12:49:51 [emerg] 10679#10679: bind() to 0.0.0.0:80 failed (98: Address already in use)
2020/03/15 12:49:51 [emerg] 10679#10679: bind() to [::]:80 failed (98: Address already in use)
2020/03/15 12:49:51 [emerg] 10679#10679: bind() to 0.0.0.0:80 failed (98: Address already in use)
2020/03/15 12:49:51 [emerg] 10679#10679: bind() to [::]:80 failed (98: Address already in use)
2020/03/15 12:49:51 [emerg] 10679#10679: still could not bind()
2020/03/15 13:55:20 [notice] 3881#3881: signal process started
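
The repeated "bind() to 0.0.0.0:80 failed (98: Address already in use)" lines suggest that another process was already listening on port 80 when nginx tried to start. One way to check which process that is might look like the sketch below. It assumes the `ss` tool from iproute2 is available and that the local address is the fourth column of `ss -ltnp` output. `listeners_on_port` is a made-up helper name.

```shell
#!/bin/sh
# Filter `ss -ltnp` output (read from stdin) down to listeners on one TCP port.
# Prints the local address and the last column, which holds the owning
# process when ss is run as root with -p.
listeners_on_port() {
  port="$1"
  # Column 4 is the local address, e.g. 0.0.0.0:80 or [::]:80.
  # The "$" anchor keeps port 80 from also matching 8080.
  awk -v p=":$port" '$4 ~ (p "$") { print $4, $NF }'
}

# Example on a live system (not run here):
#   sudo ss -ltnp | listeners_on_port 80
```

If the process shown is apache2, nginx lost the race for the port. If it is nginx, the reverse is true.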

The output of /var/log/syslog:

...
Mar 15 13:00:57 nextcloud systemd[1]: Started OpenBSD Secure Shell server.
Mar 15 13:00:57 nextcloud apport[1873]:    ...done.
Mar 15 13:00:57 nextcloud systemd[1]: Started LSB: automatic crash report generation.
Mar 15 13:00:57 nextcloud iscsid: iSCSI daemon with pid=1862 started!
Mar 15 13:00:58 nextcloud systemd[1]: nginx.service: Failed to parse PID from file /run/nginx.pid: Invalid argument
Mar 15 13:00:58 nextcloud systemd[1]: Started A high performance web server and a reverse proxy server.
Mar 15 13:01:00 nextcloud apachectl[1870]: AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message
Mar 15 13:01:01 nextcloud apachectl[1870]: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:80
Mar 15 13:01:01 nextcloud apachectl[1870]: (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80
Mar 15 13:01:01 nextcloud apachectl[1870]: no listening sockets available, shutting down
Mar 15 13:01:01 nextcloud apachectl[1870]: AH00015: Unable to open logs
Mar 15 13:01:01 nextcloud systemd[1]: apache2.service: Control process exited, code=exited status=1
Mar 15 13:01:01 nextcloud apachectl[1870]: Action 'start' failed.
Mar 15 13:01:01 nextcloud apachectl[1870]: The Apache error log may have more information.
Mar 15 13:01:01 nextcloud systemd[1]: apache2.service: Failed with result 'exit-code'.
Mar 15 13:01:01 nextcloud systemd[1]: Failed to start The Apache HTTP Server.
Mar 15 13:01:01 nextcloud fail2ban-server[1740]: Server ready
...
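
In the syslog excerpt above, systemd reports nginx as started at 13:00:58, and three seconds later Apache fails to bind to port 80. That ordering suggests nginx now starts at boot and claims the port first. A hedged sketch of handing the port back to Apache follows. It assumes the services are named `nginx` and `apache2`, as in the log, and that nginx is not otherwise needed on this machine. The helper name is made up, and by default it only prints the commands.

```shell
#!/bin/sh
# Sketch: free port 80 for Apache by stopping and disabling nginx.
# By default this only prints the commands; set RUN=1 to execute them with sudo.
# `free_port_80_for_apache` is a hypothetical helper name.
free_port_80_for_apache() {
  for cmd in "systemctl stop nginx" "systemctl disable nginx" "systemctl start apache2"; do
    if [ "${RUN:-0}" = "1" ]; then
      # Intentionally unquoted so the string splits into command and arguments.
      sudo $cmd || return 1
    else
      printf '%s\n' "$cmd"
    fi
  done
}

# Dry run:   free_port_80_for_apache
# For real:  RUN=1 free_port_80_for_apache
```

Printing first makes it easy to review the three steps before anything on the server changes.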

apache2.conf

DefaultRuntimeDir ${APACHE_RUN_DIR}

PidFile ${APACHE_PID_FILE}

Timeout 300

KeepAlive On

MaxKeepAliveRequests 100

KeepAliveTimeout 5


# These need to be set in /etc/apache2/envvars
User ${APACHE_RUN_USER}
Group ${APACHE_RUN_GROUP}

HostnameLookups Off

ErrorLog ${APACHE_LOG_DIR}/error.log

LogLevel warn

# Include module configuration:
IncludeOptional mods-enabled/*.load
IncludeOptional mods-enabled/*.conf

# Include list of ports to listen on
Include ports.conf


<Directory />
	Options FollowSymLinks
	AllowOverride None
	Require all denied
</Directory>

<Directory /usr/share>
	AllowOverride None
	Require all granted
</Directory>

<Directory /var/www/>
	Options Indexes FollowSymLinks
	AllowOverride None
	Require all granted
</Directory>

#<Directory /srv/>
#	Options Indexes FollowSymLinks
#	AllowOverride None
#	Require all granted
#</Directory>


AccessFileName .htaccess

<FilesMatch "^\.ht">
	Require all denied
</FilesMatch>


LogFormat "%v:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %O" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent


# Include generic snippets of statements
IncludeOptional conf-enabled/*.conf

# Include the virtual host configurations:
IncludeOptional sites-enabled/*.conf

# vim: syntax=apache ts=4 sw=4 sts=4 sr noet

Running apache2ctl -S gives the following, which makes it look like Apache is still trying to use my old self-signed cert:

AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message
VirtualHost configuration:
*:80                   is a NameVirtualHost
                default server 127.0.1.1 (/etc/apache2/sites-enabled/000-default.conf:1)
                port 80 namevhost 127.0.1.1 (/etc/apache2/sites-enabled/000-default.conf:1)
                port 80 namevhost 127.0.1.1 (/etc/apache2/nextcloud_http_domain_self_signed.conf:1)
*:443                  127.0.1.1 (/etc/apache2/sites-enabled/nextcloud_ssl_domain_self_signed.conf:1)
ServerRoot: "/etc/apache2"
Main DocumentRoot: "/var/www/html"
Main ErrorLog: "/var/log/apache2/error.log"
Mutex watchdog-callback: using_defaults
Mutex rewrite-map: using_defaults
Mutex ssl-stapling-refresh: using_defaults
Mutex ssl-stapling: using_defaults
Mutex proxy: using_defaults
Mutex ssl-cache: using_defaults
Mutex default: dir="/var/run/apache2" mechanism=default
Mutex mpm-accept: using_defaults
PidFile: "/var/run/apache2/apache2.pid"
Define: DUMP_VHOSTS
Define: DUMP_RUN_CFG
User: name="www-data" id=33
Group: name="www-data" id=33
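
To see which certificate each virtual host actually points at, the config file paths can be pulled out of the `apache2ctl -S` output and searched for the `SSLCertificateFile` directive. The sketch below assumes the vhost files live under /etc/apache2 and end in .conf, as in the output above. `vhost_files` is a made-up helper name.

```shell
#!/bin/sh
# Extract the unique vhost config file paths from `apache2ctl -S` output
# read on stdin (paths look like "(/etc/apache2/...conf:1)").
vhost_files() {
  grep -o '/etc/apache2/[^:)]*\.conf' | LC_ALL=C sort -u
}

# Example on a live system (not run here):
#   apache2ctl -S 2>/dev/null | vhost_files | xargs grep -H 'SSLCertificateFile'
```

Since the Certbot run aborted before installing anything, it would not be surprising if these files still reference the self-signed certificate.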

Trying to start Apache by running apache2 -k start results in this message:

AH00111: Config variable ${APACHE_RUN_DIR} is not defined
apache2: Syntax error on line 80 of /etc/apache2/apache2.conf: DefaultRuntimeDir must be a valid directory, absolute or relative to ServerRoot

I don’t know why it’s coming back as not defined, because it is set in the envvars file. Maybe it has something to do with the SUFFIX?

envvars


# envvars - default environment variables for apache2ctl

# this won't be correct after changing uid
unset HOME

# for supporting multiple apache2 instances
if [ "${APACHE_CONFDIR##/etc/apache2-}" != "${APACHE_CONFDIR}" ] ; then
	SUFFIX="-${APACHE_CONFDIR##/etc/apache2-}"
else
	SUFFIX=
fi

# Since there is no sane way to get the parsed apache2 config in scripts, some
# settings are defined via environment variables and then used in apache2ctl,
# /etc/init.d/apache2, /etc/logrotate.d/apache2, etc.
export APACHE_RUN_USER=www-data
export APACHE_RUN_GROUP=www-data
# temporary state file location. This might be changed to /run in Wheezy+1
export APACHE_PID_FILE=/var/run/apache2$SUFFIX/apache2.pid
export APACHE_RUN_DIR=/var/run/apache2$SUFFIX
export APACHE_LOCK_DIR=/var/lock/apache2$SUFFIX
# Only /var/log/apache2 is handled by /etc/logrotate.d/apache2.
export APACHE_LOG_DIR=/var/log/apache2$SUFFIX

## The locale used by some modules like mod_dav
export LANG=C
## Uncomment the following line to use the system default locale instead:
#. /etc/default/locale

export LANG
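
As the first comment in the file says, envvars holds the default environment variables for apache2ctl, so calling the apache2 binary directly leaves ${APACHE_RUN_DIR} unset unless the file is sourced first. The SUFFIX branch is there for multi-instance setups (per the file's own comment) and evaluates to an empty string when APACHE_CONFDIR is not set. A small sketch to check the value the file would produce follows. `show_run_dir` is a made-up helper name, and the default path assumes the Ubuntu layout shown above.

```shell
#!/bin/sh
# Source an envvars file in a subshell and print the APACHE_RUN_DIR it defines.
# The subshell keeps the exported variables (and the `unset HOME`) out of the
# calling shell. `show_run_dir` is a hypothetical helper name.
show_run_dir() {
  envfile="${1:-/etc/apache2/envvars}"
  ( . "$envfile" && printf '%s\n' "$APACHE_RUN_DIR" )
}

# Example on a live system (not run here):
#   show_run_dir
# To start Apache by hand with the variables defined:
#   sudo sh -c '. /etc/apache2/envvars && apache2 -k start'
```

Using apache2ctl or the service wrapper avoids the problem, which would explain why sudo service apache2 start gets past AH00111 and fails only on the port 80 bind.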

When I try to start Apache with sudo service apache2 start, I get this:

Job for apache2.service failed because the control process exited with error code.
See "systemctl status apache2.service" and "journalctl -xe" for details.

Both of these reveal the same information:

"systemctl status apache2.service" and "journalctl -xe"

* apache2.service - The Apache HTTP Server
   Loaded: loaded (/lib/systemd/system/apache2.service; enabled; vendor preset: enabled)
  Drop-In: /lib/systemd/system/apache2.service.d
           `-apache2-systemd.conf
   Active: failed (Result: exit-code) since Sun 2020-03-15 16:07:50 CET; 5min ago
  Process: 4063 ExecStart=/usr/sbin/apachectl start (code=exited, status=1/FAILURE)

Mar 15 16:07:50 nextcloud apachectl[4063]: AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message
Mar 15 16:07:50 nextcloud apachectl[4063]: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:80
Mar 15 16:07:50 nextcloud apachectl[4063]: (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80
Mar 15 16:07:50 nextcloud apachectl[4063]: no listening sockets available, shutting down
Mar 15 16:07:50 nextcloud apachectl[4063]: AH00015: Unable to open logs
Mar 15 16:07:50 nextcloud apachectl[4063]: Action 'start' failed.
Mar 15 16:07:50 nextcloud apachectl[4063]: The Apache error log may have more information.
Mar 15 16:07:50 nextcloud systemd[1]: apache2.service: Control process exited, code=exited status=1
Mar 15 16:07:50 nextcloud systemd[1]: apache2.service: Failed with result 'exit-code'.
Mar 15 16:07:50 nextcloud systemd[1]: Failed to start The Apache HTTP Server.

I’m at a complete loss as to how to get my NextCloud server back up and running. Any help would be greatly appreciated.

mark3 · March 17, 2020, 4:27am

I ended up just setting up a new server and migrating my data and database to it. This seemed to work. I’m not sure what running Certbot did to break everything, but I figured it was easier to migrate than to debug it.