Web login broken after upgrade to 19

Nextcloud version: 19.0.1
Operating system and version: Debian 10
Apache or nginx version: Apache 2.4.38-3+deb10u3
PHP version: 7.4.8

Logging in via the web interface doesn’t work anymore after upgrading Nextcloud from version 18 to 19. Before the upgrade everything worked fine.

The Android Nextcloud app and CalDAV clients can still access Nextcloud with the same credentials, I assume via its API.

The problem also occurs when trying to log in from a different computer, and even from an Android phone.

Nextcloud runs in a Docker container (image nextcloud:19.0.1) in a Kubernetes cluster. That was not a problem before the Nextcloud upgrade.

Is this the first time you’ve seen this error? : yes.

Steps to replicate it:

  1. Use a browser to access the Nextcloud web interface.
  2. Enter credentials in the login screen, and click “Log in”.
  3. The resulting page is the login screen again.

Some extra information:

The output of your Nextcloud log in Admin > Logging:
I cannot access those because of the login problem.
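
For anyone in the same situation: assuming the default file logging is in use (the config below sets no `log_type` or `logfile`), the same entries should also be written as one JSON object per line to `nextcloud.log` inside the data directory. The path and the helper name `entries_at_or_above` in this sketch are assumptions, not something I verified on the broken instance:

```python
import json

def entries_at_or_above(lines, min_level=2):
    """Parse JSON log lines and keep entries whose level is >= min_level."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(entry, dict) and entry.get("level", 0) >= min_level:
            kept.append(entry)
    return kept

if __name__ == "__main__":
    # Assumed path: default log file inside the data directory from config.php.
    with open("/var/www/html/data/nextcloud.log", encoding="utf-8") as log:
        for entry in entries_at_or_above(log):
            print(entry.get("time"), entry.get("app"), entry.get("message"))
```

The default `min_level=2` matches the `'loglevel' => 2` setting, so it shows warnings and anything more severe.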

/var/www/html/config/config.php:

<?php
$CONFIG = array (
  'htaccess.RewriteBase' => '/',
  'memcache.local' => '\\OC\\Memcache\\APCu',
  'apps_paths' => 
  array (
    0 => 
    array (
      'path' => '/var/www/html/apps',
      'url' => '/apps',
      'writable' => false,
    ),
    1 => 
    array (
      'path' => '/var/www/html/custom_apps',
      'url' => '/custom_apps',
      'writable' => true,
    ),
  ),
  'passwordsalt' => '[...]',
  'secret' => '[...]',
  'trusted_domains' => 
  array (
    0 => '[...]',
  ),
  'datadirectory' => '/var/www/html/data',
  'dbtype' => 'mysql',
  'version' => '19.0.1.1',
  'overwritehost' => '[...]',
  'overwriteprotocol' => 'https',
  'dbname' => 'nextcloud',
  'dbhost' => 'datni-mariadb',
  'dbport' => '',
  'dbtableprefix' => 'oc_',
  'instanceid' => '[...]',
  'mysql.utf8mb4' => true,
  'dbuser' => 'nextcloud',
  'dbpassword' => '[...]',
  'installed' => true,
  'ldapIgnoreNamingRules' => false,
  'ldapProviderFactory' => 'OCA\\User_LDAP\\LDAPProviderFactory',
  'maintenance' => false,
  'theme' => '',
  'loglevel' => 2,
);

Webserver logs of a login attempt:

10.42.0.208 - - [20/Sep/2020:20:21:20 +0000] "GET /csrftoken HTTP/1.1" 200 806 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0"
10.42.0.208 - - [20/Sep/2020:20:21:40 +0000] "POST /login HTTP/1.1" 200 4829 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0"
10.42.0.208 - - [20/Sep/2020:20:21:40 +0000] "GET /apps/theming/js/theming?v=5 HTTP/1.1" 200 955 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0"
10.42.0.208 - - [20/Sep/2020:20:21:40 +0000] "GET /apps/accessibility/js/accessibility?v=0 HTTP/1.1" 200 845 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0"
10.42.0.208 - - [20/Sep/2020:20:21:40 +0000] "GET /cron.php HTTP/1.1" 200 875 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0"

Interestingly, the POST to /login seems successful: it returns a 200 status code.

Unfortunately, this is still unsolved. For comparison, I set up a new Nextcloud instance with the same version and the same LDAP settings. That one works without problems, so something in the history of the broken instance makes it fail, perhaps the Nextcloud version upgrade.

Comparing the Apache logs of both instances, the working one gives a 303 response to the POST to /login, whereas the broken one gives a 200. I still don’t know why, though.
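
In case it helps someone do the same comparison, here is a minimal sketch of how the status of the login POST can be pulled out of an Apache access log. It assumes the log format shown above; the function name `login_post_statuses` is just something I made up for illustration:

```python
import re
import sys

# Matches the start of an Apache access log line:
# host ident user [time] "METHOD path protocol" status size ...
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def login_post_statuses(lines):
    """Return the HTTP status codes of all POST /login requests in the log lines."""
    statuses = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match["method"] == "POST" and match["path"] == "/login":
            statuses.append(int(match["status"]))
    return statuses

if __name__ == "__main__":
    # Read an access log from stdin and report each login attempt.
    for status in login_post_statuses(sys.stdin):
        # The working instance answered with a redirect (303),
        # the broken one rendered the login page again (200).
        kind = "redirect" if 300 <= status < 400 else "no redirect"
        print(status, kind)
```

Feeding it the access log of each instance (for example from `kubectl logs`) makes the 303 versus 200 difference easy to spot among the other requests.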

This was never really solved. I finally recovered by creating a new Nextcloud installation and manually copying all database contents and some files (user files, some app data) to the new installation.

Unconfirmed hypothesis: this problem is caused by an upgrade or migration process that was terminated prematurely by the pod’s liveness probe. For now I’ve disabled the liveness probe to prevent this problem with future updates. Let’s hope it never returns.