Large multi-file upload via web gui causes unwanted logouts

I wanted to open a bug report on GitHub, but I’m not 100% sure what this is yet, so I’ll start here.

Nextcloud version: 16.0.4
Hardware: Rock64 (4GB RAM) specs on pine64.org
Operating system and version: DietPi v6.25.3 (Debian stretch | aarch64)
Apache or nginx version: nginx/1.10.3
PHP version: 7.3.8

The setup

The Rock64 board provides various services to the internal network and to the web via name.tld, with a TLS certificate from Let’s Encrypt. Nginx acts as the surface layer. Nextcloud is served via cloud.name.tld.
/ is on a microSD card.
A USB 3.0 SSD is attached for file storage, plus an additional USB HDD for periodic self-mirroring backups.
For larger amounts of data it NFS-mounts a QNAP NAS drive, which sits next to it on the Gbit LAN.
For the two USB drives an externally powered USB hub was necessary to keep the board stable (especially during boot), but ever since then it has been running beautifully smoothly, and even with everything running it still has a generous reserve of RAM (it only needs ~330 MB in normal operation).

But:

The issue you are facing:

Uploading a larger folder (test subject: 7.42GB of photos and videos) via drag and drop into the web interface has a very high chance of failure - after a variable amount of time the login of the uploading user seems to become invalid and the log gets spammed with the messages below. At the same time sessions of the same user on other devices may or may not get terminated and even sessions of other users as well. The behavior is pretty erratic and inconsistent but seems to correlate with I/O load on the device or network interface.
The rate of failure seems to be higher when uploading to a local type external storage (which is the NFS mounted NAS drive). So far one single time (out of >20 attempts) the upload actually completed onto a folder in my regular data dir. (edit: make that 2 times)

Is this the first time you’ve seen this error?:

Well, I’ve seen it many many times now during 2 nights of attempted troubleshooting.

Steps to replicate it:

  1. Log in
  2. Drop a somewhat larger folder into the files app
  3. Wait…
    At some point an obscure message will appear, saying either that folder x doesn’t exist anymore or that it is locked, followed by a stream of countless ‘Cannot authenticate over ajax calls’ messages until the page reloads to the login screen

The output of your Nextcloud log in Admin > Logging:

(reverse chronological order, newest on top)

Debug	core	OC\AppFramework\Middleware\Security\Exceptions\NotLoggedInException: Current user is not logged in		2019-08-17T05:58:55+0200
Debug	core	OC\AppFramework\Middleware\Security\Exceptions\NotLoggedInException: Current user is not logged in		2019-08-17T05:58:55+0200
Debug	core	OC\AppFramework\Middleware\Security\Exceptions\NotLoggedInException: Current user is not logged in		2019-08-17T05:58:55+0200
Debug	webdav	Sabre\DAV\Exception\NotAuthenticated: Cannot authenticate over ajax calls		2019-08-17T05:57:11+0200
Debug	webdav	Sabre\DAV\Exception\NotAuthenticated: Cannot authenticate over ajax calls		2019-08-17T05:57:11+0200
....it goes on and on....
Debug	webdav	Sabre\DAV\Exception\NotAuthenticated: Cannot authenticate over ajax calls		2019-08-17T05:56:23+0200
Debug	webdav	Sabre\DAV\Exception\NotAuthenticated: Cannot authenticate over ajax calls		2019-08-17T05:56:23+0200
Debug	core	OC\AppFramework\Middleware\Security\Exceptions\NotLoggedInException: Current user is not logged in		2019-08-17T05:56:23+0200
Warning	core	Renewing session token failed		2019-08-17T05:56:23+0200
Debug	webdav	Sabre\DAV\Exception\NotAuthenticated: Cannot authenticate over ajax calls		2019-08-17T05:56:23+0200
Debug	webdav	Sabre\DAV\Exception\NotAuthenticated: Cannot authenticate over ajax calls		2019-08-17T05:56:23+0200
Debug	webdav	Sabre\DAV\Exception\NotAuthenticated: Cannot authenticate over ajax calls		2019-08-17T05:56:23+0200
Debug	webdav	Sabre\DAV\Exception\NotAuthenticated: Cannot authenticate over ajax calls		2019-08-17T05:56:22+0200
Debug	webdav	Sabre\DAV\Exception\NotAuthenticated: Cannot authenticate over ajax calls		2019-08-17T05:56:22+0200
Debug	webdav	Sabre\DAV\Exception\NotAuthenticated: Cannot authenticate over ajax calls		2019-08-17T05:56:22+0200
Debug	core	OC\AppFramework\Middleware\Security\Exceptions\NotLoggedInException: Current user is not logged in		2019-08-17T05:56:22+0200
Warning	core	Login failed: 'username' (Remote IP: 'x.x.x.x')		2019-08-17T05:56:21+0200
Debug	core	OC_Image->fixOrientation() Orientation: 1		2019-08-17T05:51:30+0200
Debug	core	OC_Image->fixOrientation() Orientation: 1		2019-08-17T05:51:30+0200
Debug	core	OC_Image->fixOrientation() Orientation: 1		2019-08-17T05:51:29+0200
Debug	core	OC_Image->fixOrientation() Orientation: 1		2019-08-17T05:51:28+0200
Debug	core	OC_Image->fixOrientation() Orientation: 6		2019-08-17T05:51:27+0200
Debug	core	OC_Image->fixOrientation() Orientation: 6		2019-08-17T05:51:25+0200
Debug	core	OC\AppFramework\Middleware\Security\Exceptions\NotLoggedInException: Current user is not logged in		2019-08-17T05:50:33+0200
Debug	core	OC\AppFramework\Middleware\Security\Exceptions\NotLoggedInException: Current user is not logged in		2019-08-17T05:50:33+0200
Debug	webdav	Sabre\DAV\Exception\NotAuthenticated: Cannot authenticate over ajax calls		2019-08-17T05:50:27+0200
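
Because the log above is newest-first and mostly Debug noise, it can help to pull out just the non-Debug entries and read them oldest-first, to see which warning came before the cascade. Here is a minimal Python sketch, assuming the entries are tab-separated exactly as pasted (level, app, message, timestamp); the helper names are made up for illustration, and the sample is only a handful of lines copied from above.

```python
from datetime import datetime

# A few entries copied from the log above, still newest-first.
# Assumed layout per line: level, app, message, (empty field), timestamp.
RAW_LOG = "\n".join("\t".join(fields) for fields in [
    ("Warning", "core", "Renewing session token failed", "", "2019-08-17T05:56:23+0200"),
    ("Debug", "webdav", r"Sabre\DAV\Exception\NotAuthenticated: Cannot authenticate over ajax calls", "", "2019-08-17T05:56:22+0200"),
    ("Warning", "core", "Login failed: 'username' (Remote IP: 'x.x.x.x')", "", "2019-08-17T05:56:21+0200"),
    ("Debug", "core", "OC_Image->fixOrientation() Orientation: 1", "", "2019-08-17T05:51:30+0200"),
])


def parse_line(line):
    """Split one tab-separated entry into (timestamp, level, app, message)."""
    level, app, message, stamp = [field for field in line.split("\t") if field]
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S%z"), level, app, message


def warnings_oldest_first(raw):
    """Return all non-Debug entries sorted by timestamp, oldest first."""
    entries = [parse_line(line) for line in raw.splitlines() if line.strip()]
    return sorted(entry for entry in entries if entry[1] != "Debug")


for stamp, level, app, message in warnings_oldest_first(RAW_LOG):
    print(stamp.isoformat(), level, app, message)
```

On this excerpt the ‘Login failed’ warning at 05:56:21 sorts ahead of ‘Renewing session token failed’ at 05:56:23, so the failed login is the earlier of the two warnings.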

The output of your config.php file in /path/to/nextcloud:

<?php
$CONFIG = array (
  'passwordsalt' => 'x',
  'secret' => 'x',
  'trusted_domains' => 
  array (
    0 => 'localhost',
    1 => 'cloud.name.tld',
  ),
  'datadirectory' => '/mnt/dietpi_userdata/nextcloud_data',
  'dbtype' => 'pgsql',
  'version' => '16.0.4.1',
  'memcache.local' => '\\OC\\Memcache\\APCu',
  'filelocking.enabled' => true,
  'memcache.locking' => '\\OC\\Memcache\\Redis',
  'redis' => 
  array (
    'host' => '/var/run/redis/redis-server.sock',
    'port' => 0,
  ),
  'overwrite.cli.url' => 'https://cloud.name.tld/',
  'dbname' => 'cloud21',
  'dbhost' => '127.0.0.1',
  'dbport' => '',
  'dbtableprefix' => 'oc_',
  'mysql.utf8mb4' => true,
  'dbuser' => 'cloud21',
  'dbpassword' => 'x',
  'installed' => true,
  'instanceid' => 'x',
  'maintenance' => false,
  'loglevel' => 0,
  'theme' => '',
  'defaultapp' => 'deck',
  'session_lifetime' => 86400,
  'session_keepalive' => true,
  'remember_login_cookie_lifetime' => 1296000,
  'updater.secret' => 'x',
);

The nginx server config:


server {

	add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

	root /var/www/nextcloud/;
	index index.php index.html index.htm;

	server_name cloud.name.tld;



	error_page 404 /404.html;
	error_page 500 502 503 504 /50x.html;
	location = /50x.html {
		root /var/www/nextcloud/;
	}


    listen 443 ssl; # managed by Certbot
    ssl_certificate /etc/letsencrypt/live/cloud.name.tld/fullchain.pem; # managed by Certbot
    ssl_certificate_key /etc/letsencrypt/live/cloud.name.tld/privkey.pem; # managed by Certbot
    include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot







# Based on: https://docs.nextcloud.com/server/stable/admin_manual/installation/nginx.html


	# Add headers to serve security related headers
	add_header Strict-Transport-Security "max-age=15768000; includeSubDomains;";
	add_header X-Content-Type-Options nosniff;
	add_header X-XSS-Protection "1; mode=block";
	add_header X-Robots-Tag none;
	add_header X-Download-Options noopen;
	add_header X-Permitted-Cross-Domain-Policies none;
	add_header Referrer-Policy no-referrer;

	# Remove X-Powered-By, which is an information leak
	fastcgi_hide_header X-Powered-By;

	# Set max upload size
	client_max_body_size 1048576M;
	fastcgi_buffers 64 4K;
	client_body_temp_path /mnt/ssd/nginx/tmp/;

	# Enable gzip but do not remove ETag headers
	gzip on;
	gzip_vary on;
	gzip_comp_level 4;
	gzip_min_length 256;
	gzip_proxied expired no-cache no-store private no_last_modified no_etag auth;
	gzip_types application/atom+xml application/javascript application/json application/ld+json application/manifest+json application/rss+xml application/vnd.geo+json application/vnd.ms-fontobject application/x-font-ttf application/x-web-app-manifest+json application/xhtml+xml application/xml font/opentype image/bmp image/svg+xml image/x-icon text/cache-manifest text/css text/plain text/vcard text/vnd.rim.location.xloc text/vtt text/x-component text/x-cross-domain-policy;

	# Uncomment if your server is build with the ngx_pagespeed module
	# This module is currently not supported.
	#pagespeed off;

	location / {
		rewrite ^ /index.php$request_uri;
	}

	location ~ ^\/(?:build|tests|config|lib|3rdparty|templates|data)/ {
		deny all;
	}
	location ~ ^\/(?:\.|autotest|occ|issue|indie|db_|console) {
		deny all;
	}

	location ~ ^\/(?:index|remote|public|cron|core/ajax/update|status|ocs/v[12]|updater/.+|ocs-provider/.+)\.php(?:$|/) {
		fastcgi_split_path_info ^(.+?\.php)(/.*)$;
		include fastcgi_params;
		fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
		fastcgi_param PATH_INFO $fastcgi_path_info;
		# HTTPS forces redirection from http://, thus has to be enabled only on active HTTPS environment.
		fastcgi_param HTTPS $https;
		# Avoid sending the security headers twice
		fastcgi_param modHeadersAvailable true;
		fastcgi_param front_controller_active true;
		fastcgi_pass php;
		fastcgi_intercept_errors on;
		fastcgi_read_timeout 600s;
		# Disable on Jessie, because Jessie Nginx does not support this directive
		fastcgi_request_buffering off;
		# Hard coding 128M OPCache size to suppress warning on Nextcloud admin panel.
		fastcgi_param PHP_ADMIN_VALUE "opcache.memory_consumption=128";
        }

	location ~ ^\/(?:updater|ocs-provider)(?:$|/) {
		try_files $uri/ =404;
		index index.php;
	}

	# Adding the cache control header for js and css files
	# Make sure it is BELOW the PHP block
	location ~ \.(?:css|js|woff|svg|gif)$ {
		try_files $uri /index.php$request_uri;
		add_header Cache-Control "public, max-age=15778463";
		# Add headers to serve security related headers  (It is intended
		# to have those duplicated to the ones above)
		add_header Strict-Transport-Security "max-age=15768000; includeSubDomains;";
		add_header X-Content-Type-Options nosniff;
		add_header X-XSS-Protection "1; mode=block";
		add_header X-Robots-Tag none;
		add_header X-Download-Options noopen;
		add_header X-Permitted-Cross-Domain-Policies none;
		add_header Referrer-Policy no-referrer;

		# Optional: Don't log access to assets
		access_log off;
	}

	location ~ \.(?:png|html|ttf|ico|jpg|jpeg)$ {
		try_files $uri /index.php$request_uri;
		# Optional: Don't log access to other assets
		access_log off;
	}

	location ^~ /.well-known/acme-challenge/ {
		default_type "text/plain";
		try_files $uri =404;
	}

# Redirect Cal/CardDAV requests to Nextcloud endpoint:
	location = /.well-known/carddav {
		return 301 $scheme://$host/remote.php/dav;
	}
	location = /.well-known/caldav {
		return 301 $scheme://$host/remote.php/dav;
	}

} # end of https server block


# http redirect part (80 is not exposed to the internet right now)
server {
    if ($host = cloud.name.tld) {
        return 301 https://$host$request_uri;
    } # managed by Certbot

	listen 80 default_server;

	server_name cloud.name.tld;
    return 404; # managed by Certbot
}

Attempted solutions:

Basically I started from the default DietPi installation and followed all the recommendations; admin/overview shows :white_check_mark: All checks passed.

php

memory_limit 512M
max_execution_time 3600
max_file_uploads 20
max_input_time 3600
post_max_size 8796093022207M
upload_max_filesize 8796093022207M
upload_tmp_dir /var/tmp/php_upload_tmp // is writing to those
session.save_path /mnt/ssd/php/sessions // without problems
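
To sanity-check that these limits really are far above the 7.42GB test upload, the shorthand values can be converted to bytes. This is a rough Python sketch, assuming the K/M/G suffixes mean powers of 1024 (which is how both PHP and nginx read them); `shorthand_to_bytes` is a made-up helper name, not part of either tool.

```python
# Multipliers for the K/M/G suffixes, treated as powers of 1024.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def shorthand_to_bytes(value: str) -> int:
    """Convert a size such as '512M' (optionally with a trailing ';') to bytes."""
    value = value.strip().rstrip(";")
    suffix = value[-1].upper()
    if suffix in UNITS:
        return int(value[:-1]) * UNITS[suffix]
    return int(value)  # no suffix: the value is already in bytes


# Values taken from the PHP and nginx settings listed in this post.
settings = {
    "memory_limit": "512M",
    "post_max_size": "8796093022207M",
    "upload_max_filesize": "8796093022207M",
    "client_max_body_size": "1048576M;",
}

for name, raw in settings.items():
    print(f"{name}: {shorthand_to_bytes(raw)} bytes")
```

The nginx limit works out to exactly 1 TiB, and the two PHP upload limits to just under 2^63 bytes, so none of them should be what cuts off a 7.42GB folder upload.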

nginx

client_max_body_size 1048576M;
fastcgi_buffers 64 4K;
client_body_temp_path /mnt/ssd/nginx/tmp/; # doesn't really matter because of
fastcgi_request_buffering off;         # <- no request buffering
fastcgi_read_timeout 600s;

I also migrated the installation from MariaDB to PostgreSQL because of some log messages saying that the MySQL server had disappeared (crashed?). But it didn’t help and may even have made the issue slightly worse.

Maybe some other things I have already forgotten by now…

Uploading individual files > 2GB is possible without issues, so size alone is not the problem.

I’ve searched the web up and down - for now I’m at my wits’ end. Any helpful ideas or suggestions are very welcome. Let me know if you need more information of my setup.

I can tell you that I’ve dropped several MP4s on mine at once through the web interface totaling more data than that and had no problems, so I would say it’s not a general issue with Nextcloud.

My setup is very different than yours though, so it’s possible some other factor is causing the issue.

Thanks, hmm :thinking:

I’ve also had another issue on this install since the beginning, but it is only a minor annoyance and a pain to debug, so I haven’t figured it out yet. I wouldn’t think it is connected to the problem of this thread, but it might hint at a problem with my nginx config: whenever a newly created user logs in for the very first time (and only then), the login fails and they run into a 404 page. Any following login attempt works.

(I now added the nginx config file to the OP but it’s also not very different from the standard config)

I have no idea about that as I don’t use nginx. I can describe my setup, but I’m not sure it will help as almost every part is different than your setup.

I run VMware ESXi 6.5 with a pfSense VM and an Ubuntu 18.04 VM for Nextcloud. I’m running nextcloud:latest in Docker-compose rebuilt from the “full” example (all optional components except Libreoffice). Docker and compose installed from official instructions, not using snap or standard repos. Other Docker containers running with it are Collabora, Coturn, Redis, and MariaDB.

The Ubuntu VM is in a different vSwitch in ESXi which is firewalled off from my home LAN by pfSense (a DMZ running on 3rd interface). And that host is also running Apache for a reverse proxy and certbot.

I also run DNS and DDNS on pfSense so I have split DNS for Nextcloud and Collabora. And I block connections to Nextcloud from non-ARIN addresses, as well as running dummy catchall vhosts in Apache for when the proper FQDN isn’t used. Those two things drastically reduce scans against the Nextcloud vhost.

Thanks Karl,

I can testify that the described problem goes away when running essentially the same configuration on a Raspberry Pi 4.

On the Pi the upload finishes, but when it comes to “processing files” it spits out an Error when assembling chunks, status code 404. It seems like PHP can’t locate the chunked files anymore, spamming fseek() expects parameter 2 to be integer, float given in the log. Probably because the numbers are bigger than PHP_INT_MAX, which is only 2147483647 on the 32-bit armv7l architecture. Bummer, I wasn’t aware of this when getting the Pi 4, naively thinking everything new is 64-bit these days.
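
To illustrate the suspected cause: on a 32-bit PHP build any byte offset above 2147483647 no longer fits in an integer, which would explain fseek() being handed a float. The Python sketch below only does the arithmetic and is not Nextcloud’s actual assembly code. The 3 GiB file size and the 10 MiB chunk size are assumptions picked for the example, and the function names are made up.

```python
PHP_INT_MAX_32 = 2**31 - 1  # 2147483647, the 32-bit limit quoted above
PHP_INT_MAX_64 = 2**63 - 1  # limit on a 64-bit PHP build


def fits_in_php_int(offset: int, int_max: int) -> bool:
    """True if the offset can be stored as a signed integer with this maximum."""
    return -int_max - 1 <= offset <= int_max


def first_overflowing_chunk(file_size: int, chunk_size: int, int_max: int):
    """Index of the first chunk whose end offset exceeds int_max, or None."""
    for index, start in enumerate(range(0, file_size, chunk_size)):
        end = min(start + chunk_size, file_size)
        if not fits_in_php_int(end, int_max):
            return index
    return None


file_size = 3 * 1024**3    # hypothetical 3 GiB video file (assumption)
chunk_size = 10 * 1024**2  # assumed 10 MiB upload chunks (assumption)

print("32-bit:", first_overflowing_chunk(file_size, chunk_size, PHP_INT_MAX_32))
print("64-bit:", first_overflowing_chunk(file_size, chunk_size, PHP_INT_MAX_64))
```

With these assumed numbers the offsets stop fitting at chunk index 204 on a 32-bit build and never overflow on a 64-bit one, which would match large files failing only at the assembly stage.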

With little hope of fixing this (hardware) issue I actually migrated back to the Rock64. I had made some more minor changes in between and hoped they might have brought some remedy, but now the original error is back and as persistent as ever. I’m basically back where I started (although now on NC 17).

It’s a shame. For me one of the most attractive features of Nextcloud is the ability to run on very low-power devices (if you don’t need to serve 1000s of users), but it has turned out to be massively frustrating and time-consuming.

I’m pretty much out of ideas. It looks like I will have to give up, or make yet another investment in something significantly more powerful.

I was mistaken about the Raspberry Pi 4 CPU architecture: it does have the ARMv8 instruction set; it is only that most Debian-based distros for the Pi 4 still ship a 32-bit kernel.

I was able to solve my problems starting from scratch with the Raspberry Pi 4 minimal ARM edition of Manjaro. Porting over my existing nextcloud installation and data was a long and painful process (damn you silent errors) but finally it came around and now is running about as well as one could hope from such economic hardware.

For anyone facing problems on a similar setup I can only recommend going through the official server tuning tips, all of them, step by step. Most impactful for me was scaling up the number of php-fpm child processes. When doing that it may also be advisable to raise the maximum number of simultaneous connections to your DBMS (in my case PostgreSQL, where 100 connections were exceeded in some circumstances (by a single user!), resulting in an almost completely blocked system with maxed-out RAM and swap).
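
As a rough way to reason about those two knobs together, here is a small Python sketch that estimates a php-fpm `pm.max_children` value from available RAM and then checks how much headroom is left under PostgreSQL’s `max_connections`. Every number in it (RAM reserved for other services, average size of a PHP process, connections per child, connections used by other clients) is a placeholder assumption to be measured on the actual system, and the function names are invented for the example.

```python
def estimate_max_children(ram_for_php_mb: int, avg_process_mb: int) -> int:
    """Rule of thumb: RAM set aside for PHP divided by average worker size."""
    return max(1, ram_for_php_mb // avg_process_mb)


def db_connection_headroom(children: int, conns_per_child: int,
                           other_conns: int, max_connections: int) -> int:
    """Connections left under max_connections if every PHP worker is busy."""
    return max_connections - (children * conns_per_child + other_conns)


# Placeholder figures for a 4GB board (all assumptions, not measurements).
total_ram_mb = 4096
reserved_mb = 1536        # OS, database, Redis, nginx, file cache
avg_process_mb = 64       # average resident size of one php-fpm worker
max_connections = 100     # the PostgreSQL limit mentioned above

children = estimate_max_children(total_ram_mb - reserved_mb, avg_process_mb)
headroom = db_connection_headroom(children, conns_per_child=1,
                                  other_conns=10,
                                  max_connections=max_connections)

print(f"pm.max_children estimate: {children}")
print(f"DB connection headroom:   {headroom}")
```

If the headroom comes out negative, either the number of children has to come down or `max_connections` has to go up; raising only the php-fpm side just moves the bottleneck into the database.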

In some places you may encounter the opinion that a 64-bit kernel yields no practical benefits on an SBC like the Pi 4. I strongly disagree. It may be a flaw in Nextcloud that it does not fully account for the limitations of 4-byte integers in its software design, but this will only become more normal as time goes on. 64-bit is the way to go.