I’m getting the not-so-uncommon error when uploading large files through the web interface:
`Error when assembling chunks, status code 504`
I’m a developer and would like to figure out what is actually going on here. What is Nextcloud actually doing in step 4 below?
How common is this for large file uploads? Does this happen to everyone who has timeouts of <= 1 minute and uploads large enough files?
The course of events
There is no other load on the server at the time.

1. Uploading a 5 GB file through the web interface.
2. The file is split into 10 MB chunks/parts that are uploaded one at a time.
3. An HTTP `MOVE .file` request is made at the end, after all of the parts have been uploaded (see the sketch after this list).
4. The PHP process (nginx FPM) handling the request uses about 100% CPU of one core (which may include IO wait).
   - I can’t see any significant CPU load on the database or anything else during this time.
   - There is IO going on, though the disks do not seem to be 100% busy.
5. The 504 timeout happens (after 1 minute in my case).
6. The PHP process/request is stopped at this point without finishing.
7. The uploaded file is not in the destination directory; all the upload segment files are still in the upload directory.
8. Reloading the Nextcloud web interface shows the uploaded file (this takes at most a couple of seconds).
9. The part files are now gone and the destination file has been created.
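For context, here is roughly what that request sequence looks like from the client’s side. This is my own sketch based on Nextcloud’s chunked-upload WebDAV API, not the actual web client code; the host, user, transfer ID, and chunk file names are made up for illustration:

```php
<?php
// Sketch of the chunked-upload request sequence (illustrative only).
function dav(string $method, string $url, array $headers = [], ?string $body = null): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_CUSTOMREQUEST  => $method,
        CURLOPT_USERPWD        => 'user:password',
        CURLOPT_HTTPHEADER     => $headers,
        CURLOPT_RETURNTRANSFER => true,
    ]);
    if ($body !== null) {
        curl_setopt($ch, CURLOPT_POSTFIELDS, $body); // chunk bytes
    }
    curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    return $status;
}

$dav = 'https://cloud.example.com/remote.php/dav';
$id  = 'web-file-upload-123'; // transfer ID, made up for this sketch

// 1. Create the upload session (a WebDAV collection holding the chunks).
dav('MKCOL', "$dav/uploads/user/$id");

// 2. PUT each 10 MB chunk (roughly 500 of them for a 5 GB file).
foreach (glob('/tmp/chunks/*') as $i => $chunk) {
    dav('PUT', sprintf('%s/uploads/user/%s/%06d', $dav, $id, $i + 1), [],
        file_get_contents($chunk));
}

// 3. MOVE the virtual ".file" to its destination. The server assembles
//    all the chunks inside this single request -- the one that 504s on me.
$status = dav('MOVE', "$dav/uploads/user/$id/.file",
              ["Destination: $dav/files/user/bigfile.bin"]);
echo "MOVE returned $status\n";
```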
Questions and guesses
What is Nextcloud doing at step 4 for >1 minute of 100% CPU usage until the request is killed? It doesn’t seem to be writing to the destination file, as that file does not exist when the request is killed.
Whatever it’s doing seems useless, since a simple refresh at step 8 creates the complete 5 GB file and removes the part files in a few seconds.
My first guess would be that step 4 is just writing the chunks to the destination file, and that seems reasonable (at least for a suboptimal architecture that forces the uploaded data to be written to the file system multiple times); see the sketch below.
But that shouldn’t cause that much CPU load, should it? (Unless IO wait is included in that metric, I guess? I’m not that familiar with FreeBSD.)
And the fact that a page reload almost instantly creates the target file (on the file system and in the web UI) seems to contradict the idea that this action needs so much time.
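To make that guess concrete, here is a minimal sketch of what I imagine such an assembly step looks like (again, my own illustration, not Nextcloud’s actual code):

```php
<?php
// Hypothetical chunk-assembly loop, NOT Nextcloud's actual code: stream
// each uploaded part into the destination file in order, then clean up.
function assembleChunks(string $uploadDir, string $destination): void
{
    $parts = glob($uploadDir . '/*');
    sort($parts, SORT_NATURAL); // chunk names sort by index/offset

    $out = fopen($destination, 'wb');
    foreach ($parts as $part) {
        $in = fopen($part, 'rb');
        // Copies in buffered blocks, so memory use stays flat; the cost
        // is almost pure IO plus a little per-block CPU.
        stream_copy_to_stream($in, $out);
        fclose($in);
    }
    fclose($out);

    array_map('unlink', $parts); // remove the part files
}
```

If step 4 is essentially this loop, I’d expect the time to be dominated by IO, which is why the 100% CPU reading only makes sense to me if it includes IO wait.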
Solution by increasing the timeouts
Increasing the timeouts in nginx and FPM solves this for me, but it doesn’t answer the question of what’s actually going on, which is what I would like to find out.
- Add this to your PHP-FPM config:

  ```ini
  request_terminate_timeout = 600
  ```

- Add this to your nginx config:

  ```nginx
  fastcgi_read_timeout 600;
  ```
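In context, that means something like the following (a sketch; the file paths assume the stock FreeBSD/FreeNAS packages and the socket path is just an example):

```nginx
# /usr/local/etc/nginx/nginx.conf, inside the PHP handler location.
# (request_terminate_timeout = 600 goes in /usr/local/etc/php-fpm.conf
# or your FPM pool file.)
location ~ \.php(?:$|/) {
    fastcgi_pass unix:/var/run/php-fpm.sock;  # your FPM socket/port
    fastcgi_read_timeout 600;                 # the default is 60s
    include fastcgi_params;
}
```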
System info
Nextcloud 19.0.1
PHP: 7.4.8 (with a 1 GB memory limit)
Nginx: 1.18.0
OS: FreeNAS 11.3 U3.2
ZFS on mechanical drives (100+ MB/s sequential write speed)
Config:

```php
<?php
$CONFIG = array (
'apps_paths' =>
array (
0 =>
array (
'path' => '/usr/local/www/nextcloud/apps',
'url' => '/apps',
'writable' => true,
),
1 =>
array (
'path' => '/usr/local/www/nextcloud/apps-pkg',
'url' => '/apps-pkg',
'writable' => true,
),
),
'logfile' => '/var/log/nextcloud/nextcloud.log',
'memcache.local' => '\\OC\\Memcache\\APCu',
'passwordsalt' => 'xxx',
'secret' => 'xxx',
'trusted_domains' =>
array (
0 => 'localhost',
1 => '192.168.1.12',
2 => 'xxx',
),
'datadirectory' => '/mnt/data',
'dbtype' => 'mysql',
'version' => '19.0.1.1',
'overwrite.cli.url' => 'http://localhost',
'dbname' => 'nextcloud',
'dbhost' => 'localhost',
'dbport' => '',
'dbtableprefix' => 'oc_',
'mysql.utf8mb4' => true,
'dbuser' => 'oc_ncadmin',
'dbpassword' => 'xxx',
'installed' => true,
'instanceid' => 'xxx',
'maintenance' => false,
'theme' => '',
'loglevel' => 2,
# Added for testing of this issue (but it didn't seem to have any effect)
'filelocking.enabled' => false,
'part_file_in_storage' => false,
);
```