Uploading large files fails even with desktop sync client


#1

Nextcloud version: 11.0.2
Operating system and version: Fedora 25
Apache or nginx version: 2.4.25
PHP version: 7.0.17
Is this the first time you’ve seen this error?: No

Can you reliably replicate it? (If so, please outline steps): Yes

The issue you are facing:
Even though I have followed every step in the “Uploading big files > 512MB” documentation,
uploads of large files fail constantly. This also happens with the desktop sync client.

Steps to reproduce:

  1. Get a large file, preferably > 4 GB. I’ve had files around 50MB fail as well though.
  2. Select “No limit” under upload speed settings in desktop client. This isn’t necessary but triggers the behavior faster.
  3. Make sure your desktop client is on the same network as your server. I haven’t been able to reproduce this over the internet, I think because the upload speed isn’t fast enough.
  4. File upload will fail. Desktop sync will report Error while reading: error:140943FC:SSL routines:ssl3_read_bytes:sslv3 alert bad record mac
  5. Server logs will report HTTP 400 expected filesize errors (pasted below)

It seems that uploads fail when the transfer runs at high speed.

The output of your Nextcloud log in Admin > Logging:
`
Apr 04 22:31:25 benxiao-server01 ownCloud[12438]: {webdav} Exception: {“Message”:“HTTP/1.1 400 expected filesize 10000000 got 6356992”,“Exception”:“Sabre\DAV\Exception\BadRequest”,“Code”:0,“Trace”:"#0 /var/www/html/nextcloud/apps/dav/lib/Connector/Sabre/File.php(104): OCA\DAV\Connector\Sabre\File->createFileChunked(Resource id #469)\n#1 /var/www/html/nextcloud/apps/dav/lib/Connector/Sabre/Directory.php(137): OCA\DAV\Connector\Sabre\File->put(Resource id #469)\n#2 /var/www/html/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php(1072): OCA\DAV\Connector\Sabre\Directory->createFile(‘The.Machinist.2…’, Resource id #469)\n#3 /var/www/html/nextcloud/3rdparty/sabre/dav/lib/DAV/CorePlugin.php(525): Sabre\DAV\Server->createFile(‘The.Machinist.2…’, Resource id #469, NULL)\n#4 [internal function]: Sabre\DAV\CorePlugin->httpPut(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response))\n#5 /var/www/html/nextcloud/3rdparty/sabre/event/lib/EventEmitterTrait.php(105): call_user_func_array(Array, Array)\n#6 /var/www/html/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php(479): Sabre\Event\EventEmitter->emit(‘method:PUT’, Array)\n#7 /var/www/html/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php(254): Sabre\DAV\Server->invokeMethod(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response))\n#8 /var/www/html/nextcloud/apps/dav/appinfo/v1/webdav.php(60): Sabre\DAV\Server->exec()\n#9 /var/www/html/nextcloud/remote.php(165): require_once(’/var/www/html/n…’)\n#10 {main}",“File”:"/var/www/html/nextcloud/apps/dav/lib/Connector/Sabre/File.php",“Line”:405,“User”:“benxiao”}

Apr 04 22:31:25 benxiao-server01 ownCloud[12203]: {webdav} Exception: {“Message”:“HTTP/1.1 400 expected filesize 10000000 got 5701632”,“Exception”:“Sabre\DAV\Exception\BadRequest”,“Code”:0,“Trace”:"#0 /var/www/html/nextcloud/apps/dav/lib/Connector/Sabre/File.php(104): OCA\DAV\Connector\Sabre\File->createFileChunked(Resource id #469)\n#1 /var/www/html/nextcloud/apps/dav/lib/Connector/Sabre/Directory.php(137): OCA\DAV\Connector\Sabre\File->put(Resource id #469)\n#2 /var/www/html/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php(1072): OCA\DAV\Connector\Sabre\Directory->createFile(‘The.Machinist.2…’, Resource id #469)\n#3 /var/www/html/nextcloud/3rdparty/sabre/dav/lib/DAV/CorePlugin.php(525): Sabre\DAV\Server->createFile(‘The.Machinist.2…’, Resource id #469, NULL)\n#4 [internal function]: Sabre\DAV\CorePlugin->httpPut(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response))\n#5 /var/www/html/nextcloud/3rdparty/sabre/event/lib/EventEmitterTrait.php(105): call_user_func_array(Array, Array)\n#6 /var/www/html/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php(479): Sabre\Event\EventEmitter->emit(‘method:PUT’, Array)\n#7 /var/www/html/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php(254): Sabre\DAV\Server->invokeMethod(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response))\n#8 /var/www/html/nextcloud/apps/dav/appinfo/v1/webdav.php(60): Sabre\DAV\Server->exec()\n#9 /var/www/html/nextcloud/remote.php(165): require_once(’/var/www/html/n…’)\n#10 {main}",“File”:"/var/www/html/nextcloud/apps/dav/lib/Connector/Sabre/File.php",“Line”:405,“User”:“benxiao”}

Apr 04 22:31:25 benxiao-server01 ownCloud[12188]: {webdav} Exception: {“Message”:“HTTP/1.1 400 expected filesize 10000000 got 5570560”,“Exception”:“Sabre\DAV\Exception\BadRequest”,“Code”:0,“Trace”:"#0 /var/www/html/nextcloud/apps/dav/lib/Connector/Sabre/File.php(104): OCA\DAV\Connector\Sabre\File->createFileChunked(Resource id #469)\n#1 /var/www/html/nextcloud/apps/dav/lib/Connector/Sabre/Directory.php(137): OCA\DAV\Connector\Sabre\File->put(Resource id #469)\n#2 /var/www/html/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php(1072): OCA\DAV\Connector\Sabre\Directory->createFile(‘The.Machinist.2…’, Resource id #469)\n#3 /var/www/html/nextcloud/3rdparty/sabre/dav/lib/DAV/CorePlugin.php(525): Sabre\DAV\Server->createFile(‘The.Machinist.2…’, Resource id #469, NULL)\n#4 [internal function]: Sabre\DAV\CorePlugin->httpPut(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response))\n#5 /var/www/html/nextcloud/3rdparty/sabre/event/lib/EventEmitterTrait.php(105): call_user_func_array(Array, Array)\n#6 /var/www/html/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php(479): Sabre\Event\EventEmitter->emit(‘method:PUT’, Array)\n#7 /var/www/html/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php(254): Sabre\DAV\Server->invokeMethod(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response))\n#8 /var/www/html/nextcloud/apps/dav/appinfo/v1/webdav.php(60): Sabre\DAV\Server->exec()\n#9 /var/www/html/nextcloud/remote.php(165): require_once(’/var/www/html/n…’)\n#10 {main}",“File”:"/var/www/html/nextcloud/apps/dav/lib/Connector/Sabre/File.php",“Line”:405,“User”:“benxiao”}
`
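Each of the entries above reports a chunk that was expected to be 10000000 bytes but arrived short. As a rough sketch (not part of Nextcloud; the function name `chunk_shortfalls` is made up for illustration), the "expected filesize ... got ..." messages can be pulled out of the log to see how much of each chunk actually reached the server:

```python
import re

# Matches the message text seen in the log, e.g.
# "expected filesize 10000000 got 6356992".
PATTERN = re.compile(r"expected filesize (\d+) got (\d+)")


def chunk_shortfalls(log_text):
    """Return (expected, received, missing) byte counts for each matching entry."""
    results = []
    for match in PATTERN.finditer(log_text):
        expected = int(match.group(1))
        received = int(match.group(2))
        results.append((expected, received, expected - received))
    return results


if __name__ == "__main__":
    # Message texts copied from the three log entries above.
    sample = (
        "HTTP/1.1 400 expected filesize 10000000 got 6356992\n"
        "HTTP/1.1 400 expected filesize 10000000 got 5701632\n"
        "HTTP/1.1 400 expected filesize 10000000 got 5570560\n"
    )
    for expected, received, missing in chunk_shortfalls(sample):
        print(f"expected {expected}, received {received}, missing {missing}")
```

In all three entries only a bit more than half of the chunk arrived, which fits a connection that is cut off mid-transfer rather than a size limit being hit.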

The output of your config.php file in /path/to/nextcloud (make sure you remove any identifiable information!):
"system": {
    "instanceid": "ocpleicxwygq",
    "passwordsalt": "***REMOVED SENSITIVE VALUE***",
    "secret": "***REMOVED SENSITIVE VALUE***",
    "trusted_domains": [ "benxiao.me" ],
    "datadirectory": "\/var\/www\/html\/nextcloud\/data",
    "overwrite.cli.url": "https:\/\/benxiao.me\/nextcloud",
    "dbtype": "mysql",
    "version": "11.0.2.7",
    "dbname": "owncloud_db",
    "dbhost": "localhost",
    "dbtableprefix": "oc_",
    "dbuser": "***REMOVED SENSITIVE VALUE***",
    "dbpassword": "***REMOVED SENSITIVE VALUE***",
    "installed": true,
    "mail_smtpmode": "smtp",
    "mail_from_address": "REMOVED",
    "mail_domain": "gmail.com",
    "mail_smtpsecure": "tls",
    "mail_smtpauthtype": "LOGIN",
    "mail_smtphost": "smtp.gmail.com",
    "mail_smtpport": "587",
    "theme": "",
    "maintenance": false,
    "log_type": "syslog",
    "logfile": "",
    "loglevel": 2,
    "appstore.experimental.enabled": false,
    "trashbin_retention_obligation": "auto",
    "htaccess.RewriteBase": "\/nextcloud",
    "updater.release.channel": "stable",
    "mail_smtpauth": 1,
    "mail_smtpname": "***REMOVED SENSITIVE VALUE***",
    "mail_smtppassword": "***REMOVED SENSITIVE VALUE***"
},


#2

Anyone? This is really making Nextcloud unusable for me and I want to fix this.


#3

From this line it looks like the MAC address changed during the communication. Is there any load-balancing or proxy in between?


#5

No, this is just a personal machine at home. No load-balancer or proxy. I did read that Fedora was thinking of implementing some protection for MAC addresses. I wonder if that’s got anything to do with it. How do I verify that the MAC is indeed changing?


#6

OK it looks like I have made a mistake. It isn’t the MAC as in MAC address, but MAC as in Message Authentication Code. This is part of the authentication mechanism of SSL to authenticate the message between the client and host.

Now the question is how this got corrupted during the SSL handshake.

The first thing I can think of is whether the host and the client are in sync regarding time. Are both using NTP (or another mechanism) to get the correct time?

The second thing I can think of is a certificate issue, where the server (most likely) is using a certificate with a private key that does not belong to it. After some digging I came across this message:

the client is using a purported server public key which does not match the actual server private key. Normally, the client will extract the server public key from the server certificate, which the server sends to the client during the handshake. The server also owns a private key, which is mathematically linked with the public key. If the server is configured to use the wrong file, then you could obtain the symptoms you observe.

I suggest that you use a network monitoring tool (such as Wireshark) to observe the handshake messages: you should see the Certificate message from the server, containing the server certificate chain; for the first certificate of that chain, you can extract the server public key, that the client will use. On the server, try to print out the server private key details (whether this is easy or not depends on the involved software and actual key storage), to see if they match.

Origin: https://security.stackexchange.com/questions/39844/getting-ssl-alert-write-fatal-bad-record-mac-during-openssl-handshake

Maybe look into this.
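If you want to rule out the certificate/key mismatch without Wireshark, here is a minimal sketch using Python's standard `ssl` module, whose `load_cert_chain` raises `ssl.SSLError` when the private key does not match the certificate. The helper name `cert_matches_key` is my own, and the paths are whatever your Apache config points at.

```python
import ssl
import sys


def cert_matches_key(cert_path, key_path):
    """Return True if OpenSSL accepts the certificate/private-key pair.

    Returns False if the files are not valid PEM or the key does not match
    the certificate. A missing or unreadable file raises OSError instead.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        context.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except ssl.SSLError:
        return False
    return True


if __name__ == "__main__":
    # Usage: python check_pair.py <certificate file> <private key file>
    if len(sys.argv) == 3:
        ok = cert_matches_key(sys.argv[1], sys.argv[2])
        print("pair is consistent" if ok else "pair rejected by OpenSSL")
```

Run it as root against the same two files named in `SSLCertificateFile` and `SSLCertificateKeyFile`. A consistent pair does not prove the TLS setup is fine, it only removes this one suspect.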


#7

Hi StephanW

First I’d like to thank you for your help. This problem has been frustrating me for a long time and I am close to switching to something else, but I love Nextcloud’s features so much that it’s hard.

As for your suggestions, both machines are synced up to NTP.

For certificates, I am using Let’s Encrypt to generate them. I then put those keys into the Apache config. The thing is, these certificates work for everything else. I have both my friend’s blog and my blog using Let’s Encrypt on the same httpd instance and they work fine. Also, if it really was a certificate key mismatch, wouldn’t file transfers fail immediately because the connection could never be set up? Right now, smaller files go through. It seems like only long transfers fail. The likelihood of failure also rises with the connection speed. Remote transfers seem to go through okay, but large transfers within the same network fail often.


#8

I am just wondering whether packets are being corrupted somewhere within the network. With small files it doesn’t happen (as often), so those work, but a larger file has a higher chance of hitting a corrupted packet and triggering this issue. A blog normally just serves small (HTML) files and pictures, which wouldn’t trigger it. Just thinking out loud to find a possible reason.


#9

I used Wireshark to capture some packets. I am no network expert, but it seems like the initial handshake goes through and the client starts sending the data over. But is it normal to get a lot of packets marked as Ignored Unknown Record?


#10

That doesn’t sound correct.

Can you check if the following setting is enabled in Wireshark:

Select Edit > Preferences > Protocols > TCP > check Allow Subdissector to Reassemble TCP Streams

And see if that makes a difference


#11

It is already checked.


#12

So I thought I’d try using a different network card to see if it was a hardware issue. I had a spare 1000 Ethernet to USB adapter lying around and plugged it into my Intel NUC (my server). I did another long transfer and it got much further with this adapter than with the built-in Ethernet card, but alas it still disconnected after a while.


#13

OK, I might sound stupid, so forgive me if you already checked this, but have you increased the PHP execution timeout and set up the Apache keep-alive timer?


#14

Not stupid at all. I need a fresh set of eyes on this so feel free :slight_smile:

So PHP execution timeout is already set at 3600.
Have not looked at Apache keep-alive timer, but my initial reaction would be that this isn’t it because it works when I transfer a file remotely over the Internet. Slower speed but eventually completes. It breaks when I am transferring at home where my server is.

But which keep-alive parameter are you referring to? I can play around with that setting.

Also, it seems like WordPress is also breaking with large uploads, so I was wrong about it working for other things. This looks like a general configuration issue with Apache or PHP somewhere, but I just don’t know what’s causing the connection to terminate. The logs are stupidly terse when this happens.


#15

The timer I am referring to is in /etc/apache2/apache2.conf called:

# KeepAliveTimeout: Number of seconds to wait for the next request from the
KeepAliveTimeout 5

The setting you see here is taken from my config file, and I don’t have any problems with large files, externally or internally. This is from the Apache2 setup on the Nextcloud machine itself (just in case you have a proxy in front for public access).


#16

KeepAliveTimeout isn’t specified in my main Apache conf. I am on Fedora, so that is /etc/httpd/conf/httpd.conf. I’ve searched my other conf files and it isn’t specified there either. According to the docs, the default for this value is 5, so I think I have the same config as you.
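For anyone who wants to repeat that search, here is a small sketch of how I would look for the directive in a config file's text. It is a simplification: the helper `find_directive` is hypothetical, it treats the last uncommented occurrence as the winner, and it does not follow `Include` lines or account for virtual host scoping.

```python
import re


def find_directive(config_text, name, default=None):
    """Return the value of the last uncommented occurrence of a directive.

    Apache directive names are case-insensitive, so the match ignores case.
    Falls back to `default` when the directive is absent.
    """
    pattern = re.compile(rf"^\s*{re.escape(name)}\s+(.+?)\s*$", re.IGNORECASE)
    value = default
    for line in config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        match = pattern.match(line)
        if match:
            value = match.group(1)
    return value


if __name__ == "__main__":
    # The two lines quoted earlier in this thread.
    snippet = (
        "# KeepAliveTimeout: Number of seconds to wait for the next request from the\n"
        "KeepAliveTimeout 5\n"
    )
    print(find_directive(snippet, "KeepAliveTimeout", default="5"))
```

Passing `default="5"` mirrors the documented default mentioned above, so a file without the directive reports the same value as a file that sets it explicitly to 5.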


#17

It is especially weird that it only happens on the local network and not when you are external. I am racking my brain over what it could be, but at the moment I am a bit clueless.


#18

So I tried disabling SSL completely on my Nextcloud instance and it seems to work fine. So there’s definitely something wrong with using encryption. I am just not sure what. Any pointers? Maybe I am not configuring my SSL correctly?


#19

Here is my Apache config for Nextcloud (proxy); see if anything is useful.

<VirtualHost *:443>

        ServerName cloud.<domain.tld>

        #   SSL Engine Switch:
        #   Enable/Disable SSL for this virtual host.
        SSLEngine on
        SSLProtocol all -SSLv2 -SSLv3
        SSLHonorCipherOrder on
        SSLCipherSuite "EECDH+ECDSA+AESGCM EECDH+aRSA+AESGCM EECDH+ECDSA+SHA384 EECDH+ECDSA+SHA256 EECDH+aRSA+SHA384 EECDH+aRSA+SHA256 EECDH+aRSA+RC4 EECDH EDH+aRSA RC4 !aNULL !eNULL !LOW !3DES !MD5 !EXP !PSK !SRP !DSS !RC4"

        SSLProxyVerify none
        SSLProxyCheckPeerCN off
        SSLProxyCheckPeerName off

        SSLCertificateFile /etc/letsencrypt/live/<domain.tld>/fullchain.pem
        SSLCertificateKeyFile /etc/letsencrypt/live/<domain.tld>/privkey.pem

        ProxyPass / http://<internal IP>:8080/
        ProxyPassReverse / http://<internal IP>:8080/

        ProxyVia On
        ProxyPreserveHost On
        RequestHeader set X-Forwarded-Proto 'https' env=HTTPS

</VirtualHost>

#20

Here is mine.

<VirtualHost _default_:443>

# General setup for the virtual host, inherited from global configuration
#DocumentRoot "/var/www/html"
ServerName <domain>:443

# Use separate log files for the SSL virtual host; note that LogLevel
# is not inherited from httpd.conf.
ErrorLog logs/ssl_error_log
TransferLog logs/ssl_access_log
LogLevel warn

#   SSL Engine Switch:
#   Enable/Disable SSL for this virtual host.
SSLEngine on

#   List the protocol versions which clients are allowed to connect with.
#   Disable SSLv3 by default (cf. RFC 7525 3.1.1).  TLSv1 (1.0) should be
#   disabled as quickly as practical.  By the end of 2016, only the TLSv1.2
#   protocol or later should remain in use.
SSLProtocol all -SSLv3
SSLProxyProtocol all -SSLv3

#   User agents such as web browsers are not configured for the user's
#   own preference of either security or performance, therefore this
#   must be the prerogative of the web server administrator who manages
#   cpu load versus confidentiality, so enforce the server's cipher order.
SSLHonorCipherOrder on

#   SSL Cipher Suite:
# List the ciphers that the client is permitted to negotiate.
# See the mod_ssl documentation for a complete list.
# The OpenSSL system profile is configured by default.  See
# update-crypto-policies(8) for more details.
SSLCipherSuite PROFILE=SYSTEM
SSLProxyCipherSuite PROFILE=SYSTEM

#   Server Certificate:
# Point SSLCertificateFile at a PEM encoded certificate.  If
# the certificate is encrypted, then you will be prompted for a
# pass phrase.  Note that a kill -HUP will prompt again.  A new
# certificate can be generated using the genkey(1) command.
SSLCertificateFile /etc/letsencrypt/live/<domain>/fullchain.pem

#   Server Private Key:
#   If the key is not combined with the certificate, use this
#   directive to point at the key file.  Keep in mind that if
#   you've both a RSA and a DSA private key you can configure
#   both in parallel (to also allow the use of DSA ciphers, etc.)
SSLCertificateKeyFile /etc/letsencrypt/live/<domain>/privkey.pem

#   Server Certificate Chain:
#   Point SSLCertificateChainFile at a file containing the
#   concatenation of PEM encoded CA certificates which form the
#   certificate chain for the server certificate. Alternatively
#   the referenced file can be the same as SSLCertificateFile
#   when the CA certificates are directly appended to the server
#   certificate for convinience.
#SSLCertificateChainFile /etc/pki/tls/certs/server-chain.crt

#   Certificate Authority (CA):
#   Set the CA certificate verification path where to find CA
#   certificates for client authentication or alternatively one
#   huge file containing all of them (file must be PEM encoded)
#SSLCACertificateFile /etc/pki/tls/certs/ca-bundle.crt

#   Client Authentication (Type):
#   Client certificate verification type and depth.  Types are
#   none, optional, require and optional_no_ca.  Depth is a
#   number which specifies how deeply to verify the certificate
#   issuer chain before deciding the certificate is not valid.
#SSLVerifyClient require
#SSLVerifyDepth  10

#   Access Control:
#   With SSLRequire you can do per-directory access control based
#   on arbitrary complex boolean expressions containing server
#   variable checks and other lookup directives.  The syntax is a
#   mixture between C and Perl.  See the mod_ssl documentation
#   for more details.
#<Location />
#SSLRequire (    %{SSL_CIPHER} !~ m/^(EXP|NULL)/ \
#            and %{SSL_CLIENT_S_DN_O} eq "Snake Oil, Ltd." \
#            and %{SSL_CLIENT_S_DN_OU} in {"Staff", "CA", "Dev"} \
#            and %{TIME_WDAY} >= 1 and %{TIME_WDAY} <= 5 \
#            and %{TIME_HOUR} >= 8 and %{TIME_HOUR} <= 20       ) \
#           or %{REMOTE_ADDR} =~ m/^192\.76\.162\.[0-9]+$/
#</Location>

#   SSL Engine Options:
#   Set various options for the SSL engine.
#   o FakeBasicAuth:
#     Translate the client X.509 into a Basic Authorisation.  This means that
#     the standard Auth/DBMAuth methods can be used for access control.  The
#     user name is the `one line' version of the client's X.509 certificate.
#     Note that no password is obtained from the user. Every entry in the user
#     file needs this password: `xxj31ZMTZzkVA'.
#   o ExportCertData:
#     This exports two additional environment variables: SSL_CLIENT_CERT and
#     SSL_SERVER_CERT. These contain the PEM-encoded certificates of the
#     server (always existing) and the client (only existing when client
#     authentication is used). This can be used to import the certificates
#     into CGI scripts.
#   o StdEnvVars:
#     This exports the standard SSL/TLS related `SSL_*' environment variables.
#     Per default this exportation is switched off for performance reasons,
#     because the extraction step is an expensive operation and is usually
#     useless for serving static content. So one usually enables the
#     exportation for CGI and SSI requests only.
#   o StrictRequire:
#     This denies access when "SSLRequireSSL" or "SSLRequire" applied even
#     under a "Satisfy any" situation, i.e. when it applies access is denied
#     and no other module can change it.
#   o OptRenegotiate:
#     This enables optimized SSL connection renegotiation handling when SSL
#     directives are used in per-directory context. 
#SSLOptions +FakeBasicAuth +ExportCertData +StrictRequire
<Files ~ "\.(cgi|shtml|phtml|php3?)$">
    SSLOptions +StdEnvVars
</Files>
<Directory "/var/www/cgi-bin">
    SSLOptions +StdEnvVars
</Directory>

#   SSL Protocol Adjustments:
#   The safe and default but still SSL/TLS standard compliant shutdown
#   approach is that mod_ssl sends the close notify alert but doesn't wait for
#   the close notify alert from client. When you need a different shutdown
#   approach you can use one of the following variables:
#   o ssl-unclean-shutdown:
#     This forces an unclean shutdown when the connection is closed, i.e. no
#     SSL close notify alert is send or allowed to received.  This violates
#     the SSL/TLS standard but is needed for some brain-dead browsers. Use
#     this when you receive I/O errors because of the standard approach where
#     mod_ssl sends the close notify alert.
#   o ssl-accurate-shutdown:
#     This forces an accurate shutdown when the connection is closed, i.e. a
#     SSL close notify alert is send and mod_ssl waits for the close notify
#     alert of the client. This is 100% SSL/TLS standard compliant, but in
#     practice often causes hanging connections with brain-dead browsers. Use
#     this only for browsers where you know that their SSL implementation
#     works correctly. 
#   Notice: Most problems of broken clients are also related to the HTTP
#   keep-alive facility, so you usually additionally want to disable
#   keep-alive for those clients, too. Use variable "nokeepalive" for this.
#   Similarly, one has to force some clients to use HTTP/1.0 to workaround
#   their broken HTTP/1.1 implementation. Use variables "downgrade-1.0" and
#   "force-response-1.0" for this.
BrowserMatch "MSIE [2-5]" \
         nokeepalive ssl-unclean-shutdown \
         downgrade-1.0 force-response-1.0

#   Per-Server Logging:
#   The home of a custom SSL log file. Use this when you want a
#   compact non-error SSL logfile on a virtual host basis.
CustomLog logs/ssl_request_log \
          "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"

</VirtualHost>

I should add that there are two other virtual hosts on this machine with the exact same configuration (different keys of course).


#21

You could always try disabling a few bits, for example the BrowserMatch section, to see if it makes any difference. That way you could narrow down what the problem is.

I would disable as much as possible, and then for testing purposes, retry. And each time enable a new section of the VirtualHost.