SMB Authentication process overloading server


#1

I have a setup with an Ubuntu server hosting a Nextcloud instance. Users are mapped to remote Samba shares through the Nextcloud instance. I periodically log in to check the server and find that it’s overloaded. When I look at what is causing the overload, there are a bunch of processes like this:
/usr/bin/smbclient --authentication-file=/proc/self/fd/3/IP-address/share

I have attached a screen grab from Webmin showing this. On this particular occasion the server was only slightly overloaded; I have seen the one-minute load average go over 100.

I have had this problem over many versions of ownCloud and Nextcloud. I am currently running Nextcloud 12.0.0.

I discovered something last week that seemed to help with this issue. I had a user’s computer and was able to monitor the sync process and see when the server began overloading. The server load shot up right after the sync client finished its compare and began syncing the files. I found that by manually limiting the bandwidth on the sync client I was able to eliminate the overloading (or so I thought), and it actually sped up the sync process. My thinking was that the connection from the syncing machine to my server is fast (250-500 Mbps), while the connection to the remote machine with the actual files is relatively slow (10 Mbps), so reducing the client’s bandwidth to roughly that of the remote storage machine would keep the processes in check. It seemed to work for a while, until I returned the machine to the user. I have one other computer connecting to this share and reduced the bandwidth settings on it as well.
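For anyone who wants to try the same throttle, here is a rough sketch of the arithmetic in Python. It assumes the sync client’s limit field takes KBytes/s (as I believe it does) and uses the low end of my link speeds; the function name is just something I made up for illustration.

```python
def mbps_to_kbytes_per_s(mbps: float) -> float:
    """Convert megabits per second to kilobytes per second (decimal units)."""
    return mbps * 1_000_000 / 8 / 1_000


# Figures from the post: client-to-server link vs. link to the SMB machine.
client_link_mbps = 250
smb_link_mbps = 10

# Upload limit that roughly matches what the SMB link can carry.
limit = mbps_to_kbytes_per_s(smb_link_mbps)

# How much faster the client can deliver data than the SMB link can take it.
ratio = client_link_mbps / smb_link_mbps

print(f"Client limit matching the SMB link: {limit:.0f} KB/s")
print(f"Unthrottled, data can arrive about {ratio:.0f}x faster than it can be forwarded")
```

If my theory is right, a limit at or a little below that figure should keep the server from piling up smbclient processes while it waits on the slow link. Treat it as a starting point rather than a tuned value.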

I am looking for any ideas on how to stop these processes from bogging down my server. Currently I check in and either restart Apache or, if the number is small enough, go through and kill each of the offending processes. But they can start up at any time, so I have to check in several times a day to keep on top of them. I also don’t know whether this is keeping the machines from getting synced. I do run a cron job that does an occ scan a couple of times a day to be sure Nextcloud knows what’s on the file share.
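As a stopgap for the manual check-and-kill routine, something like the following Python sketch could find the long-running smbclient processes for me. It only treats the symptom, and several things in it are my own assumptions: the 600-second threshold, the function names, and that `ps` on the server supports the `etimes` (elapsed seconds) column, which procps-ng on Ubuntu should. Killing a process could also interrupt a transfer that is just slow rather than stuck.

```python
import os
import signal
import subprocess

MAX_AGE_SECONDS = 600  # hypothetical threshold; tune to your own sync behaviour
MARKER = "smbclient --authentication-file="  # taken from the process list above


def find_stale(ps_output: str, max_age: int = MAX_AGE_SECONDS) -> list:
    """Return PIDs of smbclient processes that have run longer than max_age seconds.

    Expects lines of "PID ELAPSED_SECONDS COMMAND..." as printed by
    `ps -eo pid=,etimes=,args=`.
    """
    stale = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        pid, elapsed, command = parts
        if MARKER in command and int(elapsed) > max_age:
            stale.append(int(pid))
    return stale


def kill_stale(dry_run: bool = True) -> None:
    """List stale smbclient processes; send SIGTERM only when dry_run is False."""
    ps_output = subprocess.run(
        ["ps", "-eo", "pid=,etimes=,args="],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid in find_stale(ps_output):
        print(f"stale smbclient process: {pid}")
        if not dry_run:
            os.kill(pid, signal.SIGTERM)
```

Calling `kill_stale()` only prints the candidates; `kill_stale(dry_run=False)` actually terminates them and would need to run as a user allowed to signal the web server’s processes.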

In my Nextcloud/Admin/Log there are a number of these errors.

Fatal	webdav	Sabre\DAV\Exception\ServiceUnavailable: HTTP/1.1 503 Storage with mount id 1 is not available
/var/www/html/bls/3rdparty/sabre/dav/lib/DAV/Tree.php - line 76: OCA\DAV\Connector\Sabre\Directory->getChild('common')
/var/www/html/bls/3rdparty/sabre/dav/lib/DAV/Server.php - line 966: Sabre\DAV\Tree->getNodeForPath('files/stacyc/co...')
/var/www/html/bls/3rdparty/sabre/dav/lib/DAV/Server.php - line 1665: Sabre\DAV\Server->getPropertiesIteratorForPath('files/stacyc/co...', Array, 1)
/var/www/html/bls/3rdparty/sabre/dav/lib/DAV/CorePlugin.php - line 355: Sabre\DAV\Server->generateMultiStatus(Object(Generator), false)
[internal function] Sabre\DAV\CorePlugin->httpPropFind(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response))
/var/www/html/bls/3rdparty/sabre/event/lib/EventEmitterTrait.php - line 105: call_user_func_array(Array, Array)
/var/www/html/bls/3rdparty/sabre/dav/lib/DAV/Server.php - line 479: Sabre\Event\EventEmitter->emit('method PROPFIND', Array)
/var/www/html/bls/3rdparty/sabre/dav/lib/DAV/Server.php - line 254: Sabre\DAV\Server->invokeMethod(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response))
/var/www/html/bls/apps/dav/lib/Server.php - line 253: Sabre\DAV\Server->exec()
/var/www/html/bls/apps/dav/appinfo/v2/remote.php - line 33: OCA\DAV\Server->exec()
/var/www/html/bls/remote.php - line 162: require_once('/var/www/html/b...')
{main}

These are the only errors showing in my Apache log:

[Fri Jul 28 07:35:09.779827 2017] [mpm_prefork:notice] [pid 9662] AH00163: Apache/2.4.18 (Ubuntu) OpenSSL/1.0.2g configured -- resuming normal operations
[Fri Jul 28 07:35:09.779844 2017] [core:notice] [pid 9662] AH00094: Command line: '/usr/sbin/apache2'
mkdir failed on directory /var/run/samba/msg.lock: Permission denied
mkdir failed on directory /var/run/samba/msg.lock: Permission denied

My Nextcloud Config file:

<?php
$CONFIG = array (
  'instanceid' => 'XXXXXXXX',
  'passwordsalt' => 'XXXXXXXXXXXX',
  'secret' => 'xxxxxxxxxxxxxxxx',
  'trusted_domains' => 
  array (
    0 => 'mydomain.com',
  ),
  'datadirectory' => '/mnt/oc_storage/',
  'overwrite.cli.url' => 'https://mydomain.com',
  'dbtype' => 'mysql',
  'version' => '12.0.0.29',
  'dbname' => 'oc_XXX',
  'dbhost' => 'localhost',
  'dbtableprefix' => 'oc_',
  'dbuser' => 'oc_xxxxxxxx',
  'dbpassword' => 'xxxxxxxxxxxxxxxxxxxxxxxx',
  'logtimezone' => 'UTC',
  'installed' => true,
  'mail_smtpmode' => 'smtp',
  'mail_from_address' => 'xxxxxxxxxxxxxx',
  'mail_domain' => 'xxxxxxxxxxxxxx',
  'mail_smtpauthtype' => 'PLAIN',
  'mail_smtpauth' => 1,
  'mail_smtphost' => 'xxxxxxxxxxxxxxxxxx',
  'mail_smtpport' => '465',
  'mail_smtpname' => 'xxxxxxxxxxxxxxxxxxxxxxxxxxxx',
  'mail_smtppassword' => 'xxxxxxxxxxxxxxxxxxxxxxxxx',
  'mail_smtpsecure' => 'ssl',
  'theme' => '',
  'loglevel' => 2,
  'maintenance' => false,
  'updater.release.channel' => 'production',
  'trashbin_retention_obligation' => '15',
  'memcache.local' => '\\OC\\Memcache\\Redis',
  'redis' => 
  array (
    'host' => 'localhost',
    'port' => 6379,
  ),
  'memcache.locking' => '\\OC\\Memcache\\Redis',
);

#2

I seem to be having a similar problem. Did you ever get this resolved?


#3

Still have the problem once in a while. I did find that if I throttled the bandwidth the sync client could use on the client computer, it seemed to stop the problem. I think it has to do with the relatively slow connection to the SMB server compared to the web connection. But that’s just my reasoning from playing around with the settings I can control.


#4

I have the exact same issue.

I deployed Nextcloud 15.0.2 in a Docker container, and my NAS is in a Docker container too on the same network (a simple Ubuntu 18.04 with folders shared via Samba), so the network link between the two can’t be too slow.

This process can take up to 100% of one CPU core, and folder access is really slow.

Weirdly, it happens on only one folder from this NAS; the others are just fine, with quick access times and no process that takes a lot of resources. I’m a bit confused, since the configuration for each folder is exactly the same.


#5

Same exact issue, version 15.0.5, also deployed using Docker. It also spawns a process using 100% CPU for each core I assign to the Docker VM.