New Client with Virtual Filesystem cannot sync huge folders

I have set up a new computer to test the virtual file system, because migrating my existing NC installation on my desktop was causing other problems, so I wanted to test it from scratch.
On my server I have a folder “Photos” which holds about 300 GB and almost 60,000 files, sorted by year.
When I now connect the new computer to my NC server, it starts scanning all the folders. The Photos folder can never be scanned completely; every time it pops up with a new error message, sometimes with an error for a file from 2011, sometimes for a file from 2017, never the same file.

Then it starts scanning again from scratch, so I am stuck in a loop.

I use NC 21.0.1 and Win10 Pro x64 with Desktop Client 3.2.0.
Are there any known limitations of the new virtual file system feature?

Hi, this sounds like a bug. Please open a new issue for this in the desktop client repository after making sure that no such bug report already exists: GitHub - nextcloud/desktop: 💻 Desktop sync client for Nextcloud
Either way, please post the link to the issue covering this bug here afterwards.

Seems to be a known bug:

I have now run the tests on 3 computers with the 3.2.0 client installed, both as 32-bit and as 64-bit versions:
The synchronization never finishes when virtual drive support is enabled. When it is disabled, the client goes through all the folders and finally reports: Synchronized.

So it must be a bug in either the VFS feature or the 3.2 client.

My server error logfile shows:
[Mon Apr 19 07:19:21.377363 2021] [proxy_fcgi:error] [pid 12740:tid 140270704207616] (70007)The timeout specified has expired: [client 192.168.10.125:64823] AH01075: Error dispatching request to : (polling)
[Mon Apr 19 07:19:23.402128 2021] [proxy_fcgi:error] [pid 12740:tid 140270720993024] (70007)The timeout specified has expired: [client 192.168.10.125:64816] AH01075: Error dispatching request to : (polling)
[Mon Apr 19 07:19:24.374095 2021] [proxy_fcgi:error] [pid 12740:tid 140270695814912] (70007)The timeout specified has expired: [client 192.168.10.125:64825] AH01075: Error dispatching request to : (polling)
[Mon Apr 19 07:19:29.714338 2021] [proxy_fcgi:error] [pid 12740:tid 140270729385728] (70007)The timeout specified has expired: [client 192.168.10.125:64827] AH01075: Error dispatching request to : (polling)

Is there any known bug where a folder with 60,000 files in several subfolders causes Nextcloud to abort the sync? The client seems to be running into a timeout, because in the client log file I can find:

2021-04-19 14:44:40:382 [ debug nextcloud.sync.database.sql ]	[ OCC::SqlQuery::exec ]:	SQL exec "SELECT lastTryEtag, lastTryModtime, retrycount, errorstring, lastTryTime, ignoreDuration, renameTarget, errorCategory, requestId FROM blacklist WHERE path=?1 COLLATE NOCASE"
2021-04-19 14:44:40:382 [ info sync.discovery ]:	STARTING "FOTOS/2019/2019_08_09_Silas_Jonas" OCC::ProcessDirectoryJob::NormalQuery "FOTOS/2019/2019_08_09_Silas_Jonas" OCC::ProcessDirectoryJob::ParentDontExist
2021-04-19 14:44:40:382 [ info nextcloud.sync.accessmanager ]:	6 "PROPFIND" "https://servername/nextcloud/remote.php/dav/files/testuser/FOTOS/2019/2019_08_09_Silas_Jonas" has X-Request-ID "9a4f07cb-7f19-43df-9b09-784741fca4da"
2021-04-19 14:44:40:382 [ debug nextcloud.sync.cookiejar ]	[ OCC::CookieJar::cookiesForUrl ]:	QUrl("https://servername/nextcloud/remote.php/dav/files/testuser/FOTOS/2019/2019_08_09_Silas_Jonas") requests: (QNetworkCookie("nc_sameSiteCookielax=true; secure; HttpOnly; expires=Fri, 31-Dec-2100 23:59:59 GMT; domain=servername; path=/nextcloud"), QNetworkCookie("nc_sameSiteCookiestrict=true; secure; HttpOnly; expires=Fri, 31-Dec-2100 23:59:59 GMT; domain=servername; path=/nextcloud"), QNetworkCookie("oc_sessionPassphrase=K%2BWJoJAlcDrs1EUnolQCSTHYaSjx33OPBH9vC7YjFaGaVXSmSsq5gCnHbxdNQZYl9dCdjx5Qei6IVRM5i11ff7Okkb2QirI4TS1XXtjq%2Fz8QZeiF79fodEbMCyji9YHw; secure; HttpOnly; domain=servername; path=/nextcloud"), QNetworkCookie("ocut4lyy62j6=uu3tcdulfj1l82smb5k8qa0fgp; secure; HttpOnly; domain=servername; path=/nextcloud"))
2021-04-19 14:44:40:382 [ info nextcloud.sync.networkjob ]:	OCC::LsColJob created for "https://servername/nextcloud" + "/FOTOS/2019/2019_08_09_Silas_Jonas" "OCC::DiscoverySingleDirectoryJob"
2021-04-19 14:44:40:616 [ warning nextcloud.sync.networkjob ]:	Network job timeout QUrl("https://servername/nextcloud/remote.php/dav/files/testuser/FOTOS/100MEDIA")
2021-04-19 14:44:40:616 [ info nextcloud.sync.credentials.webflow ]:	request finished
2021-04-19 14:44:40:616 [ warning nextcloud.sync.networkjob ]:	QNetworkReply::OperationCanceledError "Connection timed out" QVariant(Invalid)
2021-04-19 14:44:40:616 [ warning nextcloud.sync.credentials.webflow ]:	QNetworkReply::OperationCanceledError
2021-04-19 14:44:40:616 [ warning nextcloud.sync.credentials.webflow ]:	"Operation canceled"

I changed the following parameters in the files below, which sometimes seem to help a little. Could someone please test them as well?

Add the following lines to %appdata%\Nextcloud\nextcloud.cfg (they go into the [General] section; 268435456 bytes = 256 MiB):

chunkSize=268435456
timeout=600
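
To double-check that the entries ended up in the config file, a quick PowerShell one-liner like this should do (just a sketch, assuming a default client install under %appdata%):

# Show the relevant lines of the client config (path assumed for a default install)
$cfg = Join-Path $env:APPDATA 'Nextcloud\nextcloud.cfg'
Select-String -Path $cfg -Pattern 'chunkSize|timeout'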

On the Nextcloud server, modify the following parameters:

/etc/php/7.4/fpm/php.ini
post_max_size = 40M
upload_max_filesize = 40M
max_execution_time = 300
max_input_time = 600

/etc/php/7.4/apache2/php.ini
post_max_size = 40M
upload_max_filesize = 40M
max_execution_time = 300
max_input_time = 600

/etc/apache2/apache2.conf
Timeout 600
ProxyTimeout 600

Restart the Apache web server and the PHP-FPM service (on Debian/Ubuntu, e.g. sudo systemctl restart apache2 php7.4-fpm). Then restart the Nextcloud desktop client and check whether the timeout is gone.

In my case the timeout is gone, but the client now gets stuck at “Reconciling changes” for hours:
[screenshot]

12 hours later it is still stuck at:
[screenshot]

In the local logfile I did not get any error message.

I had the same problem yesterday with a 21.0.1 installation and a 3.2.2 client: in total 60,000 files in 20 main folders (photos sorted by year) could not be synchronized. The workaround was to select only 3 or 4 main folders, do the local sync, and once that is done add the next folders. When selecting the main folder with all files and subfolders, the client aborts. So it could be a problem with an internal buffer overrun or whatever; the number of files seems to be too large, no matter whether I use VFS or not.

Because it seems that no one knows the problem or how to test it, here is a Windows cmd script you can use to easily create huge folders. Just create a folder D:\TEMP and place a small file named bild.jpg in it. Then you specify how many main folders should be created below D:\TEMP and how many subfolders each should have. Last, you specify how many copies of the picture every subfolder should get. With the default values (20 main folders × 50 subfolders × 100 files = 100,000 files) you can create a folder tree for testing within a few minutes.

@echo off
set Mainfolder=20
set subfolder=50
set count_files=100
set template_file=bild.jpg
set root_folder=D:\TEMP

REM Create main folders
FOR /L %%i IN (1,1,%Mainfolder%) DO (
  cd /d %root_folder%
  mkdir Mainfolder_%%i
  cd Mainfolder_%%i

  REM Create subfolders in the current main folder
  FOR /L %%k IN (1,1,%subfolder%) DO (
    mkdir Subfolder_%%k
    cd /d Subfolder_%%k

    REM Create files in the current subfolder
    FOR /L %%l IN (1,1,%count_files%) DO (
      copy /Y %root_folder%\%template_file% .\%%l_%template_file% >NUL
    )
    echo Creating Files in Mainfolder_%%i\Subfolder_%%k done
    cd /d %root_folder%\Mainfolder_%%i
  )
  cd /d %root_folder%
)

With that script I am able to reproduce several errors during upload and download.


So I just tested with Windows 10 Pro and the NC 3.3.0 client: the problem is not solved.
On smaller shared folders (I have a test server with a user with about 100 files) it works perfectly; on my prod server with files as described above the error still exists.


I just repeated the test with the NC 3.3.1 client; the problem still exists.

I would really recommend using my script listed above to create test data; then it is very easy to reproduce and understand…

I get network timeout notifications instead of operation canceled, but everything else seems the same: VFS sync never finishes, network connection times out · Issue #3720 · nextcloud/desktop · GitHub

And again: NC 22.1.1 and NC client 3.3.3 on Windows 10. Enabling VFS leads either to timeouts or other error messages when migrating an existing account with about 80,000 files.

So still not usable.

I just wanted to ask whether this problem is still on the list of topics:
I have now upgraded to NC 23 and client 3.4.1, and when enabling the virtual file system it never finishes syncing.

It works for a second account with almost no files, but when I try to sync my account with ~100,000 files and 320 GB of data, it keeps aborting.

So currently still not usable for me.


Hope to get a resolution on this bug.

After a long while I wanted to test your problem… unfortunately, for some reason your cmd script didn’t run on my system, so I created a PowerShell variant of it.

$folders = 20
$subfolders = 50
$files_per_folder = 100
$template_file ='birdie.jpg'
$rootFolder = 'c:\temp\.....\113917_VFS_huge_folders\'


foreach ($folder in 1..$folders) {
    $folder1 = mkdir (join-path $rootFolder -ChildPath $folder) -force
    #echo ( "[{0}]" -f $folder1)
    foreach ($subfolder in 1..$subfolders) {
        echo ( "[{0}]" -f $folder1)
        echo ( "[{0}]" -f (join-path $folder1 -ChildPath $subfolder))
        $folder2 = mkdir (join-path $folder1 -ChildPath $subfolder) -force
        #echo $folder2
        foreach ($file in 1..$files_per_folder){
            echo ("copy from {0}" -f (Join-Path $rootFolder -ChildPath $template_file))
            
            $childfile =  ("{0:d3}_{1}" -f $file, $template_file)
            echo ("copy to {0}" -f (Join-Path $folder2 -ChildPath $childfile))
            Copy-Item -path (Join-Path $rootFolder -ChildPath $template_file) -Destination (Join-Path $folder2 -ChildPath $childfile )
            #echo (Join-Path $folder2 -ChildPath $childfile )
        }
        
    }
}
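
A quick way to sanity-check what the script generated (file count and total size), assuming the $rootFolder variable from the script above:

# Count the generated files and sum up their size
$all = Get-ChildItem -Path $rootFolder -Recurse -File
'{0} files, {1:N1} GB total' -f $all.Count, (($all | Measure-Object -Property Length -Sum).Sum / 1GB)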

After a little testing and thinking around the issue:

  • if I get the issue right, it happens only while uploading the files from the client to the server
  • if the problem occurs during the upload, there is no relation to VFS, as the file exists as a “full copy” on the client

Nevertheless, I have no client where I can easily create 300 GB of data… what is the lowest data size at which you see the issue manifesting? I tested with 13k files and 8 GB total size (all copies of the Nextcloud “birdie.jpg”).

  • My client was busy sending the files to the server for a long time, utilizing ~50 Mbps… (for some reason Wi-Fi was connected at only 144 Mbps, so that is not a bad share of the available connection speed)
  • a lot of issues were shown in the status window: [screenshot]

  • but it looks like the client kept running and uploaded all the files to the server
  • files appeared in the web UI after a reasonable delay
  • until it showed these errors close to the end: [screenshot]

only to “recover” after a short time.

So far I see the upload worked: I see the same amount of data on the server as I see locally… there are definitely issues during the upload… but the result looks good. Maybe I was lucky and my data set was small enough…

Now I have a working setup (I happened to be testing OIDC anyway), so I can generate more data fast and only need to wait for a sync… when I have time I will try with more data!

Most important data:

  • NC client Version 3.5.1 (Windows) with Core i7 11th Gen (mobile)
  • NC Server 24.0.2 (Docker: Apache + MariaDB, running on an old Core i3 desktop)

Hi @wwe, thanks for taking this issue seriously. I don’t think that the amount of data is the problem… the number of files seems to be the problem. With my script (or your improved PowerShell version) the birdie.jpg can have 1 KB, it does not matter, but when you create 10 main folders with 50 subfolders on level 1 and 20 on level 2, you easily end up with lots of files. Then you can see that the upload will not work. During the upload on my side it always crashed, restarted, and crashed again, so after some days I aborted the upload.

The valid workaround for uploading huge amounts of data is currently:
Move all files (10 main folders with 20 subfolders etc.) out of the Nextcloud sync folder, then move one main folder back. When it is successfully uploaded, move the next one back, and so on, until finally all files are uploaded. Doing it in one step crashes. I’m happy that you proved that there is an issue, so the developers might be able to check whether there is an internal buffer overrun or whatever.
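
A rough PowerShell sketch of that batch-move workaround (the $staging and $syncFolder paths below are only placeholders, adjust them to your setup):

# Move the parked main folders back into the sync folder one at a time,
# waiting for the client to finish syncing in between.
$staging    = 'D:\Staging'                        # main folders parked outside the sync folder
$syncFolder = "$env:USERPROFILE\Nextcloud\FOTOS"  # the folder the Nextcloud client syncs

foreach ($dir in Get-ChildItem -Path $staging -Directory) {
    Move-Item -Path $dir.FullName -Destination $syncFolder
    Read-Host ("Moved {0} - wait until the client reports Synchronized, then press Enter" -f $dir.Name)
}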

Maybe the following configuration can help:
Edit the config file for your MariaDB server and add the following lines:

query_cache_size = 268435456
query_cache_type = 1
query_cache_limit = 1048576

Then restart the database (e.g. sudo systemctl restart mariadb). Now it works for me in a first test. The next test: connect from a brand-new VM and see whether the initial sync will now work with VFS.

So, no, these settings do not speed up the initial scan. The new VM, starting from a fresh installation, is now going through the database for every file:

So it still takes almost forever… I think the whole VFS feature is not tested in environments with lots of files…

Sure, the number of files might be a problem as well. Again, my test worked with 14k files: despite errors in the UI, the upload was successful.

Do you have any idea how many files you need to trigger the problem? Or, the other way round, what is a safe value according to your tests?

I made successful tests with ~15,000 files, then I raised the number of files to 100,000 and the error occurred.

I’m now running with 6 × 5,000 files (30,000 in total)… unfortunately I hit the test user quota “in between”, but it looks good: uploading 10k files in one step works now… for me this fits almost every realistic scenario…

I’ll run more tests and come back later…