Sync of two identical directory trees after database loss extremely slow

After a database corruption, the Nextcloud client seems to copy every single file over the net, where I would have expected fast and efficient behaviour as in rsync or unison, where only missing or changed files are transferred.

Due to a handling error on my side, the database of a small Nextcloud instance on a Raspberry Pi was destroyed yesterday without backup (heck, I usually have multiple backups for everything, but I forgot this one!!).

OK, but I thought that after restoring the Nextcloud instance, re-syncing would not take too much time, because the data repositories on the server and the client side are identical.

I assumed that when a file is present on both sides, the client and the server would just exchange checksums, as rsync does, rather than physically transferring the file, so that the whole process wouldn’t take too long.

It seems that I was wrong, or was I? My laptop has now been running overnight and it still has about 100 GB left to sync. An rsync run would have finished ages ago. Can’t the Nextcloud client understand that the files on both sides are identical without transferring the complete file over the net first?
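To illustrate what I expected (paths are made up): a dry run of rsync over two identical trees transfers nothing, because the quick check on size and mtime already marks every file as unchanged; adding -c would force a full checksum comparison instead.

# dry run: lists what would be transferred - nothing, for identical trees
rsync -avn -e ssh /home/user/Nextcloud/ pi@raspi:/srv/nextcloud-data/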

Tried with client versions 2.4 and 2.6

Nextcloud version: 17.0.0
Operating system and version: Raspbian Buster
Apache: 2.4.38
PHP version: 7.3-fpm


It’s no wonder that the sync takes its time: when you lost the database, you also lost all information about the files. It doesn’t matter that the files are already there; their state is unknown to NC unless it is saved in the database.

Running a local file scan on the RPi before hooking up the clients again might have helped, since that would have rebuilt the oc_filecache table, but I am not absolutely sure about that.

Another issue might be the database itself, which has to ingest a lot of data; depending on the database settings and the speed of the volume the database resides on, this can also impact its performance.

It’s no wonder that the sync takes its time: when you lost the database, you also lost all information about the files. It doesn’t matter that the files are already there; their state is unknown to NC unless it is saved in the database.

Well, rsync and unison both show that it doesn’t have to be that way. Rsync doesn’t even maintain a database, and still it would have finished the job within minutes. Unison would have rebuilt the database and then skipped every file that is identical on both sides. Both approaches would have been at least 100 times faster. I know that because I use rsync for daily incremental backups of the whole NC instance (except for the database, which, as I now see, was a big mistake), and the cronjob takes just minutes to complete.
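For reference, that cronjob is essentially this (host and paths are made up) - and the --exclude on the database is exactly the mistake I mentioned:

# daily mirror at 03:00; --delete propagates local removals to the backup
0 3 * * * rsync -a --delete --exclude='db-dump/' /var/www/nextcloud/ backuphost:/backups/nextcloud/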

I suppose that the real culprit here is that the WebDAV protocol doesn’t allow for such an approach. Rsync requires an rsync instance on both sides and works with shell access via SSH. Both instances exchange checksums, so that the whole file doesn’t have to be transferred, only the differences. The same goes for Unison. The NC client, on the other hand, seems to be just a “stupid” WebDAV frontend. Still, I expected better…
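Unison likewise needs to be installed on both ends; a minimal invocation (with made-up roots) looks like this:

# the first run builds unison's archives; later runs skip unchanged files
unison /home/user/files ssh://pi@raspi//srv/files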

Running a local file scan on the RPi before hooking up the clients again might have helped, since that would have rebuilt the oc_filecache table, but I am not absolutely sure about that.

I did that, and still it takes ages. When there is a local database and a database on the server, why can’t the client and the server check the two databases against each other?


rsync and unison use different concepts. rsync is always a 1-to-1 sync, whereas NC performs a many-to-many sync. Also, rsync operates on the local file system itself, whereas NC doesn’t do that: it needs the server database to store the states and locations of all the files involved. NC virtualizes the file system to be able to provide features like sharing and groupfolders.

If you are only interested in rsync-like file synchronization, there are tools that are probably better suited for that.

To me that doesn’t explain why, when you have a local database (presumably containing checksums, modification dates, sizes, etc.) and a remote database containing the same, it shouldn’t be possible to do a quick sync between the two. It doesn’t seem like the two databases are actually checked against each other; instead, every single file is transferred and then checked. At least, that’s what the extremely slow speed and the network traffic indicate.


I don’t think that the desktop client uses any database - at least I didn’t find one. However, you lost your server-based database and that is what makes the sync so slow.
When I suggested performing a file scan on the NC server itself, you said that you did that. This should have restored the database on the server to a state where all files are known. However, this doesn’t restore shares or groupfolders, which would cause at least those shared files to sync again - and duplicate, for that matter. However, to be sure, you did something like this on your NC server:

sudo -u <webuser> php occ files:scan --all

Indeed, that’s what I did, and I manually recreated the shares, of which there weren’t a lot, because it’s my private instance, used only by my family.

I don’t assume that the files DB on the server records sync states for individual clients. But you’re right, there is apparently no database on the client side. Well, maybe that explains why there is a long, tedious “looking for changes” phase when the client starts up, even in normal situations.

Well… most of the sync speed comes down to the database performance and the number of files that need to be synchronized. Have a look at the process list on your NC host - it’s an RPi? Maybe use htop to monitor your database and see if it’s hitting its limit.
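For example, something along these lines (assuming MariaDB; adjust user and credentials to your setup):

htop                                    # overall CPU, RAM and I/O load per process
sudo mysqladmin -u root -p processlist  # the queries the DB is currently running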

I am, and I have also done some fine-tuning, such as configuring Redis, installing php-fpm, configuring HTTP/2, increasing the limit on the number of children php-fpm can spawn, etc. However, the principal limitation seems to be that all the files are sent across the network even though identical copies are already present on the other side. If both sides sent checksums as soon as they noticed the presence of the same file on both sides, and simply verified that the checksum, modification date, and file size match, you wouldn’t have to send the actual file, and the process would be much, much faster.

Again, maybe this is a limitation of the WebDAV protocol, but the Nextcloud client seems to extend it anyway, e.g. for info on shares. Maybe this is not a priority yet, because complete loss of your DB is just so rare, but I can’t see why a better technical solution would be impossible.
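Done by hand, the comparison I have in mind would look something like this (hypothetical paths): each side checksums its own tree locally, and only the small checksum lists cross the network.

ssh pi@raspi 'cd /srv/files && find . -type f -exec md5sum {} +' | sort -k2 > server.md5
(cd ~/Nextcloud && find . -type f -exec md5sum {} +) | sort -k2 > client.md5
diff server.md5 client.md5    # no output means the trees are identical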


I just checked again, and I actually found the local sync DB. It is located in the root of the local target folder, and it is a hidden SQLite database. I looked at it, but I couldn’t directly correlate the info in that DB to the one on the server; I’d assume that the metadata table somehow ties data in the server’s oc_filecache table to the local files and their local hashes…
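You can peek at it with sqlite3 (the filename varies per account, and the exact schema may differ between client versions):

sqlite3 ._sync_0123456789ab.db '.tables'           # list all tables
sqlite3 ._sync_0123456789ab.db '.schema metadata'  # columns of the metadata table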

If you dump or re-create the database on the server, all the hashes in the oc_filecache table will change, which is likely what caused the full syncs from the local clients.

Ah, it is those ._sync* files.
But what should not change are the checksums. They would change only if the file is actually modified. When there are two identical files in two different places, they will have the same checksum. Apparently, the NC client does not make use of this fact.


Take a look at the oc_filecache table in the NC database on your server. My guess is that the local client somehow tries to match the local hashes with the ones in oc_filecache.
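For example (assuming MariaDB and a database named nextcloud; adjust names and credentials to your setup):

mysql -u nextcloud -p nextcloud -e 'SELECT fileid, path, etag, checksum FROM oc_filecache LIMIT 10;'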

As far as computing complete checksums for each file goes… that would place a significant burden on the NC host, since it would have to read all files in their entirety to compute them - this is probably not feasible for an installation on an RPi, and even on bigger servers it would have a huge impact. Afaik, the sync takes the filename, timestamp, and size into account when it compares, but before that, the client has to determine whether the files already exist on NC, and that is done through the oc_filecache table.
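To get a rough feel for that cost (hypothetical data directory): every byte has to be read once.

time find /var/www/nextcloud/data -type f -exec md5sum {} + > /dev/null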

One would have to look at the source of the NC client to know for sure how this works, though.