Does the NextCloud client add checksum verifications when uploading?

Non-ECC RAM is known to be prone to bit flips (albeit with low probability). If the NextCloud client uploads without verifying the content (via a checksum) after each upload, the uploaded content risks corruption.

Here are the simplified steps that I think NextCloud client performs when uploading a file:

  1. NextCloud client reads the file from the hard drive to RAM.
  2. NextCloud sends the data in the RAM to the server through the network.

I’m wondering whether, after Step 2, there is a Step 3:

  3. NextCloud client and server verify the checksum of the uploaded file. NextCloud client verifies the checksum of the file as it is on the hard drive.

I couldn’t find any checksum files stored on the disk and I doubt whether the NextCloud client performs Step 3.

Hi @xuhdev

If you are looking for traditional checksum files as you imagine, you won’t find them.

However, files are checked for compliance during upload. If even a single bit is incorrect, it will be detected. During upload, a file is divided into chunks. These chunks are uploaded along with their checksums and verified on the server. If all chunks are valid, they are assembled into the original file. If a chunk is corrupted, it is re-uploaded until all chunks are correct. You can see the relevant code here:

and here:


But you can install the checksum app:

App-Id:           checksum
App-Name:         Checksum
Summary:          Creating a hash checksum of a file.
Categories:       files, tools
App-Version:      1.2.4
Repository:       GitHub - westberliner/checksum: Plugin for Nextcloud and ownCloud to create hashes of files.
Issue-Tracker:    Issues · westberliner/checksum · GitHub
User-Doc.:        checksum/README.md at master · westberliner/checksum · GitHub
PHP min/max:      7.4 /
PHP min-intsize:  32
NC min/max:       21 / 29
Not-shipped:      (not included) App available in appstore
Appstore:         Checksum - Apps - App Store - Nextcloud

For files where absolute integrity is crucial, I create the hashes myself using custom software/scripts. These hashes are then stored in the directory, and all files can be permanently checked against those checksum files.
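
To illustrate the kind of script meant here, below is a minimal Python sketch, not the exact tooling I use. It assumes SHA-256 and a manifest file named `SHA256SUMS` stored next to the files; the file name and the function names `write_manifest` and `verify_manifest` are made up for this example. The manifest uses the "hash, two spaces, file name" layout, so it should also be checkable with `sha256sum -c`.

```python
import hashlib
from pathlib import Path

MANIFEST_NAME = "SHA256SUMS"  # hypothetical manifest name, chosen for this sketch


def sha256_of(path, block_size=1 << 20):
    """Hash a file in blocks so large files are not loaded into RAM at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def write_manifest(directory):
    """Store one '<hash>  <name>' line per regular file in the directory."""
    directory = Path(directory)
    lines = []
    for entry in sorted(directory.iterdir()):
        if entry.is_file() and entry.name != MANIFEST_NAME:
            lines.append(f"{sha256_of(entry)}  {entry.name}")
    manifest = directory / MANIFEST_NAME
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")


def verify_manifest(directory):
    """Return the names of files that are missing or whose hash has changed."""
    directory = Path(directory)
    mismatched = []
    manifest = directory / MANIFEST_NAME
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, name = line.split("  ", 1)
        target = directory / name
        if not target.is_file() or sha256_of(target) != expected:
            mismatched.append(name)
    return mismatched
```

Run `write_manifest` once while the files are known to be good, then call `verify_manifest` on any synced copy later; an empty list means every listed file still matches. Note that this only covers files directly in the directory, not subdirectories.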


Much and good luck,
ernolf


Thanks for the detailed reply. The original comment describes the steps that the NextCloud client follows to upload a file:

  1. NextCloud client reads the file from the hard drive to RAM.
  2. NextCloud sends the data in the RAM to the server through the network.

Based on your comment, it looks like the NextCloud client only computes the checksum after Step 1. (And files are divided into chunks.) Would this imply that the NextCloud client is still prone to RAM errors that occur after Step 1 and before the checksums are generated?

The Checksum app looks promising for solving this issue.

How would you calculate the checksum of a file before loading it into RAM? It would have to be pre-calculated and taken from a database or from the filesystem (hashed as in ZFS or btrfs).

I see the difficulty now. The NextCloud client doesn’t control the file writing process, so it must read the file again to calculate the checksum. Effectively, it would need to read the file twice and calculate the checksum twice to verify it, which isn’t very cost-effective.
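
To make the cost concrete, here is a minimal Python sketch of the read-twice idea. This is not what the NextCloud client does; the function names `file_sha256` and `stable_checksum` and the choice of SHA-256 are assumptions for illustration only.

```python
import hashlib


def file_sha256(path, block_size=1 << 20):
    """Read the file from start to end and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def stable_checksum(path):
    """Hash the file twice and only trust the result if both passes agree.

    A disagreement means the bytes seen by one pass differed from the other,
    for example because of a memory error or because the file changed
    between the two reads.
    """
    first = file_sha256(path)
    second = file_sha256(path)
    if first != second:
        raise OSError(f"checksum mismatch between two reads of {path}")
    return first
```

Every upload would then pay for two full reads and two hash computations. Also, the second read will often be served from the operating system's page cache rather than the disk, so the two passes are not fully independent and a bit flip in cached data could go unnoticed.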

Does NextCloud plan anything to take checksums from the filesystem?

I don’t think so. You cannot rely on the client side having a file system that supports that (I’m not sure which file systems other than ZFS and btrfs support file hashes).

Perhaps consider additional backups on such file systems, with regular snapshots that date back a while.

What Nextcloud could do is store a hash/checksum in the filecache table (the client also has its own small sync database) and verify it there as well. I think ownCloud had this implemented at some point; I’m not sure whether it worked well or what Nextcloud did about it. It might be worth checking the bug trackers to gather that information.