Getting hashes on file activity

Running script on file activity

I’m trying to set up a backup solution with file integrity checking where a Nextcloud server will be the main repository against which this backup will run. In my mind, in order to do that properly I need to have hashsums (md5, sha512, etc.) for files stored on the Nextcloud server. I see the following options (in order of preference):

  1. Nextcloud stores file hashsums somewhere - I know that there is a table oc_filecache in the database, but the only column I can see there that resembles a file hash is etag. How is the etag calculated (it doesn’t seem to be a simple MD5 or similar - I’ve checked a few examples and the etag differed from the output of the most popular hash generators)? Is there anything else?

  2. Running a script against every file activity (most likely creation, modification and maybe move) - is there a standard or at least a quasi-simple way to attach a script for file activity? The script would be a simple bash script for generating a hashsum for a given file.

  3. External monitoring for file activities:
    Option 1: Incron - the solution has some flaws (Incron is quite buggy) but it allows running scripts as in 2. Because of the bugs it’s very likely that some file activity will not be caught and the hash won’t be generated.
    Option 2: Direct monitoring of table oc_activity / log files. oc_activity + oc_filecache should suffice for the purpose.
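
For reference, the per-file script from option 2 could be as small as the sketch below. It is written in Python rather than bash purely as an illustration; `file_hash` and its parameters are made-up names, and how such a script would be triggered by Nextcloud is exactly the open question here.

```python
import hashlib


def file_hash(path, algorithm="sha512", chunk_size=1024 * 1024):
    """Return the hex digest of the file at `path`.

    The file is read in chunks so large files do not have to fit in memory.
    `algorithm` can be any name accepted by hashlib.new(), e.g. "md5".
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `file_hash("/some/file")` returns the SHA-512 digest as a hex string; storing it next to the file path and a timestamp would give the reference value I am after.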

Has anyone done something like that? Any help welcome.

I believe you can just run periodic cron jobs with rsync (good for local or remote drives) or restic (good for cloud solutions). Check the forum, it has already been discussed many times:

Thank you for your reply. I have checked these topics already (and many more in the forum) and they don’t bring me a step forward - I need to get some kind of hashsum right after the file has been uploaded/changed/etc. on the server side to have a reference, so that in like 5 years I’ll be able to tell that this file hasn’t changed (due to bitrot for example).
But - you gave me a hint with your other reply to use the audit app - I’ve checked its code and it’s very easy to extend this app to my needs (I wasn’t aware of hooks in apps). All in all I got what I wanted so thank you very much :slight_smile:

#EDIT - if anyone is interested - generating file hashes on server side is planned for Nextcloud 17 release:


Just an idea: you could build an automation workaround. Use a tagging app to give each newly uploaded file a tag:

Then use an app to pass the file to an external script that calculates the checksum or does whatever other job you need:

There is also a checksum app; maybe you can call it via the API somehow.

Hi,

I am very interested in generating file hashes on the server side. Did you manage to find any solution?

Martin

https://apps.nextcloud.com/apps/checksum

Thanks, I know about this option, but is there something that will hash my files automatically after an update and save the hash to a file?

Unfortunately, I’m still waiting for this to be completed:

but it seems that this issue doesn’t get any dev attention.

As a temporary solution (working already for 2+ years :slight_smile: ) I made a very simple script to hash all the files in the Nextcloud folder to run just before the backup process.
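
A minimal sketch of what such a pre-backup hashing pass might look like is below. This is not the actual script mentioned above; it is a Python illustration with assumed names (`sha256_of`, `build_manifest`, `changed_files`), and SHA-256 is an arbitrary choice of algorithm.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map every file under `root` (path relative to root) to its digest."""
    manifest = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def changed_files(old, new):
    """Return the sorted paths present in both manifests whose digest differs."""
    return sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
```

Comparing the manifest from the previous run with a fresh one via `changed_files` lists every file whose content differs. On its own that does not tell a legitimate edit apart from corruption, so the result still has to be checked against what was supposed to change.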

As a side note, solving issue 11138 would be much better - it would allow client-server comparison of files and possibly fix another problem I run into occasionally - when files get corrupted on the client side (mobile phones especially!) and are later propagated back to the server…