Security improvements for the paranoid

Is it possible to make synchronization happen on demand, with explicit confirmation on both ends: Nextcloud and the PC/mobile phone?
Synchronization could be initiated via the Nextcloud web UI or via the UI of the PC/mobile app, but it would then have to be confirmed via the UI on both ends.
I would prefer to synchronize my data, say, once a week or daily, with changes committed into Nextcloud the way source control systems (like Git or SVN) do it.
I would like to control exactly which files are synchronized.

I have about 50K files to store/sync. Imagine that I get some sophisticated virus on my PC which slowly but surely corrupts data in my files. By the time I figure it out (several days or weeks later), this virus may have corrupted thousands of my files, and they would immediately be synced to Nextcloud and to ALL my other devices (as synchronization works now).
Just imagine how much time it would take me to go through the whole change history on the Nextcloud portal and manually roll back, file by file, to the closest valid version.

On top of this, imagine that you have a virus corrupting data on one of your devices, and that wrong data is immediately synced to ALL your devices.
The more devices you have, the greater the chance of getting a virus, of having your data corrupted on ALL your devices, and of nearly losing your data.

Confirmation on both sides would be helpful.
Scenario:
A virus on your PC is corrupting data in your files and is silently trying to synchronize the changes into Nextcloud; unless you confirm this synchronization on the Nextcloud side (via the web UI), it would not complete.
And vice versa: a virus could potentially get into your Nextcloud, in which case your PC/mobile app would not allow synchronization without your confirmation.

With sync on demand, the history would be much shorter and more granular and, as a result, more controllable.

Bye bye to the cloud and welcome again to file sharing on demand?
:smirk:

Please refine your requirements by becoming more familiar with the concepts of cloud storage in general and of Nextcloud Files in particular. This may help enable a more fruitful discussion in the future, I would hope.

Why don’t you run antivirus software on your Nextcloud server? There is also an app that scans files from within Nextcloud.

Actually, I need a backup system that works in a similar way to a version control system (like Git). Roles: Server, the machine where the backup is stored; Client, the machine that holds the data to back up.

Requirements:

  1. Client doesn’t have direct file access to Server;
  2. There is a specific protocol between Client and Server which describes file operations (CUD): Create, Update, Delete, as well as Read;
  3. Server stores the history of changes. Every item of the history is a change-set. A change-set consists of a set of file CUD-operations. Every change-set is atomic;
  4. Explicit confirmation of synchronization (pull/push) on both sides: Client and Server.
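
To make requirements 2 to 4 a bit more concrete, here is a minimal Python sketch of a change-set that is applied atomically and only after confirmation on the server side. All names (`Op`, `FileOp`, `Server`, `apply`) are made up for illustration, and the "server" is just an in-memory dictionary; this is not an existing protocol and not how Nextcloud works.

```python
from dataclasses import dataclass, field
from enum import Enum


class Op(Enum):
    CREATE = "create"
    UPDATE = "update"
    DELETE = "delete"


@dataclass(frozen=True)
class FileOp:
    """One CUD operation on one file (hypothetical wire format)."""
    op: Op
    path: str
    content: bytes = b""  # ignored for DELETE


@dataclass
class Server:
    files: dict = field(default_factory=dict)    # path -> content
    history: list = field(default_factory=list)  # accepted change-sets, oldest first

    def apply(self, change_set, confirmed):
        """Apply a whole change-set or nothing; refuse it unless confirmed."""
        if not confirmed:
            return False
        staged = dict(self.files)  # work on a copy so a failure changes nothing
        for file_op in change_set:
            exists = file_op.path in staged
            if file_op.op is Op.CREATE:
                if exists:
                    raise ValueError(f"already exists: {file_op.path}")
                staged[file_op.path] = file_op.content
            elif file_op.op is Op.UPDATE:
                if not exists:
                    raise ValueError(f"cannot update missing file: {file_op.path}")
                staged[file_op.path] = file_op.content
            else:  # Op.DELETE
                if not exists:
                    raise ValueError(f"cannot delete missing file: {file_op.path}")
                del staged[file_op.path]
        # Only reached if every operation succeeded: commit in one step.
        self.files = staged
        self.history.append(list(change_set))
        return True
```

The history then contains one entry per confirmed change-set rather than one entry per file write, which is the property the requirements are after.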

Git/SVN would work, but they include many things that are unneeded for plain backup. For example, you normally don’t need to write a comment for every change. At the same time, Git/SVN don’t have confirmation of the push operation on the server side.

Idea! Maybe the push operation per se should be forbidden entirely, and only pull permitted in this protocol. In that case a push from Client to Server is replaced with a pull by the Server from the Client. Then no additional confirmation (on the Client and/or Server side) is needed, because a pull is always initiated on the side being modified. It would be explicit, it would be virus-proof, it would be safe, it would be controllable, even if we initiated the pull via the web UI of that Server.

Antivirus software will not help if you get a virus/malware on your Client. It could make a big mess on the Server (Nextcloud). What if that virus makes a million changes in tens of thousands of your files over an extended period of time? In this situation the Nextcloud history would be flooded and totally useless for a human. And since the period of time is extended, the good changes and bad changes would be thoroughly mixed in that flooded history.

@yshpakov IMHO there is too much being asked for, too little differentiation, and too many contradictory requirements, unfortunately. This does not fit into a reasonable frame, certainly not that of a so-called “home user”. You should set your priorities.


:shield:

It would be a grave misunderstanding to presume that the two terms in each of the pairs (i.e. tuples) below belong to the same domain.

  • cloud storage - backup
  • backup - archive
  • version control - archive
  • version control - cloud storage
  • database - archive

However, the above tuples may be incomplete. Some items may even be mutually exclusive.

Furthermore, IMHO one should be familiar with concepts like:

However, these tuples are not mandatory and may be incomplete.

Hope this helps.
:smile:

From your first post, that is the answer I would give you.

rsync can do a lot. There is, for example, rsnapshot (https://rsnapshot.org/), which only downloads the changes since the last run; identical files are just hard-linked. This way, you can have a history of backups (so you can go back an hour, a day, a month or a year). There is a whole family of these programs with small differences; choose your preferred one.
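
To illustrate the linking idea only, here is a small Python sketch that takes a snapshot of a directory and hard-links every file that is identical to the one in the previous snapshot. The function name `take_snapshot` and the directory layout are assumptions for this example; it is not how rsnapshot is implemented, it skips empty directories, and it needs all snapshots to live on one filesystem that supports hard links.

```python
import filecmp
import os
import shutil
from pathlib import Path


def take_snapshot(source, previous, new):
    """Copy `source` into `new`, hard-linking files unchanged since `previous`.

    `previous` may be None for the very first snapshot.
    """
    source, new = Path(source), Path(new)
    previous = Path(previous) if previous is not None else None
    new.mkdir(parents=True, exist_ok=True)
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = new / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        old = previous / rel if previous is not None else None
        if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=False):
            # Identical content: share storage with the previous snapshot.
            os.link(old, dst)
        else:
            # New or changed file: store a fresh copy.
            shutil.copy2(src, dst)
```

Each snapshot directory then looks like a full copy, while unchanged files take up disk space only once.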

Ideally the backup server gets the data, so that an intruder can’t compromise your backup.

Ideally the backup server gets the data, so that an intruder can’t compromise your backup.

So the backup server would pull data from the source, rather than the source pushing the data to the backup server?

Please reconsider. You would be going in circles and end up with no true backup. A push could obviously write, change, or delete any data. That appeared to be one of your own core arguments, didn’t it?


Have a look at some of the more reasonable articles from the more reliable service providers. The article listed below is worth reading and is just one example of many.

Backups ensure that your critical data will survive any of the likely hazards.

Furthermore, there should be some sustainable storage concept if you truly want long-term archiving of backups (> 10 years). There is much more to consider if you want to do this seriously.

Please keep looking around; you may accomplish more through some research of your own in the long run. Good luck.
:four_leaf_clover:

Well. Only versioned backup solutions will help against malicious file edits. What’s the problem with this?

Example:

I take file system snapshots every hour for a week.
Each snapshot is read-only. They are all sent to a backup server.
The backups are then pruned to keep daily snapshots for a month and then weekly snapshots for a year.

All very easy to do with the built-in features of the Btrfs or ZFS filesystems.
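
The pruning rule from the example can be sketched in a few lines of Python. This only decides which snapshot timestamps to keep; it does not call Btrfs or ZFS. The function name `snapshots_to_keep` is made up, and the sketch assumes that "a month" means 30 days, "a year" means 365 days, "daily" means the earliest snapshot of each calendar day, and "weekly" means the earliest snapshot of each ISO week.

```python
from datetime import timedelta


def snapshots_to_keep(timestamps, now):
    """Return the set of snapshot datetimes retained by the policy:
    everything from the last 7 days, one per day up to 30 days,
    one per ISO week up to 365 days, nothing older."""
    keep = set()
    seen_days = set()
    seen_weeks = set()
    for ts in sorted(timestamps):  # oldest first, so the earliest of each day/week wins
        age = now - ts
        if age <= timedelta(days=7):
            keep.add(ts)
        elif age <= timedelta(days=30):
            day = ts.date()
            if day not in seen_days:
                seen_days.add(day)
                keep.add(ts)
        elif age <= timedelta(days=365):
            week = tuple(ts.isocalendar())[:2]  # (ISO year, ISO week number)
            if week not in seen_weeks:
                seen_weeks.add(week)
                keep.add(ts)
    return keep
```

Every snapshot that is not in the returned set would be a candidate for deletion on the backup server.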

Obviously you also need antivirus on your server and if possible on the clients.

restic.net (+rclone.org)? rclone only if you are not satisfied with the choice of targets that restic supports.

@yshpakov FYI

Thank you, @TP75

Several years ago, when I first started thinking about data safety solutions, I decided that I would have something like a NAS at home (to store a copy of my data) and sync it to a cloud like Dropbox. So I had already thought about that 3-2-1 backup rule, but thank you for that link anyway.

But I want more than just backup.
I work with my data in different locations on different computers (at home, at work), so I want:

  1. to sync different copies of my data with each other (I can modify one file at work and another one at home);
  2. availability of my data over the Internet, which is a nice thing too. What if I don’t have my laptop with me but all of a sudden need my data right away?

But one more time: I want synchronization to happen not automatically but exactly when I want it to happen, so manually. And with the option to go through all the changes that are not yet synchronized but are ready to be. So I need total manual control.

And as I mentioned, I want to get rid of the push operation 100%. Why? Because in case of a virus/malware or some targeted hacker attack, as soon as my laptop is under malicious control, it can start randomly modifying my files one by one and pushing the changes to my local NAS and Dropbox. Just imagine that within a week I would have 1 million sync operations in my Dropbox. The history of my Dropbox would be flooded, and it would take me 1 million years to go through all the history and sort it out.

So with the push operation allowed, as soon as a sophisticated enough attack hits just one of my syncing devices, the history will be flooded on ALL of my devices that support history. And restoring my data will be extremely time-consuming.

Even if I set up scheduled synchronization, say once a day, the virus can ignore that and perform 1 million synchronizations a day anyway, in spite of my once-a-day setting.

If only pull operation is supported, this scenario is not possible. With pull-only synchronization:

  1. I go to my device A (say a local NAS) over the web interface or over Remote Desktop (RDP), which is even more secure
  2. I initiate one-way pull synchronization with my other device B (say my laptop, possibly with a virus): A <= B
  3. device A connects to device B, determines all the diffs, and shows me all those diffs to approve
  4. as soon as I approve them, device A applies those diffs to ITSELF ONLY

In this scenario, as long as device A is healthy, the history on device A is totally under control. If device A is infected, it affects only device A but not device B. And so as soon as I find that the data on one of my devices is corrupted, I will just destroy the corrupted data, and that’s it.
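
The four steps above can be sketched in Python roughly like this. It is only an illustration under strong assumptions: device B is reachable from device A as a readable directory (for example a read-only mount), the names `snapshot`, `diff` and `pull` are made up, and the `approve` callback stands in for me looking at the diffs and clicking "OK". History keeping on device A is left out.

```python
import hashlib
import shutil
from pathlib import Path


def snapshot(root):
    """Map each file's relative path to the SHA-256 of its content."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff(a, b):
    """Changes that would make snapshot `a` (device A) match snapshot `b` (device B)."""
    return {
        "create": sorted(b.keys() - a.keys()),
        "update": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
        "delete": sorted(a.keys() - b.keys()),
    }


def pull(a_root, b_root, approve):
    """Run on device A: read B, show the diffs, and change A only after approval."""
    a_root, b_root = Path(a_root), Path(b_root)
    changes = diff(snapshot(a_root), snapshot(b_root))
    if not any(changes.values()) or not approve(changes):
        return False  # nothing to do, or the user rejected the diffs
    for rel in changes["create"] + changes["update"]:
        target = a_root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(b_root / rel, target)  # B is only ever read from
    for rel in changes["delete"]:
        (a_root / rel).unlink()
    return True
```

A call like `pull("/data/nas", "/mnt/laptop", approve=my_review_function)` (paths and callback are placeholders) would then change only the NAS copy, and only if the review function returns True.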

So I would call it security via synchronization protocol limitations (one-way-only or pull-only synchronization).

Doesn’t that sound reasonable?

With a regular ‘pull’ and a correct ‘rsync’ setup, you are on the safe side, I presume.
:+1:

Happy hacking.
:sunflower:
