Backup for Nextcloud with S3 primary storage


Hi

I am using Nextcloud 19.0.3 with S3 as primary storage. What is the best method to back up the contents in S3? I don’t have much experience with Nextcloud, but I know it is somehow linked with the database to display the files, as the files in S3 are shown with seemingly random names (like urn:oid:11121).

As an admin, I want to be able to restore files even if a user purposefully deletes them (including from the Trash) or they are deleted through some malicious attack. Is that possible?

  1. I saw there is a versioning option. Is it suitable for my case? If the user deletes a file, the reference in the DB will also get deleted, right? So can the files still be restored?
  2. Nextcloud is already live. Is it possible to implement versioning now? How can it be done?
  3. Please let me know if there is an alternative method for backup, or if there is any other documentation link regarding backup with S3 as primary storage.

Do you really think it is a good idea to back up object storage to a flat file system?
I think you would then be better off using a flat file system for both server and backup.

I think and hope that S3 is a secure space that needs no extra backup because of its integrated backup locations.

For restore from S3 read:
How to restore backup of only one user when we are using nextcloud with s3 being the primary storage?

AWS S3 or S3 compatible storage?

AWS S3 or NC versioning?

install and enable the versions app? talking about nc versioning. aws s3 versioning -> aws console.

ever considered professional support from Nextcloud GmbH? your case looks like you would need it.

If you are using AWS S3, that ‘backup’ can also be achieved via cross-region replication. ( https://aws.amazon.com/blogs/aws/new-cross-region-replication-for-amazon-s3/ )

How many users do you currently have on your installation? As @Reiner_Nippes identified, it sounds as if your case might be best resolved with Enterprise support from Nextcloud GmbH.

How come your organisation implemented NC without considering the full environment it needed to utilise?

then you have a urn:oid:11121 object in another region and wondering where it belongs. or?

unless there are “log files” to tell you that it is important_letter.doc and belongs to joe smith, imho it’s pretty useless.

The poster hasn’t asked about the S3 object identifier, and as the objects are replicated, solving the URN:OID issue in the primary region bucket will resolve the issue of the ‘unknown’ URN:OID’s in the replicated bucket.

However the original post related to ‘backup’, which in itself is many things to many people.

There is also very little detail on how the OP has set up their ‘S3’ primary storage and whether it’s AWS or S3 compatible.

We use S3 compatible as primary storage, with region replication, and have never experienced the issue of apparently orphaned URN:OID’s.

My understanding is that NC allocates the urn:oid:xxxxxx and stores it in what is effectively a lookup table between the actual file name and the object name. So querying the NC DB will give the original file name?
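The lookup described above can be sketched roughly like this. It is only an illustration, assuming the default `oc_` table prefix and that the number after urn:oid: is the `fileid` in `oc_filecache`. The in-memory SQLite tables are a simplified stand-in with invented rows, not the real Nextcloud schema, and the function names are hypothetical.

```python
import sqlite3

PREFIX = "urn:oid:"


def object_key_to_fileid(key: str) -> int:
    """Extract the numeric file id from an object key such as urn:oid:11121."""
    if not key.startswith(PREFIX):
        raise ValueError(f"not a urn:oid key: {key!r}")
    return int(key[len(PREFIX):])


def build_demo_db() -> sqlite3.Connection:
    """Simplified stand-in for the Nextcloud DB (assumed table/column names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE oc_storages (numeric_id INTEGER PRIMARY KEY, id TEXT);
        CREATE TABLE oc_filecache (fileid INTEGER PRIMARY KEY,
                                   storage INTEGER, path TEXT);
        -- invented example rows
        INSERT INTO oc_storages VALUES (1, 'object::user:joe');
        INSERT INTO oc_filecache VALUES (11121, 1, 'files/important_letter.doc');
        """
    )
    return conn


def lookup_object(conn: sqlite3.Connection, key: str):
    """Return (storage id, path) for an object key, or None if unknown."""
    fileid = object_key_to_fileid(key)
    return conn.execute(
        "SELECT s.id, f.path FROM oc_filecache f "
        "JOIN oc_storages s ON s.numeric_id = f.storage "
        "WHERE f.fileid = ?",
        (fileid,),
    ).fetchone()


if __name__ == "__main__":
    print(lookup_object(build_demo_db(), "urn:oid:11121"))
```

This also shows why the objects alone are of little use: without the matching database rows there is nothing to translate the key back into an owner and a path.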

So I am assuming the OP is also backing up the NC DB somewhere? Might as well be to the NC S3 bucket, and have the DB backups replicated cross region too for better disaster mitigation.

i had this in mind. @NeptuneUK

i’m not sure if you could just copy the s3 objects to another region instead of syncing. one could use external tools if aws doesn’t provide this as a feature.
so in this case you would end up with a lot of orphans. or?

Thanks, everyone, for the quick reply :slight_smile:

@devnull

I also consider AWS S3 a good option. But at my client’s office, they were using some other storage facility. We just implemented NC as an alternative because they were having issues with that storage facility. The issue occurred when one of the employees resigned and deleted the files while leaving. So my client doesn’t want that to happen again with NC.

@Reiner_Nippes

  1. AWS S3 or S3 compatible storage?
    Ans: AWS S3 is used and Nextcloud is installed on an EC2 server; both are in the same AWS region.

  2. AWS S3 or NC versioning?
    Ans: I am not sure which one to use. I don’t want to lose existing user data. As per my understanding, if I install and enable NC versioning, it uses S3 versioning in the background, right?

  3. install and enable the versions app? talking about nc versioning. aws s3 versioning -> aws console.
    Ans: Any suggestions on which one to use? I was planning to use NC versioning, and I hope that installing and enabling it now won’t affect existing user data, right?

@NeptuneUK

Thanks. Cross-region replication of the files along with the DB is also a good option for backup. As I don’t know how NC works, my concern was about losing the database references to the files when the user deletes them. As per my understanding, backing up the files alone is of no use when AWS S3 is used as primary storage. But in my case, as the user is purposefully deleting the files, I think the files will also get deleted from the replicated region, right?

Also, restoring the files of a single user with the help of the DB will be a huge problem.

I will also consider Nextcloud GmbH support.

@Bazil_Joseph
Sorry, I do not use AWS S3. But I think there must be version control and a history backup so that a single user cannot delete the data. And perhaps it costs money. I think AWS S3 must also give you backups from yesterday, last week, last month, …
If you do not trust AWS S3 for point-in-time backups and prefer your own backups, please do not use AWS S3 for primary storage. It makes no sense.

the object is deleted by nextcloud. of course the end user should have no access to s3. but you can’t stop an nc user from deleting his/her files unless you revoke his/her credentials. or?

i guess that is your “no go” with s3 as primary storage.

what i would suggest (talking about aws):

  • use normal ebs storage. if cost matters use the cheap hdd one.
  • make backups with restic.net/rclone.org to aws s3.

so you would have a “normal fs based nextcloud”. and your backup is stored on cheap, reliable storage. restic can “mount” its archive and you can browse through all files and versions.

as described here: Nextcloud Backup and Restore
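A rough sketch of what the restic part could look like when driven from Python. It only builds the command lines and does not run them. The bucket name, data directory and mount point are hypothetical placeholders, and the repository is assumed to have been initialised already, with AWS credentials and the restic password supplied through the environment.

```python
import shlex


def restic_repo(bucket: str) -> str:
    """Repository string for an AWS S3 bucket, in restic's s3: notation."""
    return f"s3:s3.amazonaws.com/{bucket}"


def restic_backup_command(bucket: str, data_dir: str) -> list:
    """Command that backs up the Nextcloud data directory to the bucket."""
    return ["restic", "-r", restic_repo(bucket), "backup", data_dir]


def restic_mount_command(bucket: str, mountpoint: str) -> list:
    """Command that mounts the archive so snapshots can be browsed as files."""
    return ["restic", "-r", restic_repo(bucket), "mount", mountpoint]


if __name__ == "__main__":
    # hypothetical names, adjust to your own setup
    print(shlex.join(restic_backup_command("my-nc-backup", "/var/nc-data")))
    print(shlex.join(restic_mount_command("my-nc-backup", "/mnt/restic")))
```

A database dump would need to be part of the backed-up data as well, otherwise the files come back without their Nextcloud metadata.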

i’m not sure, but aws s3 versioning is used neither by nextcloud nor by restic. that is to say, if you turn on aws s3 versioning, neither nc nor restic is aware of that feature. so if you change a document in nextcloud, that would result in two different urn:oid:xxxxxx objects, not two versions of the same object. (someone with deeper knowledge may confirm or correct this.)

aws versioning could make sense to protect your documents from mal-/ransomware. the only thing you have to do is configure everything in such a way that an attacker can’t get hold of credentials that allow turning off versioning or deleting old versions. i think that can be achieved with different iam roles. so if ransomware encrypts all your docs in nextcloud and gets hold of the backup mechanism, you would still have an older version of the backup. hope you get the idea.

if your documents need to be Armageddon-proof it would make sense to use the cross region replication from aws.

want to test: -> https://github.com/ReinerNippes/nextcloud/tree/nextcloud-reloaded to set up an nc with ebs storage and aws s3-restic backup in 20 minutes. follow the readme.
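The iam idea could look something like the sketch below. It builds an identity-based policy document that denies switching versioning off and permanently deleting old versions, meant for the role the nextcloud or backup host runs under. The bucket name is a hypothetical placeholder and the set of denied actions is an assumption about what is worth blocking, not a complete hardening guide.

```python
import json


def build_version_protection_policy(bucket: str) -> dict:
    """Deny the actions that would let an attacker destroy old versions."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # bucket-level: no suspending versioning, no lifecycle
                # rule that could expire old versions
                "Sid": "DenyVersioningChanges",
                "Effect": "Deny",
                "Action": [
                    "s3:PutBucketVersioning",
                    "s3:PutLifecycleConfiguration",
                ],
                "Resource": bucket_arn,
            },
            {
                # object-level: deleting a specific version is permanent
                "Sid": "DenyPermanentVersionDeletes",
                "Effect": "Deny",
                "Action": ["s3:DeleteObjectVersion"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "my-nc-bucket" is a made-up example name
    print(json.dumps(build_version_protection_policy("my-nc-bucket"), indent=2))
```

With versioning on, a plain delete from that role only adds a delete marker, so the older versions stay in the bucket until someone with a different, more privileged role removes them.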

The problem is: users make errors.
You need different backup versions.
A backup or RAID does not help.

nope. (means yes. of course.) but i was talking about a giant flood in northern virginia. and the end of region us-east. half of the internet will be gone and your data. that’s Armageddon. :wink:

then a cross region replication to eu-central may be helpful. because your backup version is on another continent.

@Reiner_Nippes
Yes. But I am talking about another Armageddon. The user deletes his data and notices it only a week or a month later. Where is the backup from last week or last month? On-prem or cloud?

talking about backup with restic it’s in the backup archive. you’ll find each backup in folder hosts// if you mount this archive with restic.
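to pick the right backup for the “deleted a month ago” case, the logic is simply “newest snapshot taken before the deletion”. a tiny sketch of that selection, with made-up timestamps and a hypothetical helper name (restic itself is not involved here):

```python
from datetime import datetime
from typing import Iterable, Optional


def latest_snapshot_before(snapshots: Iterable[datetime],
                           deleted_at: datetime) -> Optional[datetime]:
    """Return the newest snapshot time strictly before the deletion, if any."""
    candidates = [s for s in snapshots if s < deleted_at]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    # invented example: weekly snapshots, file deleted on 17 August
    snaps = [datetime(2020, 8, 1), datetime(2020, 8, 8), datetime(2020, 8, 15),
             datetime(2020, 8, 22)]
    print(latest_snapshot_before(snaps, datetime(2020, 8, 17)))
```

this only works if snapshots are kept long enough, so the retention period has to be longer than the time a user needs to notice the loss.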

talking about s3. i don’t know of a simple way to do the same backup with s3. you can’t protect against a user deleting a file and then wanting to have it restored weeks later.

ok?

@Reiner_Nippes
Oh. And is there also no solution for public cloud services like Microsoft Office 365 and OneDrive? Then S3 or object storage is trash.

to be more precise: backup for a nextcloud installation with s3 as primary storage.

aws s3 as a backup target for restic is just perfect.

if you plan to use s3 as primary storage you should know what you are doing.

if you plan a backup target for restic just choose from the long list.
native restic: https://restic.readthedocs.io/en/latest/030_preparing_a_new_repo.html
restic/rclone combination: https://rclone.org/overview/

Thank you @Bazil_Joseph for providing an organized and clear response to follow-up questions by members of the community.

Why do so many cloud storage systems, including Amazon S3, still use MD5 hash?

  • because it’s good enough
  • because it’s a pain in the … to change it.
  • abandoned projects
  • lack of awareness
  • ignorance

Yes and because of or because of not https://en.wikipedia.org/wiki/Birthday_problem

If version roll-back is something you’re after why not just use:

As for a user deleting files, why not just make use of the deleted files functionality of Nextcloud to retain deleted files and let users restore them?

This would seem to completely fulfil your stated scenario and is already available within Nextcloud. No other infrastructure to implement, configure or manage!

If you wanted the belt and braces, then cross region replication of files and databases is the additional implementation you need. AWS S3 by design is, within a region, safe. AWS S3 file durability is detailed here: https://aws.amazon.com/s3/faqs/