Replace WebDAV with a distributed file system

Guillaume · August 9, 2016, 5:15pm

The current version of Nextcloud uses WebDAV for file synchronization. This protocol is quite old and has limitations that will be difficult to overcome.
For example, incremental synchronisation, scalability, fault tolerance and P2P sync are very difficult to implement on top of it.
Other software (such as BTsync or Syncthing) provides all of these capabilities and is a serious challenger to Nextcloud.

Using a distributed FS such as GlusterFS or Ceph for file synchronization would give Nextcloud all of these features natively. It would be a good foundation for the future. Of course, this would be a big change to the software architecture, but maybe not as big as we imagine: for example, Ceph is POSIX compliant and has a FUSE client, a REST interface, S3, Swift, …

Maybe this could be done for Nextcloud 12?

totalcaos · August 10, 2016, 9:52am

AFAIK GlusterFS is more of a server-side configuration that gives a server redundant storage on commodity hardware instead of a high-end storage array (e.g. EMC), and not a mechanism for replication. NC uses WebDAV for file sync between the server and the desktop.

Again, setting up Ceph or something similar is a server-side configuration, not something that can be set up as part of an NC installation, although it would be good if the client were able to use the librados libraries to sync files. This would depend completely on the availability of Ceph or S3-like object storage on the server side.

Guillaume · August 10, 2016, 12:55pm

An example with GlusterFS:

The NC server implements a Gluster brick.
Each NC desktop client implements a Gluster brick.
All these bricks are in replicated mode, so you have a “Replicated GlusterFS Volume”.

Sync Done.

totalcaos · August 10, 2016, 1:00pm

GlusterFS is latency dependent. Since self-heal checks are done when establishing the FD and the client connects to all the servers in the volume simultaneously, high-latency (multi-zone) replication or replication over the internet is not advisable. Each lookup will query both sides of the replica. A simple directory listing of 100 files across a 200ms connection would require 400 round trips totaling 80 seconds. Imagine trying this over a high-latency internet connection: the system will essentially grind to a halt.
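
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. It assumes 4 round trips per file (the ratio implied by 400 round trips for 100 files) and that the round trips happen one after another. The function name and its parameters are made up for illustration and are not part of any GlusterFS tooling.

```python
def listing_time_seconds(num_files, round_trip_ms, round_trips_per_file=4):
    """Estimate how long a serial directory listing takes.

    round_trips_per_file=4 is an assumption derived from the figures
    quoted above (400 round trips for 100 files); it is not a
    documented GlusterFS constant.
    """
    total_round_trips = num_files * round_trips_per_file
    total_seconds = total_round_trips * round_trip_ms / 1000.0
    return total_round_trips, total_seconds


# The case from the post: 100 files over a 200 ms link.
trips, seconds = listing_time_seconds(100, 200)
print(trips, seconds)  # 400 80.0
```

Under the same assumptions, a 1 ms LAN link gives `listing_time_seconds(100, 1)`, i.e. 0.4 seconds, which is why this design is fine inside a datacenter but not between a server and desktop clients across the internet.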

I see what you are trying to do in terms of incremental updates, ability to snapshot, etc, but a replicated filesystem is not the best way to achieve this.

radoeka · October 21, 2017, 4:17pm

https://apps.nextcloud.com/apps/files_external_sia