Nextcloud is a great product, and the company’s philosophy is exemplary as well. However, as a long-time user, it pains me to see how new features are constantly introduced while the foundation upon which everything is built is entirely overlooked, and deficiencies are left unaddressed.
So, what do I mean by this?
At its core, Nextcloud is a tool for storing files and exchanging them. And here is where, from my perspective, the foundation is “broken by design.”
The focus is on the user, rather than the data that needs to be shared and stored! This becomes very apparent through two problems, both of which stem from the architecture and usage of the database. A detailed description of the problem can be found here: GitHub Issue Link
Brief Description of Problem 1:
Users and groups come from the Active Directory.
Shares from the Windows file server are integrated as external storage, with the option to store login credentials in the database. This essentially turns Nextcloud into a web frontend for the MS file server. Permissions for actions on files are determined by ACLs on the file server, which is not an issue, as external shares in Nextcloud can be mounted as read-only or writable.
However, problems arise when the Nextcloud client is used to sync files to a client machine. If User A modifies a file and User B is allowed to access the same file, User B will not receive the changes made by User A.
Brief Description of Problem 2 (actually two problems):
Here again, the data is stored on an external storage, and users are coming from the Active Directory.
There are 53 users in Nextcloud, of which 24 are from the AD, and the rest are local Nextcloud users. The problem is that the Nextcloud database is HUGE! It’s a whopping 28.2 GB according to the web interface. The externally mounted share encompasses over 3TB of data, with more than 3,000,000 files and 300,000 directories. Another problem arises when individual directories are shared from AD users to Nextcloud-local users to exchange data. While this works, the displayed data isn’t truly up-to-date.
Normally, data viewed through a web browser is updated as changes occur on the external storage, even if those changes weren’t made through Nextcloud itself, because the browser triggers a refresh of the directory that detects new and modified data. In this case, however, the refresh is ineffective: it either takes too long or is not limited to the directory being viewed.
So, why is the database so massive?
Unfortunately, a separate file index is kept in the oc_filecache for each user! An entry is generated for each of the 3,000,000 files for every user. For 24 AD users, that’s 72,000,000 (seventy-two million) entries in the oc_filecache. Each new user adds another 3,000,000 rows.
This is highly inefficient and problematic. Nextcloud should behave more like a file system at this point, requiring just one entry per file.
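The arithmetic above can be sketched directly (a trivial back-of-the-envelope calculation, using the figures from this installation):

```python
# oc_filecache rows with the current per-user index versus a
# hypothetical single index per file (figures from the setup above).
files = 3_000_000   # files on the externally mounted share
ad_users = 24       # AD users who see that share

rows_per_user_index = files * ad_users  # one entry per file *per user*
rows_per_file_index = files             # one entry per file, full stop

print(f"per-user index: {rows_per_user_index:,} rows")  # 72,000,000
print(f"per-file index: {rows_per_file_index:,} rows")  # 3,000,000
# Every additional user adds another 3,000,000 rows under the current design.
```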
The problem lies in how paths are constructed in Nextcloud: “user/files/share”.
Whether it’s external or internal storage, the structure and the fundamental problem are the same. Running “files:scan --path="user/files/share/" user” takes three hours to complete. This seems to be the reason the browser display isn’t current: the refresh triggered by the browser (e.g. by pressing F5) is not restricted to the directory currently being viewed; it starts at the share level and never reaches the correct directory.
For an updated display, it would be helpful if there were a continuous refresh parallel to the scan, specifically for the directory being viewed, thus enabling faster access to new data.
Optimization could be achieved by implementing a Windows service for the file server that scans the MFT and relays changes to Nextcloud. This way, it would only need to compare which data is more current, that from the service or what Nextcloud already has.
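A minimal sketch of such a change-relay service, using simple polling instead of real MFT/USN-journal access (the `notify` callback that would hand changes to Nextcloud is a hypothetical placeholder, not an existing API):

```python
import os
import time

def snapshot(root):
    """Map every file under root to its (mtime, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            state[path] = (st.st_mtime, st.st_size)
    return state

def diff(old, new):
    """Return paths that were added, modified, or removed."""
    changed = [p for p, meta in new.items() if old.get(p) != meta]
    removed = [p for p in old if p not in new]
    return changed + removed

def relay_changes(root, notify, interval=5.0):
    """Poll the share and report each changed path to Nextcloud once."""
    old = snapshot(root)
    while True:
        time.sleep(interval)
        new = snapshot(root)
        for path in diff(old, new):
            notify(path)  # e.g. trigger a targeted rescan of just this path
        old = new
```

In a real Windows service the polling loop would be replaced by the NTFS USN change journal, so the file server itself reports what changed instead of being re-scanned; Nextcloud would then only compare which version is more current.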
The oc_filecache is also the issue for Problem 1. There are no columns indicating who changed what and when, nor is there information about who has access to the file. If Nextcloud behaved more like a file system, these problems wouldn’t exist. The oc_filecache should include details about when a file was changed by whom, and whether a push signal to reload the file is necessary for all clients. One entry per file should be sufficient. The oc_filecache should only indicate which groups are allowed to read/write/delete/share.
This could even be managed through system groups, whose membership is determined by other tables. Access rights through groups are already a part of Nextcloud; they just need to extend down to the necessary level.
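To illustrate what “one entry per file, with rights attached to groups” could look like, here is a sketch using an in-memory SQLite database. The table and column names are entirely hypothetical, not the actual Nextcloud schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- One row per file, regardless of how many users can see it.
CREATE TABLE file_index (
    fileid      INTEGER PRIMARY KEY,
    path        TEXT UNIQUE,   -- storage-relative path, no per-user prefix
    mtime       INTEGER,       -- when the file was last changed
    modified_by TEXT           -- who changed it (basis for a push signal)
);
-- Rights live on groups, not duplicated per user.
CREATE TABLE file_acl (
    fileid     INTEGER REFERENCES file_index(fileid),
    groupname  TEXT,
    can_read   INTEGER,
    can_write  INTEGER,
    can_delete INTEGER,
    can_share  INTEGER
);
""")
con.execute("INSERT INTO file_index VALUES (1, 'share/report.docx', 1700000000, 'userA')")
con.execute("INSERT INTO file_acl VALUES (1, 'ad-sales', 1, 1, 0, 1)")

# Resolving a user's rights becomes a join against group membership,
# instead of scanning millions of duplicated per-user rows.
rows = con.execute(
    "SELECT groupname, can_write FROM file_acl WHERE fileid = 1"
).fetchall()
print(rows)
```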
Unfortunately, the same issue that affects the file index also applies to the full-text index: each user has their own full-text index, although it should exist only once per file, carrying the same access rights as the file it indexes.
I can only hope that these fundamental design issues are addressed before the next AI integration is introduced. Currently, the behavior of the Filecache and the Full-Text Index compounds a problem that grows with the product of the number of users and the number of files.
Kind Regards
Chonta
PS: Don’t get me wrong, I really love Nextcloud. I would just love to see it use its full potential.
I will gladly answer questions.