Nextcloud 26.0.5, Elasticsearch 8.10.0, plus the groupfolders app.
EDIT: nginx with php-fpm 8.2
Rebuilding the Fulltextsearch index takes ages. On a comparable instance without group folders, a full rebuild for 15 users and approx. 50,000 files takes half an hour. On the instance with ~20 group folders, it takes ~5 s per file, so a full rebuild means waiting about three days.
The machine (a VM) has 12 GB of RAM and 4 vCPUs. Even while indexing, the load average does not rise above 1.5. I've observed that occ fulltextsearch:index runs as a single thread that spends about 1/3 of its time in MariaDB and 2/3 in PHP. The system is not I/O-bound: the share of time spent waiting for I/O is mostly below 20%.
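As a rough sanity check of the three-day figure, the arithmetic can be sketched in Python. The numbers are the ones quoted above; the function name is only illustrative, and the sketch assumes the slow instance holds roughly the same 50,000 files as the comparable one.

```python
def rebuild_seconds(file_count: int, seconds_per_file: float) -> float:
    """Estimate total rebuild time for a single-threaded index run."""
    return file_count * seconds_per_file


# Assumption: ~50,000 files at ~5 s each on the instance with group folders.
slow = rebuild_seconds(50_000, 5.0)
# Observed: about half an hour on the comparable instance without group folders.
fast = 30 * 60

print(f"with group folders: {slow / 86_400:.1f} days")
print(f"slowdown factor: {slow / fast:.0f}x")
```

With these inputs the estimate lands just under three days, which is a slowdown of well over a factor of 100 compared with the half-hour rebuild.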
Nobody answers here? The observed behavior makes this app really unusable for production use. Imagine some thousands of users, hundreds of groups and group folders, and millions of files! They would wait years for the index to be built.
You are asking in a community forum, thus, I assume most users here don’t manage thousands of users.
Maybe you can provide more detailed steps on how to reproduce this issue and how to measure the execution time of the search. The file type would also be important to know.
You are asking in a community forum, thus, I assume most users here don’t manage thousands of users.
You’re right, but the community is deemed the testbed for bigger installations, and it’s the same code being used in production. Therefore I think it’s my responsibility to raise the alarm about the problem. Of course, for a private installation a wait time of three days isn’t that problematic. But look at it at a bigger scale: the slowdown is more than a factor of 100, and it could easily mean that reindexing takes years for somebody using this software in production.
Maybe you can provide more detailed steps on how to reproduce this issue and how to measure the execution time of the search.
It is not about the execution time of the search. It is about upgrading your Elasticsearch platform, where you are required to rebuild the index (for completeness: there can be other situations where you want to rebuild the index). How to reproduce the problem?
You have a Nextcloud instance (versions given at the start of the thread). Fulltextsearch is installed and working.
For some reason, you are required to rebuild the Fulltextsearch index. So you do that according to the docs: occ fulltextsearch:reset (to recreate the index), occ fulltextsearch:test (to check it basically works) and occ fulltextsearch:index (to rebuild the index).
For some time, you watch the index status display, until you notice that the file name being processed changes very slowly. So you start watching the Nextcloud log, and you see: for EVERY file there are as many log entries as you have users, and the interval between these messages (checking if file X is accessible by user Y) is approx. one second. Uuuh, let’s make another coffee, and wait…
You ask your favourite search engine about slow indexing, and some results say ‘using group folders slows down indexing terribly’. You remember you enabled group folders some time ago, just to see what they are, and forgot to deactivate them when you didn’t have any use for them.
You deactivate group folders and repeat the indexing: and instead of three days, now it takes half an hour to index all your files.
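To put a number on the "approx. one second" between access-check messages, the gaps between matching log entries can be measured. The following is a minimal sketch, assuming the Nextcloud log is written as one JSON object per line with "time" (ISO 8601) and "message" fields. The sample lines and the search string "checking access" are made up for illustration; the real message text needs to be copied from your own log.

```python
import json
from datetime import datetime
from statistics import mean


def mean_interval_seconds(lines, needle):
    """Mean gap in seconds between consecutive entries whose message contains `needle`."""
    stamps = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict) or "time" not in entry:
            continue  # skip entries without a timestamp
        if needle in str(entry.get("message", "")):
            stamps.append(datetime.fromisoformat(entry["time"]))
    gaps = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
    return mean(gaps) if gaps else None


# Hypothetical sample entries, NOT copied from a real Nextcloud log.
sample = [
    '{"time": "2023-09-20T10:00:00+00:00", "message": "checking access for user alice"}',
    '{"time": "2023-09-20T10:00:01+00:00", "message": "checking access for user bob"}',
    '{"time": "2023-09-20T10:00:02+00:00", "message": "checking access for user carol"}',
]
print(mean_interval_seconds(sample, "checking access"))
```

To run it against a real log, pass an open file instead of the sample list, e.g. `with open(path) as f: mean_interval_seconds(f, "...")`, where the path and the search string depend on your installation.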
So here I am, asking people to share their experience, and hoping the devs will listen if there’s a serious problem.
Initial indexing seems to run user by user (not storage by storage). Thus adding more group folders will add some overhead in any case. But you write that indexing takes 5 seconds for each file. Typical PDF/text files are indexed within milliseconds on my instances, and only if I use OCR does it take a couple of seconds per file. So the question is what kind of files you have that take so long.
Not sure if this improves your situation with regard to group folders. Maybe you want to try it out and report back.
If your issue is unrelated, I assume you’ll have to create a new issue to let the developers know!