Will Elasticsearch update the index automatically?

Hi All

Will a correct and working configuration of Elasticsearch full text search automatically update the index as files are added? Or is there something we have to run from cron?

sudo -u apache php /var/www/html/nextcloud/occ fulltextsearch:index
worked correctly in creating an index of all the files that are currently in the instance, but what about newly added files?

Regards

Hans

After the files have been indexed for the first time, you don’t need a separate cron job to update the index, as this is done in the context of NC’s cron job. This means that you have to wait until the next cron job has run before you can find new/changed documents with the full text search.

If in doubt, you can test it by adding a new document and waiting for the cron job to run. After that, you should be able to find the newly added document in the full text search.
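
For reference, a minimal sketch of what the crontab entry for NC’s cron job might look like. The helper name `nc_cron_line` is hypothetical, the 15-minute interval is an assumption, and the install path is taken from the occ command in the first post; adjust all of them to your setup.

```shell
# Print a crontab line that runs Nextcloud's background jobs on a fixed interval.
# $1: minutes between runs, $2: Nextcloud install directory
nc_cron_line() {
    printf '*/%s * * * * php -f %s/cron.php\n' "$1" "$2"
}

# Assumed values: 15-minute interval, install path from the first post.
nc_cron_line 15 /var/www/html/nextcloud
```

The printed line would go into the crontab of the web server user (apache in the first post), and the background jobs mode in the admin settings has to be set to Cron for it to be used.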

In NC 14 there will be a “live” feature, and I assume that this will trigger an update as soon as new/changed files are detected.

@DecaTec thank you for the explanation. I will have a look at NC 14’s behaviour. One question that may be worth asking for other newcomers here: if your instance is very big, should you run the initial index manually before adding the Nextcloud cron job, so that you do not spawn multiple indexing processes? Or is this illogical of me to wonder about? :neutral_face:

Well, usually one of the first things to do on a new NC instance is configuring the cron job for every 15 minutes. In most cases, the full text search will be installed afterwards.
From my own experience: Cron runs every 15 min and the initial indexing took over an hour on my instance.

As long as there are no errors shown while first indexing, everything should be fine.
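
If you do start `fulltextsearch:index` by hand or from your own cron entry and want to be sure two copies of that command never overlap, one common approach is a lock file via `flock` from util-linux. This is only a sketch: the function name `run_locked` and the lock file path are hypothetical, and the lock does not coordinate with indexing that Nextcloud’s own cron.php triggers.

```shell
# Run a command only if nobody else currently holds the given lock file.
# $1: lock file path, remaining arguments: the command to run
run_locked() {
    lock="$1"
    shift
    # -n: do not wait for the lock; fail immediately if it is already taken
    flock -n "$lock" "$@"
}

# Example call (path and user as in the first post, lock file name is made up):
# run_locked /tmp/nc-fts-index.lock \
#     sudo -u apache php /var/www/html/nextcloud/occ fulltextsearch:index
```

With `-n`, a second invocation exits with a non-zero status instead of queueing up behind the first one, so a long initial index is not followed by a pile of waiting runs.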

Thank you so much. We have 10+ million files in production. I think our index will take a wee bit longer. But this is good info for any newcomers. Thanks again @DecaTec

More than 10m files?! That’s quite a lot. I just took another look at my NC: ~1h indexing for ~1500 files (mostly on external storage).
I would be interested to know how long initial indexing takes on your instance.

Funny you should say that, everyone seems to be interested in it :smile: Will bookmark this thread and report back if it finishes. (Note the if, not when :wink: )

Hi,

I don’t know whether my setup is correct, but I needed to add a cron job at the system level. I am using the Nextcloud Docker container.

And yes, if the cron job is up and running, indexing is automatic.

Alex

It looks like it failed?

Nope, haven’t done it yet on the production system. Waiting for downtime in December to kick it off.

Oh yeah, I have more than 12m files. I think the index scan will need more than 24 hours, with 16G of memory.

hmmm …
My NC14 with Elasticsearch full text search is working correctly!

But new documents (pdf, txt …) always take 7 days to appear in the index. For documents added within the last 1-7 days, the full text search returns no results.

Hi, does it also index external storages, and how often?
And if 10 users have the same external storage configured globally, does it index the files ten times?

Worth noting… Indexing doesn’t seem to occur inside the Docker NC image via cron.php. I currently have a containerized deployment of Nextcloud 17, and I have to either run fulltextsearch:live as a service on the host OS or run a cron job alongside the standard cron.php that runs fulltextsearch:index or fulltextsearch:live.
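
A rough sketch of how such a host-side job could call occ inside the container. Everything specific here is an assumption: the container name `nextcloud`, the `www-data` user and the occ location match the official Docker image but may differ in your deployment, and the function name `nc_occ` is made up.

```shell
# Run an occ subcommand inside the Nextcloud container from the host.
# NC_CONTAINER: container name (assumed default: nextcloud)
# DOCKER: container CLI to use (default: docker; DOCKER=echo gives a dry run)
nc_occ() {
    "${DOCKER:-docker}" exec -u www-data "${NC_CONTAINER:-nextcloud}" php occ "$@"
}

# Example calls, e.g. from a host cron job or a service unit:
# nc_occ fulltextsearch:index
# nc_occ fulltextsearch:live
```

Running it with `DOCKER=echo` first prints the command that would be executed, which is a cheap way to check the container name and user before scheduling it.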

Possibly something is wrong with the NC Docker image, not sure. I have had to edit cron.php so that it does not log this error:

[fulltextsearch] Warning: Exception while live index: OCA\FullTextSearch\Exceptions\RunnerAlreadyUpException - Index is already running

Looks like the live service and cron.php don’t play nice together.

I have upgraded to Docker NC 18, and I can confirm that cron.php does not index new/updated files. Does anyone have a clue?

Maybe related to this issue

What was your result? How long did it take?
I’m currently indexing about 400000 files in groupfolders for 80 users and it’s been running for 2 days :confused:

I don’t know when search will start working. Will it work, and will it index everything? So far I still can’t search for anything :frowning:

I will ask again:

If 10 people are using the same external storage,
will the indexer index the same files ten times?

chriz

Off the top of my head, I would say no, because Elasticsearch indexes the files on the storage directly and not via each user’s file listing.

It will, and it is kinda dumb, but the logic behind it is that every user should only be able to search the files they can access.