Issues with Full Text Search

Nextcloud version (eg, 20.0.5): 28.0.2
Operating system and version (eg, Ubuntu 20.04): Rocky Linux 9 / linuxserver.io Docker container (latest)
Apache or nginx version (eg, Apache 2.4.25): nginx/1.24.0 (fpm-fcgi)
PHP version (eg, 7.4): 8.2.13

The issue you are facing:

Based on the following article: Guillaume A | Setting Up Nextcloud Fulltext Search with Elasticsearch

I have added an extra docker container to my stack using the following:

  aio-fulltextsearch:
    image: nextcloud/aio-fulltextsearch:latest
    container_name: nextcloud_full_text_search
    environment:
      - TZ=Europe/London
      - ES_JAVA_OPTS=-Xms512M -Xmx512M
      - bootstrap.memory_lock=true
      - cluster.name=nextcloud-aio
      - discovery.type=single-node
      - logger.org.elasticsearch.discovery=WARN
      - http.port=9200
      - xpack.license.self_generated.type=basic
      - xpack.security.enabled=false
    volumes:
      - ./fullsearch_data:/usr/share/elasticsearch/data:rw
    restart: unless-stopped
    networks:
      - nextcloud_fullsearch

I installed the three apps:

I then configured the ‘Full Text Search’ settings:

I was then able to execute:

docker exec -it nextcloud occ fulltextsearch:check

Full text search 28.0.0
{
    "search_platform": "OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatform",
    "app_navigation": "0",
    "provider_indexed": "",
    "cron_err_reset": "1706875803",
    "tick_ttl": "1800",
    "collection_indexing_list": "50",
    "migration_24": "1",
    "collection_internal": "local"
}
...

Followed by a test, which was fully successful:

docker exec -it nextcloud occ fulltextsearch:test

.Testing your current setup:
Creating mocked content provider. ok
Testing mocked provider: get indexable documents. (2 items) ok
Loading search platform. (Elasticsearch) ok
Testing search platform. ok
Locking process ok
Removing test. ok
Pausing 3 seconds 1 2 3 ok
Initializing index mapping. ok
Indexing generated documents. ok
Pausing 3 seconds 1 2 3 ok
Retreiving content from a big index (license). (size: 32386) ok
Comparing document with source. ok
Searching basic keywords:
 - 'test' (result: 1, expected: ["simple"]) ok
 - 'document is a simple test' (result: 2, expected: ["simple","license"]) ok
 - '"document is a test"' (result: 0, expected: []) ok
 - '"document is a simple test"' (result: 1, expected: ["simple"]) ok
 - 'document is a simple -test' (result: 1, expected: ["license"]) ok
 - 'document is a simple +test' (result: 1, expected: ["simple"]) ok
 - '-document is a simple test' (result: 0, expected: []) ok
 - 'document is a simple +test +testing' (result: 1, expected: ["simple"]) ok
 - 'document is a simple +test -testing' (result: 0, expected: []) ok
 - 'document is a +simple -test -testing' (result: 0, expected: []) ok
 - '+document is a simple -test -testing' (result: 1, expected: ["license"]) ok
 - 'document is a +simple -license +testing' (result: 1, expected: ["simple"]) ok
Updating documents access. ok
Pausing 3 seconds 1 2 3 ok
Searching with group access rights:
 - 'license' - [] -  (result: 0, expected: []) ok
 - 'license' - ["group_1"] -  (result: 1, expected: ["license"]) ok
 - 'license' - ["group_1","Group_2"] -  (result: 1, expected: ["license"]) ok
 - 'license' - ["group_3","Group_2"] -  (result: 1, expected: ["license"]) ok
 - 'license' - ["group_3"] -  (result: 0, expected: []) ok
Searching with share rights:
 - 'license' - notuser -  (result: 0, expected: []) ok
 - 'license' - User number_2 -  (result: 1, expected: ["license"]) ok
 - 'license' - User3 -  (result: 1, expected: ["license"]) ok
 - 'license' - User@4 -  (result: 1, expected: ["license"]) ok
Removing test. ok
Unlocking process ok

And then a full index which I have allowed to complete:

docker exec -it nextcloud occ fulltextsearch:index

However, when I go to do a search in the web interface / iOS app, it’s very hit and miss.

For example, if I search for the word disabled, you can see it returns this file. When hovering over the result, you can see it also includes the words tamper and engineer.

If I then search for either tamper or engineer, no results are returned.

Am I missing something? In all honesty, I am running this stack on a Raspberry Pi 4 with 4GB RAM, so I am not sure whether resource constraints are a contributing factor.

I appreciate any support or feedback.

As a follow-up, I completely removed all files from the fullsearch_data folder.

Created a Dockerfile (Dockerfile.fts) with:

FROM elasticsearch:8.12.0
RUN bin/elasticsearch-plugin install --batch ingest-attachment

Then modified docker-compose.yml to:

  nextcloud-elasticsearch:
    build:
      context: .
      dockerfile: Dockerfile.fts
    container_name: nextcloud_elasticsearch
    environment:
      - TZ=Europe/London
      - ES_JAVA_OPTS=-Xms512M -Xmx512M
      - bootstrap.memory_lock=true
      - cluster.name=nextcloud-aio
      - discovery.type=single-node
      - logger.org.elasticsearch.discovery=WARN
      - http.port=9200
      - xpack.license.self_generated.type=basic
      - xpack.security.enabled=false
    volumes:
      - ./fullsearch_data:/usr/share/elasticsearch/data:rw
    restart: unless-stopped
    networks:
      - nextcloud_fullsearch

Brought up the stack, then reset the full text search and triggered a full index with:

docker exec -it nextcloud occ fulltextsearch:stop
docker exec -it nextcloud occ fulltextsearch:reset
docker exec -it nextcloud occ fulltextsearch:index

Things are pretty much the same: I can search for tampe and get a result, but tamper returns nothing:

I’ve tested multiple PDF files and an ODS file. I’ve also tested multiple browsers.
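
A possible next check is to query Elasticsearch directly, bypassing Nextcloud, to see whether tamper is missing from the index itself or only from the results Nextcloud shows. The following is a rough sketch, not a verified procedure. It assumes the container name nextcloud_elasticsearch from the compose file above, that curl is available inside that container, and that the commands run inside the container because port 9200 is not published to the host. The helper names are hypothetical, and the index name has to be read from the index listing, since it is whatever was entered in the Full Text Search settings.

```shell
#!/bin/sh
# Sketch: query Elasticsearch directly, without going through Nextcloud.
# Assumptions: container name and URL below match your setup, and curl
# exists inside the Elasticsearch container.
ES_CONTAINER="${ES_CONTAINER:-nextcloud_elasticsearch}"
ES_URL="${ES_URL:-http://localhost:9200}"

# Build a URI-search URL for a single-word term in one index.
# $1 = index name, $2 = search term (no URL-encoding is done here).
es_search_url() {
    printf '%s/%s/_search?q=%s&pretty\n' "$ES_URL" "$1" "$2"
}

# List all indices so the real index name can be read off.
es_list_indices() {
    docker exec "$ES_CONTAINER" curl -s "$ES_URL/_cat/indices?v"
}

# Run the search from inside the container.
es_search() {
    docker exec "$ES_CONTAINER" curl -s "$(es_search_url "$1" "$2")"
}
```

After sourcing the script, run es_list_indices to find the index name, then compare es_search INDEX tampe with es_search INDEX tamper, where INDEX is a placeholder for the name found in the listing. If both return hits here, the indexed data is probably fine and the difference is more likely in how the query is built on the Nextcloud side.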

Any thoughts?

Many thanks