Issues with Full Text Search

Nextcloud version (eg, 20.0.5): 28.0.2
Operating system and version (eg, Ubuntu 20.04): Rocky Linux 9 / linuxserver.io Docker container (latest)
Apache or nginx version (eg, Apache 2.4.25): nginx/1.24.0 (fpm-fcgi)
PHP version (eg, 7.4): 8.2.13

The issue you are facing:

Based on the following article: Guillaume A | Setting Up Nextcloud Fulltext Search with Elasticsearch

I have added an extra docker container to my stack using the following:

  aio-fulltextsearch:
    image: nextcloud/aio-fulltextsearch:latest
    container_name: nextcloud_full_text_search
    environment:
      - TZ=Europe/London
      - ES_JAVA_OPTS=-Xms512M -Xmx512M
      - bootstrap.memory_lock=true
      - cluster.name=nextcloud-aio
      - discovery.type=single-node
      - logger.org.elasticsearch.discovery=WARN
      - http.port=9200
      - xpack.license.self_generated.type=basic
      - xpack.security.enabled=false
    volumes:
      - ./fullsearch_data:/usr/share/elasticsearch/data:rw
    restart: unless-stopped
    networks:
      - nextcloud_fullsearch

I installed the three apps:

I then configured the ‘Full Text Search’ settings:

I was then able to execute:

docker exec -it nextcloud occ fulltextsearch:check

Full text search 28.0.0
{
    "search_platform": "OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatform",
    "app_navigation": "0",
    "provider_indexed": "",
    "cron_err_reset": "1706875803",
    "tick_ttl": "1800",
    "collection_indexing_list": "50",
    "migration_24": "1",
    "collection_internal": "local"
}
...

Followed by a test, which is fully successful:

docker exec -it nextcloud occ fulltextsearch:test

.Testing your current setup:
Creating mocked content provider. ok
Testing mocked provider: get indexable documents. (2 items) ok
Loading search platform. (Elasticsearch) ok
Testing search platform. ok
Locking process ok
Removing test. ok
Pausing 3 seconds 1 2 3 ok
Initializing index mapping. ok
Indexing generated documents. ok
Pausing 3 seconds 1 2 3 ok
Retreiving content from a big index (license). (size: 32386) ok
Comparing document with source. ok
Searching basic keywords:
 - 'test' (result: 1, expected: ["simple"]) ok
 - 'document is a simple test' (result: 2, expected: ["simple","license"]) ok
 - '"document is a test"' (result: 0, expected: []) ok
 - '"document is a simple test"' (result: 1, expected: ["simple"]) ok
 - 'document is a simple -test' (result: 1, expected: ["license"]) ok
 - 'document is a simple +test' (result: 1, expected: ["simple"]) ok
 - '-document is a simple test' (result: 0, expected: []) ok
 - 'document is a simple +test +testing' (result: 1, expected: ["simple"]) ok
 - 'document is a simple +test -testing' (result: 0, expected: []) ok
 - 'document is a +simple -test -testing' (result: 0, expected: []) ok
 - '+document is a simple -test -testing' (result: 1, expected: ["license"]) ok
 - 'document is a +simple -license +testing' (result: 1, expected: ["simple"]) ok
Updating documents access. ok
Pausing 3 seconds 1 2 3 ok
Searching with group access rights:
 - 'license' - [] -  (result: 0, expected: []) ok
 - 'license' - ["group_1"] -  (result: 1, expected: ["license"]) ok
 - 'license' - ["group_1","Group_2"] -  (result: 1, expected: ["license"]) ok
 - 'license' - ["group_3","Group_2"] -  (result: 1, expected: ["license"]) ok
 - 'license' - ["group_3"] -  (result: 0, expected: []) ok
Searching with share rights:
 - 'license' - notuser -  (result: 0, expected: []) ok
 - 'license' - User number_2 -  (result: 1, expected: ["license"]) ok
 - 'license' - User3 -  (result: 1, expected: ["license"]) ok
 - 'license' - User@4 -  (result: 1, expected: ["license"]) ok
Removing test. ok
Unlocking process ok

And then a full index which I have allowed to complete:

docker exec -it nextcloud occ fulltextsearch:index

However, when I go to do a search in the web interface / iOS app, it’s very hit and miss.

For example, if I search for the word disabled, it returns this file. Hovering over the result shows that the file also includes the words tamper and engineer.

If I then search for either tamper or engineer, no results are returned.

Am I missing something? In full honesty, I am running this stack on a Raspberry Pi 4 with 4GB RAM. I'm not sure whether resource constraints are a contributing factor.

I appreciate any support or feedback.

As a follow-up, I completely removed all files from the fullsearch_data folder.

Created a Dockerfile (Dockerfile.fts) with:

FROM elasticsearch:8.12.0
RUN bin/elasticsearch-plugin install --batch ingest-attachment

Then modified docker-compose.yml to:

  nextcloud-elasticsearch:
    build:
      context: .
      dockerfile: Dockerfile.fts
    container_name: nextcloud_elasticsearch
    environment:
      - TZ=Europe/London
      - ES_JAVA_OPTS=-Xms512M -Xmx512M
      - bootstrap.memory_lock=true
      - cluster.name=nextcloud-aio
      - discovery.type=single-node
      - logger.org.elasticsearch.discovery=WARN
      - http.port=9200
      - xpack.license.self_generated.type=basic
      - xpack.security.enabled=false
    volumes:
      - ./fullsearch_data:/usr/share/elasticsearch/data:rw
    restart: unless-stopped
    networks:
      - nextcloud_fullsearch

Brought up the stack, then reset the full text search and triggered a full index with:

docker exec -it nextcloud occ fulltextsearch:stop
docker exec -it nextcloud occ fulltextsearch:reset
docker exec -it nextcloud occ fulltextsearch:index

Things are pretty much the same. I can search for tampe and get a result, but tamper returns nothing.

I’ve tested multiple PDF files and an ODS file. I’ve also tested multiple browsers.
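
To narrow down whether the problem sits in Elasticsearch or in the Nextcloud search UI, the two terms could be compared directly against the index. This is only a minimal sketch: the index name my_index is an assumption (use whatever is set in the Full Text Search settings), the helper names are made up for illustration, and it assumes curl is available inside the Elasticsearch container. It uses Elasticsearch's URI search with the q parameter and reads the hit count from the response.

```shell
ES_CONTAINER="nextcloud_elasticsearch"   # container_name from the compose file above
ES_INDEX="my_index"                      # assumption: replace with your configured index name

# Build the URI-search URL for one term (size=0: only the hit count is needed).
search_url() {
  printf 'http://localhost:9200/%s/_search?q=%s&size=0\n' "$ES_INDEX" "$1"
}

# Pull hits.total.value out of the compact JSON response read on stdin.
parse_hits() {
  sed -n 's/.*"total":{"value":\([0-9][0-9]*\).*/\1/p'
}

# Run the search from inside the Elasticsearch container and print the hit count.
es_hits() {
  docker exec "$ES_CONTAINER" curl -s "$(search_url "$1")" | parse_hits
}
```

Running es_hits tampe and then es_hits tamper shows whether Elasticsearch itself has the full word indexed. If both return hits, the index is probably fine and the difference is more likely in how Nextcloud builds the query.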

Any thoughts?

Many thanks

Full Text Search is fiddly to get up and running. I’ve been chipping away at it on and off for a few weeks.

With that said, there are a few fundamentals you should probably check:

The docker exec -it nextcloud occ fulltextsearch:index command doesn’t look right to me (though I’m not familiar with Rocky or the linuxserver implementation of NC). When you run it, do you get a screen that looks like this? https://raw.githubusercontent.com/nextcloud/fulltextsearch/master/screenshots/OccFulltextsearchIndex.png

If not, try:
docker exec -it -u www-data nextcloud php occ fulltextsearch:index

If you do get something that looks like the above screen, are you sure indexing has finished? It could take a while depending on how much data needs indexing.
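
One rough way to see whether indexing is still in progress is to watch the document count of the index in Elasticsearch. The sketch below makes a few assumptions: the container name is taken from the compose file earlier in the thread, my_index is a placeholder for the index name in your Full Text Search settings, curl exists inside the Elasticsearch container, and the function names are purely illustrative. It calls Elasticsearch's _count API.

```shell
ES_CONTAINER="${ES_CONTAINER:-nextcloud_elasticsearch}"  # container_name from the compose file in this thread
ES_INDEX="${ES_INDEX:-my_index}"                         # assumption: use your configured index name

# Build the URL of the _count API for the index.
count_url() {
  printf 'http://localhost:9200/%s/_count\n' "$ES_INDEX"
}

# Extract N from a compact {"count":N,...} response read on stdin.
parse_count() {
  sed -n 's/.*"count":\([0-9][0-9]*\).*/\1/p'
}

# Print how many documents the index currently holds.
es_doc_count() {
  docker exec "$ES_CONTAINER" curl -s "$(count_url)" | parse_count
}
```

Calling es_doc_count a couple of times a minute or so apart should show whether the number is still climbing. If it is, the index run has not finished yet.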

Are the files on external storage? If so, change the external files option in the Full Text Search settings to index both path and content.

Are the files formatted strangely or corrupted?
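
If the files are PDFs, one way to check for a strange text layer is to extract the text yourself and count the word you are searching for. This is a sketch under assumptions: it needs pdftotext (from poppler-utils) on the machine where you run it, the function names are made up, and Elasticsearch extracts text with Apache Tika rather than poppler, so the result is only a hint and may not match what ends up in the index.

```shell
# Count whole-word, case-insensitive occurrences of a term in text read from stdin.
count_word() {
  grep -o -i -w -- "$1" | wc -l | tr -d ' '
}

# Extract the text layer of a PDF and count a term in it.
# "$1" is the path to a PDF (whichever file fails to show up), "$2" is the term.
pdf_count_word() {
  pdftotext "$1" - | count_word "$2"
}
```

For example, pdf_count_word yourfile.pdf tamper returning 0 while pdf_count_word yourfile.pdf tampe returns something higher would suggest the word is broken up in the text layer (for instance by odd spacing or hyphenation), which could explain why only the partial term matches.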

The repo has more information that might help your troubleshooting: GitHub - nextcloud/fulltextsearch: 🔍 Core of the full-text search framework for Nextcloud

You may also want to put a post in the linuxserver forums to find out what changes the linuxserver team have made to the stock NC image.

Good luck!
