Fulltextsearch:index runs and indexes files, but content is not searchable

I am running Nextcloud in a Docker container and am trying to set up full-text search using Elasticsearch in another Docker container. I have many PDFs that were OCR'd before being uploaded to Nextcloud. I can search on document names, but not on the content of these files.

I have the ingest-attachment plugin enabled according to the docs on the bitnami/elasticsearch site, but occ fulltextsearch:index reports an error:

There are no ingest nodes in this cluster, unable to forward request to an ingest node.

I’d appreciate any help anyone can give.

Here is my docker-compose.yml:

version: '2'

services:
  elasticsearch:
    image: bitnami/elasticsearch:latest
    container_name: elasticsearch
    user: root
    environment:
      - ELASTICSEARCH_PLUGINS=ingest-attachment
    volumes:
      - ./data:/bitnami/elasticsearch/data
    ports:
      - 9200:9200
      - 9300:9300
    networks:
      BACKBONE:

fulltextsearch:test

abc@ab6982ab2c21:/config/www/nextcloud$ php ./occ fulltextsearch:test

Testing your current setup:
Creating mocked content provider. ok
Testing mocked provider: get indexable documents. (2 items) ok
Loading search platform. (Elasticsearch) ok
Testing search platform. ok
Locking process ok
Removing test. ok
Pausing 3 seconds 1 2 3 ok
Initializing index mapping. ok
Indexing generated documents. ok
Pausing 3 seconds 1 2 3 ok
Retreiving content from a big index (license). (size: 32386) ok
Comparing document with source. ok
Searching basic keywords:

  • 'test' (result: 1, expected: ["simple"]) ok

  • 'document is a simple test' (result: 2, expected: ["simple","license"]) ok

  • '"document is a test"' (result: 0, expected: []) ok

  • '"document is a simple test"' (result: 1, expected: ["simple"]) ok

  • 'document is a simple -test' (result: 1, expected: ["license"]) ok

  • 'document is a simple +test' (result: 1, expected: ["simple"]) ok

  • '-document is a simple test' (result: 0, expected: []) ok

Updating documents access. ok
Pausing 3 seconds 1 2 3 ok
Searching with group access rights:

  • 'license' - [] - (result: 0, expected: []) ok

  • 'license' - ["group_1"] - (result: 1, expected: ["license"]) ok

  • 'license' - ["group_1","group_2"] - (result: 1, expected: ["license"]) ok

  • 'license' - ["group_3","group_2"] - (result: 1, expected: ["license"]) ok

  • 'license' - ["group_3"] - (result: 0, expected: []) ok

Searching with share rights:

  • 'license' - notuser - (result: 0, expected: []) ok

  • 'license' - user2 - (result: 1, expected: ["license"]) ok

  • 'license' - user3 - (result: 1, expected: ["license"]) ok

Removing test. ok
Unlocking process ok

fulltextsearch:check

abc@ab6982ab2c21:/config/www/nextcloud$ php ./occ fulltextsearch:check
Full text search 1.4.1

  • Search Platform:
    Elasticsearch 1.5.1
    {
      "elastic_host": [
        "http://192.168.10.6:9200"
      ],
      "elastic_index": "nextcloud_index",
      "fields_limit": "10000",
      "es_ver_below66": "0",
      "analyzer_tokenizer": "standard"
    }

  • Content Providers:
    Files 1.4.2
    {
      "files_local": "1",
      "files_external": "1",
      "files_group_folders": "1",
      "files_encrypted": "0",
      "files_federated": "0",
      "files_size": "100",
      "files_pdf": "1",
      "files_office": "1",
      "files_image": "0",
      "files_audio": "0"
    }

When I run fulltextsearch:index, I get the following error:

┌─ Errors ────
│ Error: 458/458
│ Index: files:1690
│ Exception: Elasticsearch\Common\Exceptions\ServerErrorResponseException
│ Message: There are no ingest nodes in this cluster, unable to forward request to an ingest node.
└──

When I run a curl command from the Nextcloud container, and from a different computer on the network, I get:

$ curl -X GET "192.168.10.6:9200/?pretty"
{
  "name" : "b5b6f04cdd47",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "t8FhwXhzSQGHXKf232O0Hw",
  "version" : {
    "number" : "7.6.2",
    "build_flavor" : "oss",
    "build_type" : "tar",
    "build_hash" : "ef48eb35cf30adf4db14086e8aabd07ef6fb113f",
    "build_date" : "2020-03-26T06:34:37.794943Z",
    "build_snapshot" : false,
    "lucene_version" : "8.4.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
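Since the root endpoint above only proves the node is up, a more targeted check is the _cat/nodes API, which reports the roles each node actually carries. On Elasticsearch 7.x the node.role column prints one letter per role (d = data, i = ingest, m = master), so an ingest-capable node shows an "i". The host/port here are the ones from the post; adjust to your setup:

```shell
# List each node with its roles; no "i" in node.role means
# no ingest node, which is exactly what the indexing error says.
curl -s "http://192.168.10.6:9200/_cat/nodes?v&h=name,node.role"
```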


A bit late to the party I guess, but it may help others.

The error stems from the fact that you're running a single-node cluster with Bitnami's dockerized ES. The project configures its regular (non-dedicated) nodes with both the "master" and "data" roles, but not "ingest", so indexing fails whenever a request needs to be forwarded to an ingest pipeline.

See the actual source code for the logic.
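To make the diagnosis concrete, the role check can also be scripted. This is just an illustrative sketch (the helper name and the sample payload are made up); it takes the parsed JSON body of a GET /_nodes response and lists the nodes that lack the ingest role:

```python
def nodes_missing_ingest(nodes_info):
    """Given the parsed JSON body of GET /_nodes, return the names
    of nodes that do not carry the 'ingest' role."""
    return [
        info.get("name", node_id)
        for node_id, info in nodes_info.get("nodes", {}).items()
        if "ingest" not in info.get("roles", [])
    ]

# Trimmed sample payload mimicking the scenario above: a single
# Bitnami node holding only the master and data roles.
sample = {
    "nodes": {
        "abc123": {"name": "b5b6f04cdd47", "roles": ["master", "data"]}
    }
}
print(nodes_missing_ingest(sample))  # ['b5b6f04cdd47']
```

If that list is non-empty for every node in the cluster, the "no ingest nodes" error is expected.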

Unfortunately, there is no easy fix at the moment. I have a local fork of the project which takes care of the issue and some related ones, but it's not ready to ship yet. Pull Request on its way.

One may either:

  • override the entire elasticsearch.yml config and set node.ingest: true (workable, but future releases of bitnami/bitnami-docker-elasticsearch may break under your now hard-coded elasticsearch.yml)
  • add another, dedicated node to the cluster with the ingest role enabled (fine if you can afford the extra resources)
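For the record, the first option might look like the fragment below. This is a sketch only, not verified against current Bitnami images: the config path is the one the Bitnami image uses, and the node.* role settings are valid for ES 7.6, but you should diff this against the image's generated elasticsearch.yml before relying on it.

```yaml
# ./elasticsearch.yml — minimal single-node config forcing the
# ingest role on (assumption: remaining settings match what the
# Bitnami entrypoint would otherwise generate)
cluster.name: elasticsearch
node.master: true
node.data: true
node.ingest: true
network.host: 0.0.0.0

# In docker-compose.yml, mount it over the generated config:
#   volumes:
#     - ./elasticsearch.yml:/opt/bitnami/elasticsearch/config/elasticsearch.yml
```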

Thanks for your help!

I followed your advice and created a cluster of two servers: one non-dedicated node and one dedicated ingest node. I don't know if my docker-compose was configured properly, but I kept getting an error:

ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized]

After researching the problem for a couple of days, I suspected I needed a second master node. I recreated my cluster with two (non-dedicated) master nodes and one dedicated ingest node. Elasticsearch itself now seems to work properly, except I get a new error during fulltextsearch:test :

 TypeError: Return value of OCA\FullTextSearch\Model\SearchRequest::getProviders() must be of the type array, null returned in /config/www/nextcloud/apps/fulltextsearch/lib/Model/SearchRequest.php:114

So far, I can’t figure out this new error.

On a side note, to help anyone else who comes across this post: each node is taking up over 1 GiB of memory. Here's my docker stats output:

NAME                    CPU %   MEM USAGE / LIMIT     MEM %    NET I/O             BLOCK I/O
elasticsearch_master    0.68%   1.233GiB / 7.529GiB   16.38%   32.2MB / 1.22MB     19.4MB / 91.9MB
elasticsearch_master2   0.76%   1.231GiB / 7.529GiB   16.35%   32.9MB / 2.13MB     14.2MB / 91.8MB
elasticsearch_ingest    0.54%   1.208GiB / 7.529GiB   16.05%   32.1MB / 1.14MB     14.5MB / 91.5MB

So, for me, this isn't a viable solution going forward, because it uses too much memory on my current server. I'd still like to figure out the solution to this problem in case it helps someone else.
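If memory is the blocker, one thing that may be worth trying (untested here) is capping each node's JVM heap via the Bitnami image's ELASTICSEARCH_HEAP_SIZE variable, since the ~1.2 GiB per container above is dominated by the default heap. A hedged fragment, to be added to each service's environment; the exact value is an assumption and a too-small heap can make indexing fail:

```yaml
# Sketch only: add to each elasticsearch_* service in docker-compose.yml.
# ELASTICSEARCH_HEAP_SIZE sets the JVM -Xms/-Xmx in the Bitnami image;
# 256m-512m may suffice for a small personal index.
environment:
  - ELASTICSEARCH_HEAP_SIZE=512m
```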

My new docker-compose:

version: '2'

services:
  elasticsearch_master:
    image: bitnami/elasticsearch:latest
    container_name: elasticsearch_master
    user: root # This image requires this for persistent storage to work
    environment:
      - ELASTICSEARCH_PLUGINS=ingest-attachment
      - BITNAMI_DEBUG=true
      - ELASTICSEARCH_CLUSTER_NAME=elasticsearch_cluster
      - ELASTICSEARCH_CLUSTER_HOSTS=elasticsearch_master;elasticsearch_ingest;elasticsearch_master2
      - ELASTICSEARCH_NODE_NAME=elasticsearch_master
      - ELASTICSEARCH_MINIMUM_MASTER_NODES=1
      - ELASTICSEARCH_IS_DEDICATED_NODE=no      
      - ELASTICSEARCH_CLUSTER_MASTER_HOSTS=elasticsearch_master;elasticsearch_master2
      - ELASTICSEARCH_TOTAL_NODES=2     
    volumes:
      - ./data_master:/bitnami/elasticsearch/data      
    ports:
      - 9200:9200
    networks:
      BACKBONE:
            
  elasticsearch_master2:
    image: bitnami/elasticsearch:latest
    container_name: elasticsearch_master2
    user: root # This image requires this for persistent storage to work
    environment:
      - ELASTICSEARCH_PLUGINS=ingest-attachment
      - BITNAMI_DEBUG=true
      - ELASTICSEARCH_CLUSTER_NAME=elasticsearch_cluster
      - ELASTICSEARCH_CLUSTER_HOSTS=elasticsearch_master;elasticsearch_master2;elasticsearch_ingest
      - ELASTICSEARCH_NODE_NAME=elasticsearch_master2
      - ELASTICSEARCH_MINIMUM_MASTER_NODES=1
      - ELASTICSEARCH_IS_DEDICATED_NODE=no      
      - ELASTICSEARCH_CLUSTER_MASTER_HOSTS=elasticsearch_master;elasticsearch_master2
      - ELASTICSEARCH_TOTAL_NODES=2     
    volumes:
      - ./data_master2:/bitnami/elasticsearch/data      
    networks:
      BACKBONE:      

  elasticsearch_ingest:      
    image: bitnami/elasticsearch:latest
    container_name: elasticsearch_ingest
    user: root # This image requires this for persistent storage to work
    environment:
      - ELASTICSEARCH_PLUGINS=ingest-attachment
      - BITNAMI_DEBUG=true
      - ELASTICSEARCH_CLUSTER_NAME=elasticsearch_cluster
      - ELASTICSEARCH_CLUSTER_HOSTS=elasticsearch_master;elasticsearch_ingest;elasticsearch_master2
      - ELASTICSEARCH_NODE_NAME=elasticsearch_ingest
#      - ELASTICSEARCH_MINIMUM_MASTER_NODES=1      
      - ELASTICSEARCH_IS_DEDICATED_NODE=yes
      - ELASTICSEARCH_NODE_TYPE=ingest
      - ELASTICSEARCH_CLUSTER_MASTER_HOSTS=elasticsearch_master;elasticsearch_master2
#      - ELASTICSEARCH_TOTAL_NODES=2
    volumes:
      - ./data_ingest:/bitnami/elasticsearch/data
    networks:
      BACKBONE:      

networks:
  BACKBONE:
    external:
      name: BACKBONE_network     

# https://github.com/bitnami/bitnami-docker-elasticsearch

As I said, this docker-compose file may be configured improperly, but this is how I got it working.

Thanks again for your help! Any idea on the error message?