What is the purpose of all these 'Full text search' app modules?


#1

Hello there,
can anyone explain why there are several extensions for Full text search, the only search app that can be found in Apps / Categories / Search?
I have trouble understanding the concept behind all those modules:

  1. Full text search - Bookmarks

  2. Full text search - Files

  3. Full text search - Files - Tesseract

  4. Full text search - Elasticsearch Pl

In particular, I don’t understand the purpose of the last one.
Thanks a lot for any clarification. It is highly appreciated.

Stefan


#2

Hoping this would answer your questions:

https://github.com/nextcloud/fulltextsearch/wiki
https://github.com/nextcloud/fulltextsearch/wiki/How-FullTextSearch-indexes-your-cloud


#3

Thanks, that helps. I read up and down the forum and GitHub but totally overlooked the simple explanation :face_with_symbols_over_mouth:.
Anyway, to explain it briefly (again):
The core module, Full text search, basically makes the other modules aware of each other. By other modules, I mean the two categories of apps that are needed to make search work properly.
There are:

  1. Providers Apps
    They extract content from your Nextcloud apps, currently only content from:

    1. Bookmarks: Full text search - Bookmarks
    2. all files you can access via Nextcloud: Full text search - Files
    3. the content of the above-mentioned files: Full text search - Files - Tesseract
  2. Platform Apps
    They communicate with a search platform (e.g. Elasticsearch, Solr, …) in order to index the content provided by the providers and make it searchable through the platform search app, which is then integrated into Nextcloud.
    Currently there is only one platform:

    1. Full text search - Elasticsearch Pl
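
To make the split above more concrete, here is a minimal sketch in Python of how I picture it. This is only an illustration of the idea, not the real Nextcloud API: the class names `Core`, `Provider`, `Platform`, `ListProvider` and `InMemoryPlatform` are all made up by me, and the in-memory index merely stands in for a real backend such as Elasticsearch.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Protocol, Set


@dataclass
class Document:
    provider_id: str  # which provider produced it, e.g. "files"
    doc_id: str
    content: str


class Provider(Protocol):
    """Extracts content from one app (hypothetical interface)."""
    provider_id: str

    def documents(self) -> Iterable[Document]: ...


class Platform(Protocol):
    """Talks to a search backend (hypothetical interface)."""

    def index(self, doc: Document) -> None: ...

    def search(self, term: str) -> List[str]: ...


class InMemoryPlatform:
    """Stand-in for a real backend: a tiny inverted index kept in a dict."""

    def __init__(self) -> None:
        self._index: Dict[str, Set[str]] = {}

    def index(self, doc: Document) -> None:
        key = f"{doc.provider_id}:{doc.doc_id}"
        for word in doc.content.lower().split():
            self._index.setdefault(word, set()).add(key)

    def search(self, term: str) -> List[str]:
        return sorted(self._index.get(term.lower(), set()))


class ListProvider:
    """Toy provider that serves documents from a fixed dict."""

    def __init__(self, provider_id: str, items: Dict[str, str]) -> None:
        self.provider_id = provider_id
        self._items = items

    def documents(self) -> Iterable[Document]:
        for doc_id, content in self._items.items():
            yield Document(self.provider_id, doc_id, content)


class Core:
    """Only wires providers to the platform; it reads no content itself."""

    def __init__(self, platform: Platform) -> None:
        self.platform = platform
        self.providers: List[Provider] = []

    def register(self, provider: Provider) -> None:
        self.providers.append(provider)

    def index_all(self) -> int:
        count = 0
        for provider in self.providers:
            for doc in provider.documents():
                self.platform.index(doc)
                count += 1
        return count


core = Core(InMemoryPlatform())
core.register(ListProvider("files", {"notes.txt": "invoice from the plumber"}))
core.register(ListProvider("bookmarks", {"42": "plumber directory"}))
core.index_all()
print(core.platform.search("plumber"))
```

The point of the sketch is that the core never touches files or bookmarks itself: adding a new kind of content means adding a provider, and switching the search backend means swapping the platform.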

I hope there will soon be provider apps for the note-taking apps listed in Apps / Categories / Office, and for Draw.io and nextnote.

Could there also be platform apps for 3rd party content, such as Pocket?

WHY?
I’m saying and asking this because I have quite a lot of mini projects ongoing, and I finally have to get them structured and organised.
Currently I’m a bit inspired by OneNote’s ideas for notes, see below, which could also include photos of craftsmen’s business cards, their contact details, and extracts from email conversations.
So I think that, in the end, when FullTextSearch can (hopefully) index anything in Nextcloud and maybe also some important 3rd party content, it also holds the references to that content. So it should be possible to do what I mentioned before by means of Nextcloud’s apps.

So my vision is that I could use the search function to also get the reference to a piece of content, e.g. a mail, a photo (by tags) or a contact, and include it in a Deck card or a note as a real object, not just as a string of characters.
The search would be called by pressing a key combination/shortcut, which opens an overlay search window. The selected result would then be integrated as an object that shows its information inline, which basically should be possible by means of External sites.
Do you think that is a realistic vision?


#4

Hi,
I also thought that this is the purpose of this feature. But I also didn’t find much documentation on this. I got elastic search running, but can’t connect to it.
Have you got it up? It would be nice if you could get me some advice.
Thanks


#5

Not yet, as I’m in the middle of setting up my NAS and web applications in a more secure way, which is taking longer than expected.

Maybe @Cult can enlighten us?


#6

Thanks for your answer anyway. I finally got full text search running. OCR isn’t yet running. I’ll work on this the next days.

Anyway, it’s not very stable. Indexing crashes very often and has to be restarted. And it took me a long time to find out that the settings in Nextcloud have to be filled in explicitly in every case. I thought the greyed-out text was the default value, but that’s not the case.

If you are interested in the steps I took to get to the current stage, I can keep you informed. BTW, I’m first trying this on an Ubuntu 16.04 system inside a VirtualBox virtual machine (I don’t want to destroy my running server :wink: ).

Happy new year!
Thomas


#7

:+1:

It is still in beta, but once it reaches its first release it will be great😊

Please keep us updated, but wouldn’t you want to put that in a separate guide, to give users some guidance beyond the wiki and the devs some ideas about where they need to improve the documentation :thinking:

Good luck and don’t get lost between the years :grin:


#8

Which version are you using?


#9

Office documents are files, therefore they are indexed as Office files.

Is Pocket a sort of bookmark manager?


#10

Hello,
@Cult, I was knocked out for the last two days; here are my answers.

So anything that is available as a file and not encrypted is readable by fulltextsearch. When someone creates an app that stores its data in a database, it is a different story, isn’t it? In the case of encrypted files, would fulltextsearch still read their metadata?
.

Yes, the same as e.g. https://wallabag.org/en
.
@SchroedersKater

do you refer to my first

or second section (the vision)

.

As a side note:

I’m still running on an old Synology NAS and thinking about switching to either a Rock PI64 or an Udoo X86 and Docker images.


#11

Sick or drunk ? either way, get better.

Technically, no. :smiley:

fulltextsearch does not get access to your files. files_fulltextsearch does.
files_fulltextsearch is defined as a content provider and feeds fulltextsearch with the content of your files, which is then sent to fulltextsearch_elasticsearch for indexing.

Which means that yes: files_fulltextsearch will get any content from unencrypted files, so any app that generates file(s) available to Nextcloud Files will see their content indexed (based on the mimetype/format).

Which also means that yes: apps that store content in a database won’t have their content indexed by files_fulltextsearch.

Which finally means that if your app generates content that is not stored in files, you will need to provide a content provider within your app if you want the content to be indexed by fulltextsearch.

If by metadata we’re talking about the name of the file, sharing rights, …: when available, they are indexed and used during the search.
If you’re talking about the metadata app, this should be doable by making a small extension of files_fulltextsearch. I know someone started working on this, but fulltextsearch was still alpha at the time and the tools were missing. But with today’s API, you should be able to do something within a few hours.

files_fulltextsearch provides some hooks, triggered when indexing files or searching for content, that allow any 3rd party app to ‘complete’ the indexing process with its own data (e.g. the files_fulltextsearch_tesseract app, which will OCR your files and help fill in the content to be indexed during the process).

If an app can get the content, it can index it.
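
As a rough illustration of the points above (encrypted files are skipped, content depends on the mimetype, extensions can complete a document through hooks, and sharing rights are used during search), here is a small Python sketch. It is not the real files_fulltextsearch API: `FilesProvider`, `StoredFile`, `IndexDocument`, `fake_ocr_hook` and `search` are hypothetical names, and the "OCR" is only simulated.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class StoredFile:
    path: str
    mimetype: str
    data: bytes
    encrypted: bool = False
    shared_with: Set[str] = field(default_factory=set)


@dataclass
class IndexDocument:
    path: str
    content: str
    access: Set[str]  # metadata indexed alongside the content


# A hook receives the file and the document built so far and may extend it.
Hook = Callable[[StoredFile, IndexDocument], None]


class FilesProvider:
    """Hypothetical stand-in for a files content provider."""

    TEXT_TYPES = {"text/plain", "text/markdown"}

    def __init__(self) -> None:
        self.hooks: List[Hook] = []

    def build(self, f: StoredFile) -> Optional[IndexDocument]:
        if f.encrypted:
            return None  # content of encrypted files is not readable
        content = f.data.decode("utf-8") if f.mimetype in self.TEXT_TYPES else ""
        doc = IndexDocument(f.path, content, set(f.shared_with))
        for hook in self.hooks:
            hook(f, doc)  # e.g. an OCR extension fills in text for images
        return doc


def fake_ocr_hook(f: StoredFile, doc: IndexDocument) -> None:
    """Pretend OCR: a real extension would run an OCR engine on f.data."""
    if f.mimetype.startswith("image/") and not doc.content:
        doc.content = "text recognised in " + f.path


def search(docs: List[IndexDocument], term: str, viewer: str) -> List[str]:
    """Return paths whose content matches and which the viewer may access."""
    return [
        d.path
        for d in docs
        if term.lower() in d.content.lower() and viewer in d.access
    ]


provider = FilesProvider()
provider.hooks.append(fake_ocr_hook)
files = [
    StoredFile("a.txt", "text/plain", b"meeting notes", shared_with={"alice"}),
    StoredFile("card.png", "image/png", b"\x89PNG", shared_with={"alice", "bob"}),
    StoredFile("secret.txt", "text/plain", b"meeting", encrypted=True,
               shared_with={"alice"}),
]
docs = [d for d in (provider.build(f) for f in files) if d is not None]
print(search(docs, "meeting", "alice"))
print(search(docs, "recognised", "bob"))
```

An app that keeps its data in a database would need its own provider with its own `build` step, since nothing in this files-based sketch would ever see that data.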


#12

One thing led to another, but that is off-topic :sweat_smile: and now I’m back at work :scream_cat:
.
.
by

you mean anything that I can find in any folder in Nextcloud that is accessible via the file browser of Nextcloud’s web interface, depending on the access rights of my user account (let’s call it the Files app).

I’m referring to everything that is visible in the details of a folder or file (tags, comments, sharing, versions).


.
.

what do you mean by available?
.
.

That would be awesome if the metadata app were also incorporated.
.
.

So there must be a Nextcloud app that has access to the 3rd party content?
It has to make its content available to the 3rdparty_fulltextsearch content provider, which has to be created as well (by you?).
How is the content made available?

A different option would be to make use of Elasticsearch integration plugins, wouldn’t it?
.

What about my vision?


#13

Sorry for answering so late. I was sick and am still not fully recovered.

Full text search: 1.2.3
… - Elasticsearch Platform: 1.2.2
… - Files: 1.2.3
NextCloud: 15.0.0


#14

@Cult, what do you think?


#15

Can you summarize your vision/questions?


#16

I’ll try :wink:
but I’m afraid that

you need to answer the following three questions first:

  • access content of 3rd party applications

.
.

  • make the content of 3rd party applications available

I assume you mean that there has to be a dedicated provider app, which in turn either

  1. accesses the 3rd party application directly (straight connection)

or

  2. connects to the Nextcloud app that accesses the 3rd party application and thus makes the 3rd party content available to the dedicated provider app (relayed connection)

.
.

  • what is indexed by files_fulltextsearch

.
.
Afterwards, I can edit my original text below to match the latest information.


#17

I hope the above isn’t too much text at once :worried:


#18

I will write some documentation on how to create an FTS app module for NC15.


#19

I’ll wait and see what is still unanswered :slight_smile:

On Tue, 5 Feb 2019 at 15:24, Cult via Nextcloud community noreply@nextcloud.com wrote:


#20

@PackElend please have a look at this devblog, you might find some of your answers there:

https://daita.github.io/fulltextsearch-and-deck/