What is the purpose of all these 'Full text search' app modules?

PackElend · December 29, 2018, 9:12pm

Hello there,
can anyone explain why there are several extensions for Full text search, the only search app that can be found in Apps / Categories / Search?
I have trouble understanding the concept behind all those modules:

In particular, I don’t understand the purpose of the last one.
Thanks a lot for any clarification. It is really highly appreciated.

Stefan

Cult · December 29, 2018, 9:28pm

Hopefully this answers your questions:

https://github.com/nextcloud/fulltextsearch/wiki
https://github.com/nextcloud/fulltextsearch/wiki/How-FullTextSearch-indexes-your-cloud

PackElend · December 30, 2018, 1:24pm

Thanks, that helps. I read up and down the forum and GitHub but totally overlooked the simple explanation.
Anyway, to explain it in brief (again):
The core module, Full text search, basically makes the others know each other. By others, I mean the other two categories of apps that are needed to make search work properly.
There are:

Provider Apps
They extract content from your Nextcloud apps. Currently there are only providers for content from:
1. Bookmarks: Full text search - Bookmarks
2. all files you can access via Nextcloud: Full text search - Files
3. the content of the above-mentioned files: Full text search - Files - Tesseract
Platform Apps
They communicate with a search platform (e.g. Elasticsearch, Solr, …) in order to index the content supplied by the providers and make it searchable through the platform’s search engine, which is then integrated into Nextcloud.
Currently there is only one platform:
1. Full text search - Elasticsearch Pl
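
To make the split between core, providers and platforms a bit more concrete, here is a minimal sketch of how I picture it. It is only a toy model: every class and method name in it (`Core`, `BookmarksProvider`, `InMemoryPlatform`, `index_all`, …) is made up for illustration and is not the real Nextcloud API, and the in-memory platform merely stands in for something like Elasticsearch.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass
class Document:
    provider_id: str  # which provider produced the document, e.g. "bookmarks"
    doc_id: str       # identifier of the item inside that provider
    content: str      # extracted text that should become searchable


class Provider(Protocol):
    """Anything that can extract documents from one Nextcloud app."""
    provider_id: str

    def documents(self) -> list[Document]: ...


class Platform(Protocol):
    """Anything that can store documents and answer queries."""

    def index(self, doc: Document) -> None: ...

    def search(self, query: str) -> list[Document]: ...


class BookmarksProvider:
    """Hypothetical provider: turns bookmarks (url -> title) into documents."""
    provider_id = "bookmarks"

    def __init__(self, bookmarks: dict[str, str]) -> None:
        self._bookmarks = bookmarks

    def documents(self) -> list[Document]:
        return [Document(self.provider_id, url, title)
                for url, title in self._bookmarks.items()]


class InMemoryPlatform:
    """Hypothetical platform: a real one would talk to a search server."""

    def __init__(self) -> None:
        self._docs: list[Document] = []

    def index(self, doc: Document) -> None:
        self._docs.append(doc)

    def search(self, query: str) -> list[Document]:
        needle = query.lower()
        return [d for d in self._docs if needle in d.content.lower()]


class Core:
    """Plays the role of the core app: it only connects providers and a platform."""

    def __init__(self, platform: Platform) -> None:
        self._platform = platform
        self._providers: list[Provider] = []

    def register(self, provider: Provider) -> None:
        self._providers.append(provider)

    def index_all(self) -> int:
        """Send every document of every provider to the platform; return the count."""
        count = 0
        for provider in self._providers:
            for doc in provider.documents():
                self._platform.index(doc)
                count += 1
        return count

    def search(self, query: str) -> list[Document]:
        return self._platform.search(query)
```

The point of the sketch is that the core never extracts content or stores an index itself; adding a new kind of content means adding a provider, and switching the search engine means swapping the platform.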

I hope there will soon be provider apps for the note-taking apps listed in Apps / Categories / Office, and for Draw.io and nextnote.

Could there also be platform apps for 3rd-party content, such as Pocket?

WHY?
I’m saying and asking this because I have quite a lot of mini projects ongoing, and I finally have to get them structured/organised.
Currently I’m a bit inspired by OneNote’s ideas for notes (see below), which could also include a photo of a craftsman’s business card, their contact details and extracts from email conversations.
So I think that, in the end, when FullTextSearch can (hopefully) index anything in Nextcloud and maybe also some important 3rd-party content, it also holds the references to that content. So it should be possible to do what I mentioned before by means of Nextcloud’s apps.

So my vision is that I could use the search function to also get the reference to a piece of content, e.g. a mail, photo (by tags) or contact, and include it in either a Deck card or a note as a real object, not just as a string of characters.
The search would be called by pressing a key combination/shortcut, which opens an overlay search window; the selected result is then embedded as an object that shows its information inline, which should basically be possible by means of External sites.
Do you think that is a realistic vision?

SchroedersKater · December 30, 2018, 5:24pm

Hi,
I also thought that this is the purpose of this feature. But I also didn’t find much documentation on this. I got Elasticsearch running, but can’t connect to it.
Have you got it up and running? It would be nice if you could give me some advice.
Thanks

PackElend · December 31, 2018, 1:56pm

Not yet, as I’m in the middle of setting up my NAS and web applications in a more secure way, which is taking longer than expected.

Maybe @Cult can enlighten us?

SchroedersKater · December 31, 2018, 6:15pm

Thanks for your answer anyway. I finally got full text search running. OCR isn’t running yet. I’ll work on this over the next few days.

Anyway, it’s not very stable. Indexing crashes very often and has to be restarted. And it took me a long time to find out that the settings in Nextcloud have to be filled out in every case. I thought the greyed-out text was the default value, but that’s not the case.
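
Since the address field has to be filled in explicitly, it can help to check first that Elasticsearch really answers at the address you are about to enter. Below is a minimal sketch using only the Python standard library. It relies on the fact that Elasticsearch answers a plain GET on its root URL with a small JSON document containing `cluster_name` and `version.number`; the function names are made up, and `http://localhost:9200` in the usage note is just an assumption about where your server listens.

```python
import json
import urllib.request
from typing import Optional


def summarize_root_response(payload: str) -> str:
    """Pick cluster name and version out of the JSON returned by the root URL."""
    info = json.loads(payload)
    version = info.get("version", {}).get("number", "unknown")
    cluster = info.get("cluster_name", "unknown")
    return f"cluster={cluster} version={version}"


def check_elasticsearch(address: str, timeout: float = 3.0) -> Optional[str]:
    """Return a short summary if the server answers, or None if it does not."""
    try:
        with urllib.request.urlopen(address, timeout=timeout) as response:
            return summarize_root_response(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts and HTTP errors;
        # ValueError covers malformed addresses and non-JSON answers.
        return None
```

Calling `check_elasticsearch("http://localhost:9200")` returns a summary string when the server is reachable and `None` otherwise, so a `None` here means the problem is on the Elasticsearch side and not in the Nextcloud settings.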

If you are interested in the steps I took to get to the current stage, I can keep you informed. BTW, I’m trying it first on an Ubuntu 16.04 system inside a VirtualBox virtual machine (I don’t want to destroy my running server).

Happy new year!
Thomas

PackElend · December 31, 2018, 8:19pm

It’s still in beta, but once it reaches its first release it will be great 😊

Please keep us updated, but wouldn’t you want to put that in a separate guide, to give users some guidance beyond the wiki and the devs some ideas about where they need to improve the documentation?

Good luck and don’t get lost between the years

Cult · December 31, 2018, 8:38pm

Which version are you using?

Cult · December 31, 2018, 8:46pm

Office documents are files, therefore they are indexed as Office files.

Is Pocket a sort of bookmark manager?

PackElend · January 4, 2019, 9:45pm

Hello,
@Cult, I was knocked out for the last two days; here are my answers.

So anything that is available as a file and not encrypted is readable by fulltextsearch. When someone creates an app that stores its data in a database, it is a different story, isn’t it? In the case of encrypted files, would fulltextsearch still read their metadata?
.

Yes, the same as e.g. https://wallabag.org/en
.
@SchroedersKater

Do you refer to my first section or to the second one (the vision)?

.

As a side note:

I’m still running on an old Synology NAS and am thinking about switching to either a Rock PI64 or an Udoo X86 with Docker images.

Cult · January 4, 2019, 10:01pm

Sick or drunk? Either way, get better.

Technically, no.

fulltextsearch does not get access to your files. files_fulltextsearch does.
files_fulltextsearch is defined as a content provider and feeds fulltextsearch with the content of your files, which is then sent to fulltextsearch_elasticsearch for indexing.

Which means that yes: files_fulltextsearch will get any content from unencrypted files, so any app that generates file(s) available to Nextcloud Files will have its content indexed (based on the mimetype/format).

Which also means that yes: apps that store content in a database won’t have their content indexed by files_fulltextsearch.

Which finally means that if your app generates content that is not stored in files, you will need to provide a content provider within your app if you want the content to be indexed by fulltextsearch.
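
To illustrate that last case, here is a rough sketch of what such a content provider has to do for an app that keeps its data in a database table. It is plain Python with an in-memory SQLite database, not the real provider interface: the `notes` table, `NotesProvider`, `IndexDocument` and `documents_for` are all hypothetical names chosen for the example.

```python
from __future__ import annotations

import sqlite3
from typing import Iterator, NamedTuple


class IndexDocument(NamedTuple):
    provider_id: str  # identifies the provider that produced the document
    doc_id: str       # identifier of the item inside the app
    owner: str        # user the item belongs to (needed for access checks)
    content: str      # text handed over for indexing


class NotesProvider:
    """Hypothetical provider for an app that stores notes in a database table."""
    provider_id = "notes"

    def __init__(self, db: sqlite3.Connection) -> None:
        self._db = db

    def documents_for(self, user: str) -> Iterator[IndexDocument]:
        """Yield one indexable document per note owned by the given user."""
        rows = self._db.execute(
            "SELECT id, title, body FROM notes WHERE owner = ? ORDER BY id",
            (user,),
        )
        for note_id, title, body in rows:
            yield IndexDocument(self.provider_id, str(note_id), user,
                                f"{title}\n{body}")


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway database with two notes for the example."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE notes (id INTEGER PRIMARY KEY, owner TEXT, title TEXT, body TEXT)"
    )
    db.executemany(
        "INSERT INTO notes (owner, title, body) VALUES (?, ?, ?)",
        [("stefan", "Plumber", "Call about the boiler"),
         ("thomas", "OCR", "Install tesseract")],
    )
    return db
```

The files provider never sees these rows because they are not files; only code that knows the app's own storage can turn them into documents, which is why the provider has to live in (or next to) the app.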

If by metadata we’re talking about the file name, sharing rights, …, then when available they are indexed and used during the search.
If you’re talking about the metadata app, this should be doable by making a small extension of files_fulltextsearch. I know someone started working on this, but fulltextsearch was still alpha at the time and the tools were missing. But with today’s API, you should be able to do something within a few hours.

files_fulltextsearch provides some hooks when indexing files or searching for content, which allow any 3rd-party app to ‘complete’ the indexing process with its own data (e.g. the files_fulltextsearch_tesseract app, which will OCR your files and help fill in the content to be indexed during the process).

If an app can get the content, it can index it.
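
The hook idea can be pictured roughly like this. It is a simplified Python model and not the actual files_fulltextsearch API: `FilesProvider`, `on_index`, `FileDocument` and `fake_ocr_hook` are invented names, and the "OCR" is faked with a placeholder string.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FileDocument:
    path: str
    mimetype: str
    content: str = ""                                     # text extracted by the files provider
    parts: dict[str, str] = field(default_factory=dict)   # extra text added by extensions


Hook = Callable[[FileDocument], None]


class FilesProvider:
    """Hypothetical files provider that lets extensions complete a document."""

    def __init__(self) -> None:
        self._hooks: list[Hook] = []

    def on_index(self, hook: Hook) -> None:
        """Register a callback that runs for every file being indexed."""
        self._hooks.append(hook)

    def build_document(self, path: str, mimetype: str, text: str = "") -> FileDocument:
        doc = FileDocument(path, mimetype, text)
        for hook in self._hooks:
            hook(doc)  # each extension may add its own data to the document
        return doc


def fake_ocr_hook(doc: FileDocument) -> None:
    """Pretend OCR: a real extension would run an OCR engine on the file here."""
    if doc.mimetype.startswith("image/") and not doc.content:
        doc.parts["ocr"] = f"(text recognised in {doc.path})"
```

An extension only registers its hook once with `on_index`; after that, every image that passes through `build_document` gets an extra `ocr` part, while text files are left untouched.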

PackElend · January 8, 2019, 8:22pm

One thing led to another, but that is off-topic; now I’m back to work.
.
.
by

you mean anything that I can find in any Nextcloud folder that is accessible via the file browser of Nextcloud’s web interface, depending on the access rights of my user account (let’s call it the File App)?

I refer to everything that is visible in the details of a folder or file (tags, comments, sharing, versions).

.
.

what do you mean by available?
.
.

That would be awesome, if the metadata app is also incorporated.
.
.

So there must be a Nextcloud app that has access to the 3rd-party content?
It has to make its content available to the 3rdpartry_fulltextsearch content provider, which has to be created as well (by you?).
How do I make the content available?

A different option would be to make use of Elasticsearch integration plugins, wouldn’t it?
.

What about my vision?

SchroedersKater · January 10, 2019, 5:07pm

Sorry for answering so late. I was sick and am still not fully well again.

Full text search: 1.2.3
… - Elasticsearch Platform: 1.2.2
… - Files: 1.2.3
Nextcloud: 15.0.0

PackElend · February 2, 2019, 10:32am

@Cult, what do you think?

Cult · February 2, 2019, 10:50am

Can you summarize your vision/questions?

PackElend · February 2, 2019, 11:35am

I’ll try, but I’m afraid that you need to answer the following three questions first:

access content of 3rd party applications

.
.

make the content of 3rd party applications available

I assume you mean that there has to be a dedicated provider app, which in turn either

accesses the 3rd-party application directly (straight connection)

or

connects to a Nextcloud app that accesses the 3rd-party application and thus makes the 3rd-party content available to the dedicated provider app (relayed connection)

.
.

what is indexed by files_fulltextsearch

.
.
Afterwards, I can edit my original text below to match the latest information.

PackElend:

WHY?
I’m saying and asking this because I have quite a lot of mini projects ongoing, and I finally have to get them structured/organised.
Currently I’m a bit inspired by OneNote’s ideas for notes (see below), which could also include a photo of a craftsman’s business card, their contact details and extracts from email conversations.
So I think that, in the end, when FullTextSearch can (hopefully) index anything in Nextcloud and maybe also some important 3rd-party content, it also holds the references to that content. So it should be possible to do what I mentioned before by means of Nextcloud’s apps.

So my vision is that I could use the search function to also get the reference to a piece of content, e.g. a mail, photo (by tags) or contact, and include it in either a Deck card or a note as a real object, not just as a string of characters.
The search would be called by pressing a key combination/shortcut, which opens an overlay search window; the selected result is then embedded as an object that shows its information inline, which should basically be possible by means of External sites.
Do you think that is a realistic vision?

PackElend · February 2, 2019, 12:53pm

I hope the above isn’t too much text at once.

Cult · February 5, 2019, 2:14pm

I will write some documentation on how to create an FTS app module for NC15.

PackElend · February 5, 2019, 4:41pm

I’ll wait and see what is still unanswered


Cult · February 17, 2019, 11:45am

@PackElend please have a look at this devblog, you might find some of your answers there:

https://daita.github.io/fulltextsearch-and-deck/