Hi,
I am adding PDF documents through a scanning app that uploads the files to Nextcloud over DAV. A workflow_script then runs OCR on each uploaded file.
What needs to be done to make fulltextsearch index the file content (OCR’ed PDF) in these two cases:
when the file is added through DAV
when the file is added through occ files:scan
I assume it has to do with adding a provider, but the documentation seems to be incomplete or I just do not understand it. On NC 27.0.1, fulltextsearch/Elasticsearch is working fine for me (including OCR’ed text from PDFs) for existing files and for files added through the web UI or desktop sync.
The PDF files are uploaded by the app (SwiftScan) to a “regular” folder in my user’s hierarchy (I assume this is what you mean by “any”). The file is picked up by fulltextsearch:live, but its OCR content is not indexed.
If I do this with a PDF that has not been OCR’ed, the workflow runs ocrmypdf and then occ files:scan on the new OCR’ed PDF file.
Neither way (DAV or occ files:scan) seems to trigger fulltextsearch to index the PDF’s OCR text for the search.
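For reference, the OCR step of the workflow does roughly the equivalent of the sketch below. This is a simplified illustration, not my actual script: the occ location, the web server user (www-data), the Nextcloud user (alice) and the folder name (Scans) are all placeholder assumptions. It only builds and prints the two commands (ocrmypdf, then occ files:scan restricted to the folder via --path) and does not run them.

```python
from pathlib import PurePosixPath

# Placeholder assumption: how occ is invoked on the server. Adjust the
# web server user and the Nextcloud install path to your own setup.
OCC = ["sudo", "-u", "www-data", "php", "/var/www/nextcloud/occ"]


def build_ocr_command(src, dst):
    """Command that runs OCR on src and writes the result to dst."""
    return ["ocrmypdf", str(src), str(dst)]


def build_scan_command(user, relative_dir):
    """Command that rescans one folder so Nextcloud notices the new file.

    files:scan expects --path in the form /<user>/files/<folder>.
    """
    scan_path = PurePosixPath("/") / user / "files" / relative_dir
    return OCC + ["files:scan", f"--path={scan_path}"]


if __name__ == "__main__":
    # Hypothetical file names, user and folder, for illustration only.
    print(" ".join(build_ocr_command("scan.pdf", "scan_ocr.pdf")))
    print(" ".join(build_scan_command("alice", "Scans")))
```

After these two steps the OCR’ed file shows up in Nextcloud, but its text is still not searchable.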