I am adding PDF documents through a scanning app that uploads the files to Nextcloud via DAV; a workflow_script then runs OCR on each file.
What needs to be done to make fulltextsearch:
- indexing file content (ocr’ed pdf) when adding through DAV
- indexing file content (ocr’ed pdf) when adding through
I assume it has to do with adding a provider, but the documentation seems to be incomplete, or I just do not understand it. On NC 27.0.1, fulltextsearch/elasticsearch is working fine for me (including OCR’ed text from PDFs) for existing files and for files added through the web UI or desktop sync.
Thank you for help and suggestions!
Sorry, I cannot really help you.
occ files:scan is not needed if you use Nextcloud with WebDAV.
It is needed if you use something like
I can’t say anything about the subsequent steps regarding OCR.
Yes, I know. The 2 points in my list are meant to be viewed independently (not additive).
Where are those files located?
Is this an issue with files uploaded into any folder, an external filesystem, or even the root of your home directory?
The PDF files are uploaded by the app (SwiftScan) to a “regular” folder (I assume this is what you mean by “any”) in my user’s hierarchy. The file is recognized by fulltextsearch:live, but the OCR content is not indexed.
If I do this with a PDF that has not been OCR’ed, the workflow runs occ files:scan on the new OCR’ed PDF file.
Neither way (DAV or occ files:scan) seems to trigger fulltextsearch to index the PDF’s OCR text for the search.
Anything I need to configure?
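For anyone trying to reproduce this, here is a minimal sketch of the manual check described above: rescan the upload folder, then ask fulltextsearch to index again, and see whether the OCR text becomes searchable. The occ location, the user name "alice", and the folder "Scans" are assumptions for illustration only; on most installs the commands also have to run as the web server user. By default the script only prints the commands and executes nothing.

```python
import shlex
import subprocess


def build_commands(occ="/var/www/nextcloud/occ", path="/alice/files/Scans"):
    """Return the two occ invocations as argv lists (nothing is executed here).

    occ and path are hypothetical placeholders; adjust them to your install.
    """
    return [
        # Re-register the files in the given folder with Nextcloud's file cache.
        ["php", occ, "files:scan", f"--path={path}"],
        # Ask fulltextsearch to run its indexing pass.
        ["php", occ, "fulltextsearch:index"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run(build_commands())
```

If the OCR text is searchable after the manual index run but not after an upload, that would suggest the problem is in what triggers the indexing rather than in the text extraction itself.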