PDF Workflow Not Working for OCR

Mario_user · March 12, 2025, 5:06pm

The Basics

Nextcloud Server version (e.g., 29.x.x):
- 31.0.0
Operating system and version (e.g., Ubuntu 24.04):
- Ubuntu 22.04
Web server and version (e.g., Apache 2.4.25):
- Apache 2.4.25
Reverse proxy and version (e.g., nginx 1.27.2):
- none
PHP version (e.g., 8.3):
- PHP 8.2
Is this the first time you’ve seen this error? (Yes / No):
- Yes
When did this problem seem to first start?
- When I upload documents with a .pdf extension
Installation method (e.g., AIO, NCP, Bare Metal/Archive, etc.):
- Native installation done by hand, without any container
Are you using Cloudflare, mod_security, or similar? (Yes / No):
- No

Summary of the issue you are facing:

I have set up a Flow for my PDF files so that they are OCR-scanned whenever they are uploaded to or updated in Nextcloud, which should make their text searchable directly through the Nextcloud search. However, unless I manually run the command

occ fulltextsearch:index

from the command line, the file is not scanned and its content is not searchable in Nextcloud.
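
For reference, the manual run on a hand-made install can be sketched roughly as follows. The install path /var/www/nextcloud and the web server user www-data are assumptions typical of Ubuntu with Apache, not values stated in this post, and the helper name index_cmd is hypothetical. By default the script only prints the command; it executes it only when RUN=1 is set.

```shell
#!/bin/sh
# Assumed values for a manual (non-container) install on Ubuntu; adjust as needed.
NC_DIR="${NC_DIR:-/var/www/nextcloud}"   # assumption: Nextcloud install path
WEB_USER="${WEB_USER:-www-data}"         # assumption: user Apache runs as

# Hypothetical helper: print the full command line for the manual index run.
index_cmd() {
    printf 'sudo -u %s php %s/occ fulltextsearch:index\n' "$WEB_USER" "$NC_DIR"
}

# Print the command by default; set RUN=1 to actually execute it.
if [ "${RUN:-0}" = "1" ]; then
    sudo -u "$WEB_USER" php "$NC_DIR/occ" fulltextsearch:index
else
    index_cmd
fi
```

Running occ as the web server user matters because occ refuses to run under a user that does not own the Nextcloud configuration.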

Steps to replicate it (hint: details matter!):

Settings > Flow > add a new OCR flow
Configure the flow for PDF files: file MIME type is PDF
File languages: Italian and English
OCR mode: skip text

Log entries

There are no logs for this failed flow

Apps

Installed apps: Flow, Full text search, Full text search - Elasticsearch Platform, Full text search - Files, Full text search - Files - Tesseract OCR, and Full text search - Bookmarks

Mario_user · March 25, 2025, 11:42am

Hi,
does anyone else have the same problem?

Please help me solve it.

system · June 23, 2025, 11:43am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.