OCR processing makes PDF files up to 4-5 times larger than before

olliausstuhr · December 16, 2019, 4:59pm

OCR processing makes the PDF files up to 4-5 times larger than before. For example: the original size is 2.3 MB, and the file size after OCR is 10.5 MB.

Does anybody know about this problem, and perhaps a solution?

R0Wi · October 4, 2024, 8:16pm

Compression is always mandatory when dealing with PDF files. This is why I also switched to ocrmypdf when I developed workflow_ocr (GitHub: R0Wi-DEV/workflow_ocr), a Nextcloud Workflow app that lets you process files via OCR on the server side. It handles compression pretty well, and most of the time the converted PDF files are even smaller than the original ones.
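If you want to try ocrmypdf directly, outside of Nextcloud, here is a rough sketch of how that could look. It assumes ocrmypdf is installed and on your PATH. The helper names (`build_ocrmypdf_command`, `size_ratio`, `ocr_and_compare`) and the file names are made up for illustration and are not part of workflow_ocr or ocrmypdf. The `--optimize` option takes a level from 0 to 3; as far as I know, 0 turns optimization off and the higher levels are more aggressive and may be lossy.

```python
import os
import shutil
import subprocess


def build_ocrmypdf_command(src, dst, optimize=1):
    """Build an ocrmypdf command line (hypothetical helper).

    --optimize accepts a level from 0 to 3.
    """
    if optimize not in (0, 1, 2, 3):
        raise ValueError("optimize must be 0, 1, 2 or 3")
    return ["ocrmypdf", "--optimize", str(optimize), src, dst]


def size_ratio(original_size, ocr_size):
    """Return how many times larger the OCR output is than the original."""
    return ocr_size / original_size


def ocr_and_compare(src, dst, optimize=1):
    """Run ocrmypdf on src, write dst, and return the dst/src size ratio."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf was not found on PATH")
    subprocess.run(build_ocrmypdf_command(src, dst, optimize), check=True)
    return size_ratio(os.path.getsize(src), os.path.getsize(dst))
```

Calling `ocr_and_compare("scan.pdf", "scan_ocr.pdf")` on one of your files would show whether the result still grows. For comparison, the numbers from the first post give `size_ratio(2.3, 10.5)`, which is roughly 4.6.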

I know this is not a direct answer to your question. Anyway, I hope it (somehow) helps!