Hi there!
Am I blind, or is there really no live example of how Tesseract turns pixels into text via OCR, implemented in a Nextcloud instance? Or the other way round: does anybody know of an example where we could just test a few files before investing many hours and resources installing all of that?
Thanks, but as I wrote, I wanted to see how it works on someone else’s example. Usually there are live examples for add-ons, plug-ins, scripts etc., or am I mistaken? Is my idea of quickly testing something really so stupid? Wouldn’t it be a cool thing, in the spirit of open source, to have a free open OCR service somewhere?
Maybe my English is not good enough? I want to save HOURS of installation and configuration, that’s all. Usually time costs more than licenses. I know that Tesseract is free and open source.
Maybe there is a freely accessible example of Tesseract somewhere on the internet?
Tesseract itself just needs to be installed along with one or more language packs (depending on which languages you want to use). After that, just enable it in the full text search setup and let Nextcloud build the search index – that’s it. There is nothing else to configure. If enabled, Tesseract will automatically be used by the full text search to scan images as well.
Edit: Tesseract itself is just a command line tool which takes an image as a parameter and outputs text as a result.
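To make that concrete, here is a minimal sketch of what the installation and the command line usage look like. The package names assume a Debian/Ubuntu system; on other distributions they may differ:

```shell
# Install Tesseract plus the language packs you need
# (here English and German, as an example)
sudo apt install tesseract-ocr tesseract-ocr-eng tesseract-ocr-deu

# Basic usage: image in, text out.
# This writes the recognized text to scan.txt:
tesseract scan.png scan -l eng

# Or print the recognized text straight to the terminal:
tesseract scan.png stdout -l eng
```

This is exactly the kind of call the full text search makes for you behind the scenes once OCR is enabled, so there is nothing more to wire up by hand.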
Don’t get me wrong, but following the instructions on the web – about 16 steps or fewer, e.g. the howto of @awelzel – will take less than half an hour on a default Nextcloud machine.
The part that consumes time is building the index (it depends on how many files you have and whether you excluded folders from the index).
The part about excluding folders from the index is missing from @awelzel’s howtos.
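For reference, a rough sketch of the two index-related steps, assuming `occ` lives in the Nextcloud root and the web server runs as `www-data` (both assumptions; adjust paths and user to your setup). If I remember correctly, the files full text search app skips folders containing a `.noindex` file – please verify this against the app’s documentation for your version:

```shell
# Build (or rebuild) the full text search index; this is the slow part
sudo -u www-data php occ fulltextsearch:index

# Exclude a folder from indexing by dropping an empty .noindex
# file into it (assumed behaviour of the files_fulltextsearch app)
touch /var/www/nextcloud/data/alice/files/BigArchive/.noindex
```

Excluding large folders of images you don’t need searched can cut the indexing time considerably.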