Upload document via email + OCR + indexing

Hi!

Not a bug, but I’m facing a blocking situation for my customer’s needs.

The first need is OCR and indexing of documents, so I installed workflow_ocr and fulltextsearch, linked to an AWS OpenSearch instance, and this is working well.

The second need is to store documents in AWS S3 external storages. I configured these storages without any particular difficulty, and OCR / indexing still works well.

The third need is to be able to send documents by email so that they are uploaded directly to the external storages. I couldn’t find a suitable plugin for NC22 (the only one I saw was compatible only up to NC18), so I took a different route: I use AWS WorkMail / SES + S3 event notifications + an AWS Lambda function to handle this. It works well, and the function puts the documents directly into the S3 buckets.

Unfortunately, I realized that documents uploaded directly into the buckets are neither OCRed nor indexed: live indexing doesn’t take these documents into account, and the OCR runs from a Nextcloud flow, which is not triggered in this case.

Has anyone already had a similar need, and how do you think I could work around this?

We thought about calling the Nextcloud API from the Lambda function to upload the files that way instead of putting them directly in S3, but I don’t know whether that’s possible or how to handle authentication.
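
To make the idea concrete, here is a minimal sketch of what such an upload could look like, assuming the Lambda uses Nextcloud's WebDAV endpoint (`/remote.php/dav/files/<user>/...`) with HTTP Basic auth and an app password created for a dedicated account. The server URL, user name, password, target folder and the helper function names are all placeholders I made up for illustration, and I have not verified that this triggers the workflow_ocr flow on NC22. The reasoning is that a file arriving through Nextcloud itself should be seen as a normal upload, unlike a file dropped straight into the bucket.

```python
import base64
import urllib.parse
import urllib.request

# Placeholder values (assumptions): replace with your own server and a
# dedicated account whose app password was created in its security settings.
NEXTCLOUD_BASE_URL = "https://cloud.example.com"
NEXTCLOUD_USER = "mail-uploader"
NEXTCLOUD_APP_PASSWORD = "replace-with-app-password"
TARGET_FOLDER = "Inbox"  # folder (e.g. the S3 external storage mount), must already exist


def build_webdav_url(base_url, user, folder, filename):
    """Build the WebDAV URL of a file inside a user's Nextcloud files."""
    parts = [user] + [p for p in folder.split("/") if p] + [filename]
    # Percent-encode each path segment separately so spaces etc. are safe.
    quoted = "/".join(urllib.parse.quote(p, safe="") for p in parts)
    return f"{base_url.rstrip('/')}/remote.php/dav/files/{quoted}"


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def upload_to_nextcloud(data, filename):
    """PUT the file bytes to Nextcloud and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses, so a
    returned value means the server accepted the upload.
    """
    url = build_webdav_url(
        NEXTCLOUD_BASE_URL, NEXTCLOUD_USER, TARGET_FOLDER, filename
    )
    request = urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={
            "Authorization": basic_auth_header(
                NEXTCLOUD_USER, NEXTCLOUD_APP_PASSWORD
            )
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

In the Lambda, `upload_to_nextcloud(attachment_bytes, "invoice.pdf")` would then replace the call that currently writes the document to the bucket. In a real deployment the app password would be better kept out of the code, for example in an environment variable or a secrets store.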

Thanks for your answers!

Fabien