Next Big Feature: Document Management System (DMS)?

Hello everyone,

I would love to see some kind of document management system (DMS) app for Nextcloud! Or is there already such an app?

I think Nextcloud could never be used as a real DMS (see the definition on Wikipedia). But you can use Nextcloud today, and perhaps with new apps in the future, to “manage documents”. For example, you can already use features like Group folders, File access control, and File drop.

Yes, I agree with you @devnull .

What you describe is mainly for data exchange. What I mean is a solution that helps to keep track of all documents in the (sub)folders.

But a “DMS Lite” with a limited subset of the functionality of a real DMS (capturing all documents along with important information such as sender, recipient, date, and due date) would be possible for a skilled programmer and a great relief for private as well as small commercial users.

Doesn’t anybody need that, except me? :wink:

One option might be to improve the existing tagging system, for example with tag clouds and smart folders based on tags. I am also longing for a better document management system that is not based on folders.

Hi,
this thread seems to be quite new. Maybe somebody is interested in my solution:

Requirements:

  • No folder structure
  • OCR for scanned files
  • Full-text search
  • Automatic tagging
  • Hosting in Germany

Solution:

Infrastructure/Nextcloud settings

  • Nextcloud 20 on VPS Server (hosted in Germany) with root access
  • A lot of space on cloud storage (hosted in Germany)
  • Mount a folder from the cloud storage via WebDAV into the Nextcloud file directory
  • Create all the tags I need in the Nextcloud settings (collaborative tags); more can be added later
  • Install the Full text search app in Nextcloud to index all files and make them searchable (works automatically with digitally created PDF files)

Solution for self-scanned files

  • Self-scanned files (invoices, printed documents, mailings, etc.) are uploaded automatically from my all-in-one device (nearly) directly to my cloud storage in TIFF format (multi-page documents are possible)
  • A small bash script runs every 5 minutes, checks for new *.tif files, converts them to PDF with Tesseract, and deletes the TIFF files
  • An indexing job then indexes the new PDF so it can be found via Full text search with Elasticsearch (the whole content is saved in the index)
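
The conversion step in the list above could look roughly like this. This is only a sketch in Python rather than the bash script mentioned in the post; the function names, the `deu` language setting, and the injectable `run` parameter are assumptions for illustration, not the original script. It relies on the Tesseract command line form `tesseract <image> <output base> -l <lang> pdf`, which writes `<output base>.pdf`.

```python
import subprocess
from pathlib import Path


def tesseract_command(tiff: Path, lang: str = "deu") -> list[str]:
    """Build the Tesseract call that writes <name>.pdf next to the TIFF."""
    # Tesseract appends ".pdf" to the output base itself.
    return ["tesseract", str(tiff), str(tiff.with_suffix("")), "-l", lang, "pdf"]


def convert_new_tiffs(folder: Path, run=subprocess.run) -> list[Path]:
    """Convert every *.tif in folder to a searchable PDF, then delete the TIFF."""
    created = []
    for tiff in sorted(folder.glob("*.tif")):
        pdf = tiff.with_suffix(".pdf")
        # check=True raises if Tesseract exits with an error.
        run(tesseract_command(tiff), check=True)
        # Only remove the source once the PDF really exists.
        if pdf.exists():
            tiff.unlink()
            created.append(pdf)
    return created
```

A cron entry running every 5 minutes could call `convert_new_tiffs(Path("/path/to/scans"))`, where the path is a placeholder for the mounted scan folder. Keeping the TIFF until the PDF exists means a failed OCR run does not lose the scan.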

Automatic tagging

  • A small PHP script reads the indexed content of the new file
  • If a tag (like “car”, “insurance”, etc.) appears in the content, the file is tagged with it (newly added tags are also searched for in files that are already tagged)
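
The matching rule described above can be sketched as follows. This is in Python rather than the PHP mentioned in the post, and the function name and the whole-word, case-insensitive matching are assumptions; the post only says that a tag must appear in the content. Reading the text from the index and writing the tag back to Nextcloud are left out.

```python
import re


def matching_tags(text: str, tags: list[str]) -> list[str]:
    """Return the tags that occur as whole words in text, ignoring case."""
    found = []
    for tag in tags:
        # \b keeps "car" from matching inside "carpet".
        if re.search(rf"\b{re.escape(tag)}\b", text, flags=re.IGNORECASE):
            found.append(tag)
    return found
```

Whole-word matching avoids false hits, but it will miss tags buried inside compound words, so a plain substring test may suit some languages better.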

With this solution I am able to search inside my documents and also have a classification for a quick overview without opening them or searching :slight_smile:

If you have any questions, just ask. I’m also looking for further ideas to extend my solution.

Best regards
Peti

Thank you Peti.
I tried uploading scanned documents both as image-only PDFs and as PDFs with OCR applied by the scanner. Then I tried converting them with Tesseract. But Full text search with Elasticsearch didn’t index the resulting files.
Which versions of Tesseract and Elasticsearch are you using?
I was thinking of listing keywords that map to different folders and moving each document into the matching folder at the end of the process.
Marco
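
The keyword-to-folder idea could be sketched like this in Python. The function name, the rule format (folder name mapped to a list of keywords), the first-match-wins order, and the `Unsorted` fallback are all assumptions for illustration; the actual move of the file is left out.

```python
def choose_folder(text: str, rules: dict[str, list[str]], default: str = "Unsorted") -> str:
    """Return the first folder whose keywords occur in text, else the default."""
    lowered = text.lower()
    # Rules are checked in the order they were written, so put specific ones first.
    for folder, keywords in rules.items():
        if any(keyword.lower() in lowered for keyword in keywords):
            return folder
    return default
```

For example, with `rules = {"Insurance": ["insurance", "policy"], "Car": ["vehicle", "garage"]}`, a document containing “policy” would go to `Insurance`, and one matching nothing would go to `Unsorted`.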

@Peti
Sounds interesting to me. How did you do that? Maybe we can chat? Maybe you would like to help me integrate it into my current Nextcloud environment?
Regards
Seppos

Hi…
Looks good. It’s a very interesting approach. Maybe you could write a short tutorial describing your steps.
It would be nice to have an app for this in NC.

Greetings
micky1067

A document management app would be of great benefit to our group as we attempt to publish our club meeting minutes from the last ten to fifteen years. The content covers 50-60 categories across several thousand individual document files. We use Nextcloud as an archival store, and as a limited source for publishing via web pages. I am not a Linux admin, so getting Peti’s solution running seems daunting if not impossible.

Love this. We want to build a database, but we don’t want anyone to be able to modify it.

I am looking for something similar. I want to migrate from OpenKM. However, I could not solve the tagging part. Can you share how you do the tagging?

Admins can define global tags in
https://cloud.server.tld/settings/admin

The user can go to Files, open the details of a file, and add the tag at the top (three dots).
Then the user can search the tag.

documentation

Thank you very much for the info. Where can I find more information about this?
Best Eduard

Two people, same idea at the same time.

I scan to PDF via Samba onto my Univention server.
Every 15 minutes a cron job fetches the PDF files, generates searchable PDFs using Tesseract, and moves them via WebDAV into my Nextcloud Docker instance. The source files are then removed.
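
The WebDAV step could look roughly like this in Python. It is a sketch, not the script from this setup: the function names, the example host, and the use of basic authentication are assumptions. The `remote.php/dav/files/<user>/` path is the standard Nextcloud WebDAV endpoint for a user’s files.

```python
import urllib.request
from base64 import b64encode
from urllib.parse import quote


def webdav_url(base: str, user: str, remote_path: str) -> str:
    """Build the Nextcloud WebDAV URL for a file in the user's storage."""
    path = quote(remote_path.lstrip("/"))  # keeps "/" but escapes spaces etc.
    return f"{base.rstrip('/')}/remote.php/dav/files/{quote(user)}/{path}"


def build_upload(base: str, user: str, password: str,
                 remote_path: str, data: bytes) -> urllib.request.Request:
    """Prepare a PUT request that uploads data to remote_path."""
    token = b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        webdav_url(base, user, remote_path),
        data=data,
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/pdf",
        },
    )
```

Sending the request is then `urllib.request.urlopen(build_upload(...))`. Using a Nextcloud app password instead of the account password is a sensible choice for a script like this, and the source file should only be removed after the upload succeeded.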

Works like a charm.

First of all your scanner must support scanning to a folder via Samba.
You will need to create a Samba-share for access by the scanner.
I used these German documents as basic information:

https://docs.nextcloud.com/server/22/user_manual/de/files/access_webdav.html

A post was split to a new topic: Alfresco and Nextcloud for document management

Hi Peti, thread’s kinda old but I just started looking into this myself. Why do you scan to tiff first? PDFs can have multiple pages. Is there some benefit to using tiff because of what you’re scanning with?

I just started looking at Paperless-ngx. It’s FOSS and I was thinking about moving my NC server onto a Yunohost server, adding Paperless-ngx and setting up SSO. Might have to figure out how to make NC see the docs in p-ngx or else migrate the docs into the NC folders. Maybe use NC’s external storage feature? Has anyone made any progress on this?

This project looks amazing, thanks for sharing.

I could imagine a setup where one would create a folder on the host server to ingest documents. That folder could be made available via SFTP or NFS. One could configure the folder as the “consumption” folder for paperless-ngx, as well as external storage in Nextcloud.

This would at least match my workflow, where I scan documents on iOS with the Nextcloud app, directly uploading into a folder.

As far as I understand how paperless-ngx works, integrating it further with Nextcloud wouldn’t provide any benefit. It already comes with a UI and a list of clients as well.

The only thing that needs setup and testing is a backup strategy for the resulting “media” directory and the database.