Is there a way to find duplicate documents?

I'm looking for documents in different folders that have different filenames but the same content.

Thank you!

I don’t think there’s a built-in way in Nextcloud (feel free to write an app though!). I use fdupes in the terminal for this; it compares file hashes.
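For anyone who wants to see the idea behind hash-based duplicate detection, here is a minimal Python sketch. It is not fdupes itself and not a Nextcloud feature; the function names `file_sha256` and `find_duplicates` are made up for illustration. It assumes you have direct filesystem access to the folder you want to scan, and it only groups files by size first and then by SHA-256 of the content.

```python
import hashlib
import os
from collections import defaultdict


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return groups (lists) of paths under root whose content is identical."""
    # Step 1: group by size, since files of different size cannot be equal.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Step 2: hash only the files that share their size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[file_sha256(path)].append(path)

    # Keep only hashes that occur more than once.
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Calling `find_duplicates("/some/folder")` (the path is a placeholder) returns a list of groups, each group being the paths that share the same content regardless of filename. The sketch only reports duplicates and never deletes anything; if you remove files directly on a Nextcloud data directory, the server will not know about it until its file index is rescanned.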

I will start storing a hash of the content and see if I can search for duplicate hashes in Elasticsearch. After that, I will see whether I can find almost identical content, but this is not in the near future.

I use FSlint on a Linux box to tidy up my old media server (tons of family baby videos with serious duplication). It is very powerful.

Hi, sorry to dig this up. I just want to verify this:
Does the Nextcloud file database not store hashes, but “only” timestamps?

greetings

Old thread, but people may want to see the solutions now, and this is also found via search engines, so:

  • There is MediaDC nowadays, which has quite a lot of features but depends on some other software (mostly Python). It was also featured in the blog and can find non-exact duplicates (similar images and videos).
  • Simpler and lesser known is the duplicate finder, which seems to just compare the exact file hash.

Apart from that, there are many threads here that are duplicates and ask for the same thing: