Command line tool to find duplicate images

Hello community

I have NextcloudPi and the following problem. The hard disk connected to my Raspberry Pi holds a huge number of images, and many of them are duplicates with different file names. I'm looking for a command line tool to find duplicate images. The scan should analyze the image content, not the file name.

Is this possible with fdupes, or does that tool only find duplicate files that share the same file name?

If they ever finish adding checksums to the database, you can search it for matching checksums and find duplicate files that way.

For now, you would need to use something like fdupes or another script. If your filesystem supports deduplication, you can also use that.

I have fdupes installed on my Pi. Will it really find duplicate images even if they have different file names? dupeGuru works fine for me on my Windows laptop: I connect the Raspberry Pi's images folder to my computer and scan it. That works, but it takes a long time.

I would rather run the search directly on the Raspberry Pi from the command line, which is why I installed fdupes. My only question is: will it really find duplicate images?

I’ve used a few of these tools once or twice. If they scan the files and compare checksums, then yes, they can find duplicate files.
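To illustrate why the file name does not matter for checksum-based tools, here is a minimal sketch. The helper name `same_content` and the file names are made up for the example, and it assumes GNU `md5sum` is installed (it normally is on a Raspberry Pi).

```shell
# Sketch: a checksum depends only on file content, never on the file name.
# same_content is a hypothetical helper; assumes GNU md5sum is available.
same_content() {
    # Hash via stdin so md5sum prints no file name, then compare the hashes.
    [ "$(md5sum < "$1")" = "$(md5sum < "$2")" ]
}

# Usage example with throwaway files (names are made up):
demo=$(mktemp -d)
printf 'pixel data' > "$demo/IMG_0001.jpg"
cp "$demo/IMG_0001.jpg" "$demo/holiday.jpg"
if same_content "$demo/IMG_0001.jpg" "$demo/holiday.jpg"; then
    echo "duplicate content"
fi
rm -rf "$demo"
```

Two files with different names but identical bytes produce the same hash, so a tool that compares checksums should report them as duplicates.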

OK, it seems that fdupes is the easiest way to find duplicate files, and since it compares MD5 checksums it will work correctly.

I think another way to find duplicate images on the command line is as follows:
find Pictures/ -type f -exec md5sum '{}' ';' | sort | uniq --all-repeated=separate -w 32 > dupes.txt

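This finds duplicates by generating an MD5 hash for each file and matching the hashes, then using sort and uniq to write the file names to a text file, with duplicates listed together and each group separated by a blank line. Only duplicated files are listed; files without a duplicate are omitted. Because the hash covers the raw file content, this catches byte-identical copies only, not the same picture saved at a different size or in a different format.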
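The pipeline above could also be wrapped in a small function so it can be pointed at any folder. This is only a sketch: the name `find_dupes` is made up for illustration, and it assumes GNU findutils and coreutils (`md5sum`, `uniq --all-repeated`). It compares the full 32-character MD5 hash and hands many files to each `md5sum` call, which should be somewhat faster than starting one process per file.

```shell
# Sketch: list byte-identical files under a directory, grouped by MD5 hash.
# find_dupes is a hypothetical helper name; assumes GNU findutils/coreutils.
find_dupes() {
    # 1. Hash every regular file ("+" batches many files per md5sum call).
    # 2. Sort so that identical hashes end up on adjacent lines.
    # 3. Compare only the first 32 characters (the hash itself) and print
    #    every group of repeated hashes, separated by a blank line.
    find "${1:-.}" -type f -exec md5sum {} + |
        sort |
        uniq -w 32 --all-repeated=separate
}
```

For example, `find_dupes Pictures/ > dupes.txt` would write the groups to a text file in the same way as the one-liner.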

Based on this, I created a script that will find and tag all duplicates for one particular user:
