AI and Photos 2.0 - In-Depth Explanation of Nextcloud Recognize and How it Works

Just for clarification: the Photos2 app uses the Recognize app, which in turn uses dlib for face detection? And the results are displayed in the Photos2 app as “Persons”?

I have two Nextcloud instances where we (the family) auto-upload the pictures from our smartphones. Over the past years we have collected quite a large number of pictures there, including the ones we take with professional cameras.

Both instances now run the public Nextcloud 25 release. The bigger one (Intel NUC, Core i3, Ubuntu 22.04) runs the facerecognition app, and the Raspberry Pi 4 (Debian bullseye, arm64) runs the Recognize app.

The results are worlds apart! The Photos2 app is mostly unusable (also due to bugs and speed): if you add a new person to an existing one, you have to reload all photos again, otherwise you cannot add another unknown person because the lightbox simply does not appear. I would roughly guess the error rate at 60-70 percent, with pictures of me, my wife and our kids mixed into a single person.
The facerecognition app makes far fewer errors when recognizing faces and persons, and it does not hog the complete system (which Recognize does even if I set the number of CPUs it may use to 1).

I really love the approach of having recognition directly in the Photos2 app, but I think it would make sense to let users adjust a few settings, as is possible in Matias Delellis's facerecognition: for example cluster confidence, picture size, thresholds and maybe even the models. As it is right now, Recognize seems to be unusable, at least for face recognition, even though it seems to run the same library for this job. On a Raspberry Pi it consumes a lot of computing power over days and weeks, with results that are far from satisfactory.
Apart from that, Matias Delellis's facerecognition takes a modular approach in which the dlib part can run on a different machine via Docker. This feature greatly speeds things up for my Raspberry Pi and also my Intel NUC, since the work is offloaded to my editing machine with two Nvidia cards. I am still working on a CUDA-compiled dlib there, but it already makes use of the machine's raw processing power. It is a well-thought-out approach in many ways, and it even produces usable output.
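
To illustrate why an adjustable threshold matters, here is a minimal sketch of threshold-based face clustering. It is not the actual algorithm of Recognize or facerecognition; the function name `cluster_faces`, the toy 2-D "embeddings" and the threshold values are all made up for illustration (real face descriptors have many more dimensions). It only shows how a threshold that is too loose merges different people into one "person".

```python
from math import dist  # Euclidean distance, Python 3.8+


def cluster_faces(embeddings, threshold):
    """Group face embeddings into clusters (hypothetical helper).

    Two faces end up in the same cluster if they are linked by a chain of
    pairs whose Euclidean distance is below `threshold` (single linkage).
    Returns a sorted list of clusters, each a list of indices.
    """
    parent = list(range(len(embeddings)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if dist(embeddings[i], embeddings[j]) < threshold:
                parent[find(j)] = find(i)  # merge the two clusters

    clusters = {}
    for i in range(len(embeddings)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy data: faces 0 and 1 belong to one person, faces 2 and 3 to another.
faces = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0)]

strict = cluster_faces(faces, threshold=0.3)   # keeps the two people apart
loose = cluster_faces(faces, threshold=0.95)   # merges everyone into one cluster
print(strict, loose)
```

With the strict threshold the two people stay separate, while the loose one produces a single mixed cluster, which is the kind of result I see in Photos2. Being able to tune this value per installation would already help a lot.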

Still, I am happy to test the next releases and am really looking forward to seeing a usable AI within Nextcloud, on both instances. :slight_smile:

PS: Another way to speed things up would be a machine check: if CUDA is installed along with GPUs that support it, use a very fast decoder/encoder/transcoder setup for the video parts via the h264 or h265 _nvenc or _qsv video codecs. Most ffmpeg installs on Linux are already compiled with CUDA or Intel QSV support. Maybe also check whether the GPU can be used to speed up the Recognize AI :slight_smile:
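
Such a machine check could look roughly like the sketch below. It assumes `ffmpeg` and, for Nvidia cards, `nvidia-smi` are on the PATH; the helper names are hypothetical and this is not how Recognize works today. In ffmpeg the H.265 encoders are named `hevc_nvenc` and `hevc_qsv`. Note that an encoder being compiled into ffmpeg does not guarantee that the hardware is actually present, so a real check would also need a test encode.

```python
import shutil
import subprocess

# Hardware encoder names as they appear in `ffmpeg -encoders`.
HW_ENCODERS = ("h264_nvenc", "hevc_nvenc", "h264_qsv", "hevc_qsv")


def parse_hw_encoders(encoders_output):
    """Return the hardware encoders listed in `ffmpeg -encoders` output."""
    found = []
    for line in encoders_output.splitlines():
        fields = line.split()
        # Encoder rows have a flags column first, then the encoder name.
        if len(fields) >= 2 and fields[1] in HW_ENCODERS:
            found.append(fields[1])
    return found


def detect_hw_encoders():
    """Ask the local ffmpeg which hardware encoders it was built with."""
    if shutil.which("ffmpeg") is None:
        return []
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-encoders"],
        capture_output=True, text=True, check=False,
    )
    return parse_hw_encoders(result.stdout)


def has_nvidia_tools():
    """Rough hint for an Nvidia driver install: is nvidia-smi on the PATH?"""
    return shutil.which("nvidia-smi") is not None


if __name__ == "__main__":
    print("Hardware encoders in ffmpeg:", detect_hw_encoders())
    print("nvidia-smi available:", has_nvidia_tools())
```

If the list is empty, the app would simply fall back to software encoding (for example libx264).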