... will not be accessible due to incompatible encoding

I am having problems with my Nextcloud installation. After running a file scan, all my files with German umlauts (äöü) in their names vanished. The only error message is:
“Entry “/Timo/files/Universität/ComputationalPhysics/Python für Physiker/Einführung in die Programmiertechniken (Prof. Dr. L. Pollet - SoSe 2017)/Übungsblätter” will not be accessible due to incompatible encoding”.
I remember that I even uploaded the data through Nextcloud in the browser. I am hosting Nextcloud on a Mac mini, in a VM running Ubuntu 20.04 LTS.
Does anyone know how to solve this?


:wink: Nice. Umlauts should finally be abolished.
https://en.wikipedia.org/wiki/Germanic_umlaut

Perhaps you will find a solution here:
https://github.com/nextcloud/server/issues/3136

convmv -f utf-8 -t utf-8 -r --notest --nfc <nextcloud-data-folder>
Try it on a test folder first.
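
Before renaming anything, it can help to see which names are affected. Below is a minimal Python sketch (an illustration only, not part of convmv or Nextcloud) that lists entries whose names are not in NFC, the normalization form that `--nfc` converts to. The function names `is_nfc` and `find_non_nfc` are made up for this example, and the code only reads the directory tree; it never renames anything.

```python
import os
import unicodedata


def is_nfc(name):
    """True if the name is already in Unicode normalization form C."""
    return unicodedata.normalize("NFC", name) == name


def find_non_nfc(root):
    """Return paths under root whose file or directory name is not NFC."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not is_nfc(name):
                hits.append(os.path.join(dirpath, name))
    return hits
```

Calling `find_non_nfc("/path/to/test-folder")` (the path is a placeholder) returns a list of candidates. An empty list would suggest that normalization is not the cause on that folder.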


Absolutely :smiley:!
That's the strange part: my data is already UTF-8, and unfortunately convmv did not help…

OK, I can give you two commands for converting file content (not file names).
The locale on this Linux system is UTF-8, but my editor (vim) breaks it for Windows systems.
Perhaps file names behave just as badly, even with correct UTF-8.

UTF-8 to Windows:
iconv -f UTF-8 -t WINDOWS-1252 "$1" > "$2"
Windows to UTF-8:
iconv -f ISO-8859-1 -t UTF-8 "$1" > "$2"

Dear Community,
I have the same problem. Everything has worked fine so far. Files with German umlauts (äöü) are stored via Nextcloud in the data directory, and I can edit and access everything properly. Only the command occ files:scan produces the error message “will not be accessible due to incompatible encoding”. The filesystem is UTF-8, and so is the database:

[client]
default-character-set = utf8mb4

[mysqld]
character-set-server = utf8mb4
collation-server = utf8mb4_general_ci
transaction_isolation = READ-COMMITTED
binlog_format = ROW
innodb_large_prefix=on
innodb_file_format=barracuda
innodb_file_per_table=1

The php.ini:
default_charset = "UTF-8"

I use a vServer running Ubuntu 16.04 LTS with PHP 7.3 and Apache.

I don’t know what to do, but I need the scan: I have imported a backup of my data and wanted to scan the files.

I have executed the following commands, unfortunately without success:

convmv -f utf-8 -t utf-8 -r --notest --nfc <nextcloud-data-folder>
php occ files:scan --all

Thank you very much
Christian


Hi Christian,

I have the same problem here and dug a bit deeper. As an example, I took a file named “Hände.doc” (the storage backend is a Windows machine).

On the Linux side (where NC is running) I ran:

strace ls H*.doc
[…]
stat("Ha\314\210nde.doc", {st_mode=S_IFREG|0770, st_size=75340, ...}) = 0

Here I can clearly see the raw encoding: “H” + “a”, followed by the bytes 0xCC 0x88 (the UTF-8 encoding of U+0308, COMBINING DIAERESIS).

=> I conclude that the OS / filesystem is clearly using UTF-8 already. This explains why all attempts to “convert” the file names to UTF-8 have no effect. So it looks like a bug in the application to me.

I guess it is the code in ‘Scan.php’, which does a simple comparison of the original path with the “normalized” path. Maybe this causes the trouble, because the character “ä” could also be encoded directly as a single precomposed code point, U+00E4, which is 0xC3 0xA4 in UTF-8? (Just a guess.)
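
To make the two spellings of “ä” concrete, here is a small Python sketch. The function `passes_normalization_check` is a hypothetical stand-in for the kind of comparison guessed at above; it is not Nextcloud's actual code, and whether the scanner normalizes to NFC in exactly this way is an assumption.

```python
import unicodedata

# Same visible name, two different code point sequences.
precomposed = "H\u00e4nde.doc"   # 'ä' as one code point, U+00E4 (form NFC)
decomposed = "Ha\u0308nde.doc"   # 'a' + U+0308 COMBINING DIAERESIS (form NFD)

print(precomposed.encode("utf-8"))  # b'H\xc3\xa4nde.doc'
print(decomposed.encode("utf-8"))   # b'Ha\xcc\x88nde.doc'

# Both are valid UTF-8, but they are not equal as strings or as bytes.
print(precomposed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True


def passes_normalization_check(name):
    # Hypothetical check: a name "passes" only if normalizing changes nothing.
    return unicodedata.normalize("NFC", name) == name


print(passes_normalization_check(precomposed))  # True
print(passes_normalization_check(decomposed))   # False
```

If the scanner does something along these lines, a decomposed name would be rejected even though it is perfectly valid UTF-8, which would also explain why converting "from UTF-8 to UTF-8" without changing the normalization form has no effect.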

BTW: I also get tons of similar messages for other files with non-ASCII names, for example with Greek, Chinese or Spanish characters…

Am I right that all this is only a “warning” with no other consequences?

regards,
Thomas

I have the same problem: I ran occ files:scan and was told that the file “Sven Väth_Essential Mix.mp3” will not be accessible due to incompatible encoding, along with many other files that have ä, ö, å or other non-ASCII letters in their names.

The file is neither visible in the web interface nor synced via the Nextcloud client.

So no, this is NOT just a warning.

BUT weirdly enough, most files with non-ASCII letters are accepted. I’ll try the strace method proposed by @dg6nee and report back.

Okay, I found two comparable files:

The first has incompatible encoding according to the scan output.

Entry "/Elias Doré/Elias Doré @ Rebellion der Träumer, 07.05.16_263237326 - Elias Doré.mp3" will not be accessible due to incompatible encoding

strace shows

lstat64("Elias Dore\314\201 @ Rebellion der Tra\314\210umer, 07.05.16_263237326 - Elias Dor\303\251.mp3", {st_mode=S_IFREG|0777, st_size=178917458, ...}) = 0

But this one throws no error:

File /Elias Doré/Bucht der Träumer 2019 - Atlantis, Friday Night_668487995 - Elias Doré.mp3

lstat64("Bucht der Tr\303\244umer 2019 - Atlantis, Friday Night_668487995 - Elias Dor\303\251.mp3", {st_mode=S_IFREG|0777, st_size=169984834, ...}) = 0

Can anyone read this fluently? I have very little experience with weird encoding errors.
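
One way to read these lines is to turn strace's octal escapes back into text and look for combining marks. The following Python sketch does that; `decode_strace_name` and `combining_marks` are names invented for this example, and the decoder only handles octal escapes like `\314\210`, not the other escapes strace can emit.

```python
import re
import unicodedata


def decode_strace_name(escaped):
    """Turn strace's octal escapes (e.g. \\314\\210) back into a Unicode string."""
    raw = re.sub(
        rb"\\([0-7]{1,3})",
        lambda m: bytes([int(m.group(1), 8)]),
        escaped.encode("ascii"),
    )
    return raw.decode("utf-8")


def combining_marks(name):
    """List (index, Unicode name) for every combining character in the name."""
    return [(i, unicodedata.name(ch))
            for i, ch in enumerate(name)
            if unicodedata.combining(ch)]


rejected = decode_strace_name(
    r"Elias Dore\314\201 @ Rebellion der Tra\314\210umer, "
    r"07.05.16_263237326 - Elias Dor\303\251.mp3"
)
accepted = decode_strace_name(
    r"Bucht der Tr\303\244umer 2019 - Atlantis, Friday Night_668487995 "
    r"- Elias Dor\303\251.mp3"
)

for label, name in (("rejected", rejected), ("accepted", accepted)):
    is_nfc = unicodedata.normalize("NFC", name) == name
    print(label, "NFC:", is_nfc, "combining marks:", combining_marks(name))
```

Read this way, the rejected name mixes both forms: “e” + COMBINING ACUTE ACCENT and “a” + COMBINING DIAERESIS (decomposed), but also a precomposed “é” (`\303\251`) near the end. The accepted name uses only precomposed characters, which fits the guess that only names not in NFC are refused.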

I use rsync to back up my Lightroom Classic photos from my Mac to Nextcloud and make them accessible. So I can’t use the convmv command unless I want to run it every time I sync my files. The answer for me is in that great GitHub link you shared:

Can you try doing the encoding conversion while copying (rsync -a --iconv=utf-8-mac,utf-8 ...)

However, my Mac has an older version of rsync that does not support --iconv. According to what I found online, I will have to upgrade rsync. Before I try this, has anyone else had success with this rsync command?

For now, I am going to stick with renaming my files in Lightroom to remove the umlauts. There are not that many of them.