[Bug] Various vcf file encoding problems

Hello all,

I am running my annual Christmas card cycle again using my Nextcloud contact database.
I noticed there are quite a few changes in Nextcloud this year. Most of them are improvements, but I stumbled upon a couple of setbacks that I want to share before going into bug-reporting mode.

#1 Some contacts were reported as corrupt, but when I tried to fix them (with the popup button at the top) I got a spinning circle that ran forever, eventually crashing my browser. I read in a post about the workaround of opening and saving them in Thunderbird. That worked: one by one I fixed a handful out of 98 addresses.

#2 When I downloaded the Christmas card category for post-processing (I run a Python script to set all the LABEL fields with the correctly formatted postal address for address stickers) I ran into an error.
Apparently the vcf export lacks a newline between contacts, resulting in lines with
END:VCARDBEGIN:VCARD
which my script unfortunately couldn’t handle. I had to do some regexp work in Emacs to fix those :-)
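
For reference, the same repair can be sketched in Python. This is only a minimal sketch, not the script mentioned above: the function name `split_fused_vcards` is made up for illustration, and it assumes the only damage is a missing line break between END:VCARD and BEGIN:VCARD.

```python
import re

# Matches an END:VCARD that is directly followed by BEGIN:VCARD on the same line.
FUSED = re.compile(r"END:VCARD(?=BEGIN:VCARD)")


def split_fused_vcards(text, newline="\r\n"):
    """Insert the missing line break between two fused cards.

    vCard files normally use CRLF line endings; pass newline="\n"
    if the file has already been converted to Unix line endings.
    """
    return FUSED.sub("END:VCARD" + newline, text)


if __name__ == "__main__":
    broken = "BEGIN:VCARD\r\nFN:A\r\nEND:VCARDBEGIN:VCARD\r\nFN:B\r\nEND:VCARD\r\n"
    print(split_fused_vcards(broken))
```

Cards that are already separated correctly are left untouched, so running it twice does no harm.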

#3 I think the encoding of the characters changed. Last year I could read the international characters in a console: François. This year’s version is Fran�ois
I assume this was a change for the better, but it broke my fgrep counting as well. It took me quite some time to realise that I wasn’t missing contacts; they were simply not counted because fgrep skips ‘binary’ lines unless you give it the -a option.
If someone can give me the background on this format change, I’d be interested.
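
A way to count cards without depending on how grep treats 'binary' lines is to work on the raw bytes. The sketch below is an illustration only; `count_vcards` and `is_valid_utf8` are hypothetical helper names. It also checks whether the file decodes as UTF-8, since a UTF-8 terminal typically shows � for bytes that are not valid UTF-8.

```python
def count_vcards(data: bytes) -> int:
    """Count cards by counting BEGIN:VCARD markers in the raw bytes.

    Working on bytes means the result does not depend on the file's
    character encoding, and fused END:VCARDBEGIN:VCARD lines are
    still counted correctly.
    """
    return data.upper().count(b"BEGIN:VCARD")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # Contacts.vcf is the export file name used in this thread.
    with open("Contacts.vcf", "rb") as handle:
        raw = handle.read()
    print(count_vcards(raw), "cards, valid UTF-8:", is_valid_utf8(raw))
```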

I’m now at the stage where I have all my addresses in a correctly formatted vcf file with the right LABEL field. But I wanted to share my experience. At least #2 is a real bug, I think. #1 is a bug, but I read it is being investigated. #3 is an encoding side effect, I assume.

Btw, I would still be interested in automatic filling of the ADR LABEL field for formatted addresses.
This should be filled based on the local address format convention so you can print the label on envelopes for mailings. I have a Python script to do that now, but it would be great if that could be integrated into Nextcloud Contacts.
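
To illustrate the idea (this is not the script mentioned above), here is a minimal sketch that builds a LABEL line from an ADR value. It relies on the vCard 3.0 component order for ADR (post office box, extended address, street, locality, region, postal code, country). The helper names and the "street / postal code + city / country" layout are assumptions; other countries would need a different layout.

```python
import re

# ADR component order as defined for vCard 3.0 (RFC 2426).
ADR_FIELDS = ("pobox", "extended", "street", "locality",
              "region", "postal_code", "country")


def parse_adr(value: str) -> dict:
    """Split an ADR value on unescaped semicolons (simplified parsing)."""
    parts = re.split(r"(?<!\\);", value)
    parts += [""] * (len(ADR_FIELDS) - len(parts))  # pad missing components
    return dict(zip(ADR_FIELDS, parts))


def format_label(adr: dict) -> str:
    """Build a LABEL line; the layout used here is an assumed example."""
    lines = [
        adr["street"],
        " ".join(p for p in (adr["postal_code"], adr["locality"]) if p),
        adr["country"],
    ]
    # vCard text values encode a line break as the two characters "\n".
    return "LABEL:" + "\\n".join(line for line in lines if line)


if __name__ == "__main__":
    adr = parse_adr(";;Main Street 1;Amsterdam;;1234 AB;Netherlands")
    print(format_label(adr))
```

Empty components are skipped, so a contact without a country yields a two-line label.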

Ah, almost forgot. I upgraded to Nextcloud 17.0.1 while running into the above problems, hoping it would solve some of them, so everything I have written is based on that version.

Kind regards,

Bert

Hi all,

Short follow-up. I noticed that #3 has more severe implications for me. When importing the contacts into glabels, the characters aren’t shown correctly either, just like at the terminal prompt. Emacs shows them correctly, though.

What encoding is used here?

Bert

Afaik, Nextcloud stores all address data the way it gets imported. So if you use an external source, e.g. your smartphone, it takes the data without changing or checking it. Only when you access a record using the GUI is a check done, and the syntax can then be fixed somehow.

Hi J-ed

I have checked in Emacs and the encoding is ISO Latin-1, also known as ISO 8859-1.
I managed to convert the vcf file with iconv, and the file looks good now. But I have no idea why or how it got converted :-(
I’m pretty sure some of the contacts have not been touched since last year. I only use Thunderbird, the Nextcloud web interface and my Android phone to connect to the contacts. Which one could be the culprit?
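
For anyone without iconv at hand, the conversion can also be sketched in Python. This is a minimal sketch assuming the file really is ISO 8859-1; `latin1_to_utf8` is a hypothetical helper name. Data that already decodes as UTF-8 is returned unchanged, so a file is not converted twice by accident.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO 8859-1 bytes as UTF-8.

    Assumption: anything that is not valid UTF-8 is ISO 8859-1.
    Plain ASCII and valid UTF-8 input are returned as they are.
    """
    try:
        data.decode("utf-8")
        return data  # already valid UTF-8 (or plain ASCII)
    except UnicodeDecodeError:
        return data.decode("iso-8859-1").encode("utf-8")


if __name__ == "__main__":
    # File names are examples only.
    with open("Contacts.vcf", "rb") as src:
        converted = latin1_to_utf8(src.read())
    with open("Contacts-utf8.vcf", "wb") as dst:
        dst.write(converted)
```

The check is a heuristic: a Latin-1 file whose non-ASCII bytes happen to form valid UTF-8 sequences would be left alone, although that is unlikely for ordinary names and addresses.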

Any suggestion is welcome.

Bert

Nowadays all mobile phones and desktops should work with UTF-8 encoding. Have you perhaps used a Win7 desktop PC? Could the bigint/utf-8 migration of the database tables have caused the problem?

Hi J-ed,

Nope, no Windows 7. I do have Thunderbird on my Windows 10 laptop from work, though. That is the closest I can think of.
I guess I will download all contacts, delete them online and upload them again as UTF-8, to start afresh.

Thanks for your feedback. Can you do me a little favor, though? Can you check my point #2? I’d like to have confirmation from someone else before filing a bug report.
In my case, if I download a category of contacts or all contacts in a single vcf file
and do a fgrep BEGIN Contacts.vcf
I see there is a newline missing between cards, so there are a lot of
END:VCARDBEGIN:VCARD
lines.
I am wondering if you see the same thing.

Regards,

Bert

Yes, I can confirm the described behavior with my contacts too.

> grep "END:VCARD" export.vcf
END:VCARDBEGIN:VCARD
END:VCARDBEGIN:VCARD
END:VCARDBEGIN:VCARD
END:VCARDBEGIN:VCARD
END:VCARDBEGIN:VCARD
END:VCARDBEGIN:VCARD
END:VCARDBEGIN:VCARD
END:VCARDBEGIN:VCARD
END:VCARDBEGIN:VCARD
END:VCARDBEGIN:VCARD
END:VCARDBEGIN:VCARD
END:VCARDBEGIN:VCARD
END:VCARDBEGIN:VCARD
END:VCARD

Thanks for the confirmation!
I filed a bug report.

Regards,

Bert


Closing, please see the GitHub issue.