Reliable pipeline to import contacts from Google?

I read a few questions here about bugs related to importing contacts from Google, e.g. this one. But a lot of those threads are fairly old, and my issue isn’t with a specific bug, but rather, I’m looking for a reliable pipeline for getting my contacts out of Google and into the Nextcloud app.

I’ve tried several approaches, as I explain below, but all of them have silently errored in ways that make it hard for me to trust them. If the import process fails on a few contacts and requires me to manually make corrections, that’s fine, but I would like to be alerted about which data failed to parse, rather than discover 6 months from now that I don’t have someone’s address where I thought I did.

Note that Google contacts provides 3 export formats:

  • Google CSV
  • Outlook CSV
  • vCard (.vcf)

Attempt 1

I exported vCard from Google, then imported this file using the Nextcloud Contacts web interface.

Result: Many silent errors. I had n contacts in Google, but Nextcloud showed only 0.9n contacts after import. While importing, Contacts said there were “13 import errors,” but did not name which contacts errored, and the true number of lost contacts was much greater than 13. Later on, I was able to identify a few of the missing contacts by spot checking, but this is a bit of a hassle, and even for the contacts that did import there were still some missing fields.
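One way to find out which contacts were lost, rather than spot checking, is to compare the Google export against a .vcf downloaded back out of Nextcloud after the import. The following is only a rough sketch under a few assumptions: that the imported address book can be downloaded as a single .vcf, that the FN (display name) field is a good enough key for matching, and that the helper names (`split_vcards`, `display_name`, `missing_names`) are hypothetical ones made up for this example.

```python
import re
from collections import Counter


def split_vcards(text):
    """Return the raw text of each BEGIN:VCARD ... END:VCARD block."""
    # Unfold continuation lines: a line break followed by one space or tab
    # continues the previous line.
    unfolded = re.sub(r"\r?\n[ \t]", "", text)
    return re.findall(
        r"BEGIN:VCARD.*?END:VCARD", unfolded, flags=re.DOTALL | re.IGNORECASE
    )


def display_name(card):
    """Return the FN value of one card, or a placeholder if it has none."""
    for line in card.splitlines():
        prop, _, value = line.partition(":")
        # Drop parameters such as ";CHARSET=UTF-8" before comparing the name.
        if prop.split(";")[0].upper() == "FN":
            return value.strip()
    return "<no FN>"


def missing_names(exported_text, imported_text):
    """Names present in the export but absent from (or rarer in) the import."""
    exported = Counter(display_name(c) for c in split_vcards(exported_text))
    imported = Counter(display_name(c) for c in split_vcards(imported_text))
    # Counter subtraction keeps duplicates: two "John Smith" cards exported
    # and one imported still reports one missing "John Smith".
    return sorted((exported - imported).elements())
```

Reading both files with `open(path, encoding="utf-8").read()` and printing `missing_names(google_text, nextcloud_text)` gives a list of names to check by hand. It only detects contacts that are missing entirely, not contacts that imported with dropped fields.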

Attempt 2

I exported Google CSV from Google, then imported this file into the Thunderbird desktop app on Linux. Thunderbird syncs to my Nextcloud with TBSync, but I imported them into a scratch address book to clean them up before uploading.

Result: Abject failure. Thunderbird basically fails to parse the CSV from Google. There is a screen where you can manually map the CSV columns to Thunderbird’s information fields, but the individual cells of the CSV themselves contain junk text. For example, one cell, parsed as a contact’s Birthday, read something like Address 1: 1234 - Address 2: 5678 - . Basically, the CSV from Google itself is too much of a mess to salvage.

Attempt 3

I exported vCard from Google and imported this into Thunderbird.

Result: Many silent errors. All n contacts are imported, but lots of fields are dropped. In particular, Google lets you have an arbitrary number of addresses, whereas Thunderbird only accepts a Private and Work address. In practice, though, Thunderbird didn’t read in any work addresses at all and just plopped the first address listed in Google into the Private address field, sometimes with missing fields.

In conclusion

I’m OK with doing a little bit of wrangling for a couple of contacts that have truly strange names/addresses, but after the 3 failed attempts above, I wanted to ask if anyone on this forum has successfully imported a large contact database from Google and what data pipeline gave the fewest, and noisiest, errors.

Versions

  • Thunderbird v91.11.0
  • TBSync v3.0.2
  • Nextcloud v24.0.3
  • Nextcloud Contacts v4.2.0

Hello,

A little off topic, but I was wondering: did you try the Google contact import app?

There is an app in the Nextcloud app store for exactly this job → Google integration - Apps - App Store - Nextcloud

What’s the result there?

Thanks.

Not off topic at all! I’m reluctant to try it since the app is untested, but I checked out the source code a little bit and at the very least, it appears that it checks for multiple addresses, which is something that Thunderbird does not do: integration_google/GoogleContactsAPIService.php at master · nextcloud/integration_google · GitHub

But I would prefer a solution that uses just the .vcf file, since I can confirm (by inspecting it in a text editor) that this file has all the information I need (incl. multiple addresses) and the issue is with Thunderbird/Nextcloud’s .vcf parser. This is especially perplexing because, as far as I can tell from the source code above, Nextcloud contacts itself stores data internally as a .vcf, so its parser should be able to report errors with the format…
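Instead of scrolling through the file in a text editor, the contacts that carry more than one address can be listed with a short script, which gives a checklist of the entries most likely to lose data in an importer that keeps only one or two addresses. This is a minimal sketch, assuming one property per line with ADR as the address property and FN as the display name; the function name `contacts_with_multiple_addresses` is hypothetical.

```python
def contacts_with_multiple_addresses(vcf_text, limit=1):
    """Return (display name, ADR count) for every card with more than `limit` addresses."""
    results = []
    name, adr_count = "<no FN>", 0
    for line in vcf_text.splitlines():
        # Folded continuation lines start with a space or tab; they never
        # begin a new property, so skip them.
        if line[:1] in (" ", "\t"):
            continue
        prop, _, value = line.partition(":")
        # Strip parameters (";TYPE=HOME") and an optional group prefix
        # ("item1.") so that "item1.ADR;TYPE=WORK" still counts as ADR.
        key = prop.split(";")[0].split(".")[-1].upper()
        if key == "BEGIN":
            name, adr_count = "<no FN>", 0
        elif key == "FN":
            name = value.strip()
        elif key == "ADR":
            adr_count += 1
        elif key == "END" and adr_count > limit:
            results.append((name, adr_count))
    return results
```

With `limit=2` the same function lists only the contacts that exceed a Private-plus-Work address model like Thunderbird's.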

This is supposed to be the tool for the job, as mentioned above. You could also use the Google Takeout tool; it allows you to pick and choose what data to export from Google.

I get that the Google integration app is designed for this purpose, but the app is still untested, and I’d like to know if there are any better/other options.

Google Takeout, as far as I can tell, is just a wrapper around the contacts export feature discussed in my OP. The issue isn’t with getting the data out of Google, it’s with tools to parse this data into Nextcloud.

Recently I have been experimenting with the KDE KAddressBook program, which syncs seamlessly with WebDAV, but also has problems parsing the .vcf files and has no CSV import option AFAICT.

Edit: I tried the Google Integration app, and using it requires a pretty extensive setup. You have to create a project on Google Cloud and create OAuth access keys that you give to the app. This is awesome if you want all the other services you can get through Google Integration, but for a one-time import of contacts it’s probably overkill. I didn’t bother going through with the whole process, so I can’t speak to the quality of the data import.

The developers of the Nextcloud Contacts app don’t owe me anything, but I really think a better .vcf parser would help a lot with adoption among people who don’t want to set up a whole OAuth interface.

After inspecting the logs, I discovered the main source of the import errors, which is that some of my contacts had birthdays without years. Google contacts allows you to do this, but the internal representation in the .vcf is NaN-01-01 which causes Nextcloud Contacts to panic on import. This is a known issue, and apparently a regression from previous versions of the app:
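To find the affected contacts before importing, the .vcf can be scanned for birthdays that do not start with a four-digit year. The sketch below is an assumption-laden pre-import check, not anything Nextcloud itself does: it treats any BDAY value that is not of the form YYYY-MM-DD or YYYYMMDD as suspect, and the function name `find_yearless_birthdays` is hypothetical.

```python
import re

# A birthday is considered complete if it starts with YYYY-MM-DD or YYYYMMDD.
FULL_DATE = re.compile(r"^\d{4}-?\d{2}-?\d{2}")


def find_yearless_birthdays(vcf_text):
    """Return (display name, raw BDAY value) for each birthday lacking a full date."""
    flagged = []
    name, pending = "<no FN>", []
    for line in vcf_text.splitlines():
        prop, _, value = line.partition(":")
        # Ignore parameters such as ";VALUE=date" when matching the property name.
        key = prop.split(";")[0].upper()
        value = value.strip()
        if key == "BEGIN":
            name, pending = "<no FN>", []
        elif key == "FN":
            name = value
        elif key == "BDAY" and not FULL_DATE.match(value):
            pending.append(value)
        elif key == "END":
            # Report at the end of the card so the name is known even when
            # FN appears after BDAY.
            flagged.extend((name, bad) for bad in pending)
    return flagged
```

Each reported contact can then be fixed by hand, either by adding a year in Google before exporting or by editing or deleting the BDAY line in the .vcf.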

After correcting this issue, I was able to import all of my contacts in some form or another. However, there were still many parsing errors that appeared on inspection. For example, some addresses parsed as „1234 Street City, ST" (with a German-style quotation mark at the beginning), and addresses were sometimes broken into constituent fields (Line 1, City, State etc.) and sometimes not… This isn’t exclusively Nextcloud’s fault; it’s the combination of Google’s messy .vcf export and Nextcloud’s handwritten parser, but odds are that if you have a sufficiently well-aged Google contacts database you will run into similar issues.

I’ve decided to hold off on using the Contacts app in Nextcloud for now, because of the bugs above and because of missing essential functionality such as batch editing/deleting, merging, setting the “File As” name (aka display name), etc. Some of this functionality can be replaced if you sync your contacts to a desktop app like Thunderbird, but this introduces its own issues, e.g. the lack of support for multiple addresses I mentioned in the OP.

There are lots of open issues on GitHub about this, including some with bounties, e.g.

Since there are lots of technical issues in the way of batch editing, CSV import functionality would also be helpful: then I could edit Google’s CSV export in a spreadsheet editor and strip out all their weird escape characters and stuff. This is also an old open issue:

In case there’s any misunderstanding, I’m not mad, and don’t mean to come across as whining. I just wanted to factually summarize the problems I’ve run into, so that others who need any of the features above don’t waste their time.


This is the only thing you’ve mentioned that confuses me. Not sure why you consider it untested, since users are using it. Either way, why not try it? Give it a shot. Thanks for linking the explicit app issues you’ve encountered!

I was talking about the red “Enable untested app” warning you get when you search for some apps in the app installation area. But it appears I was mistaken: there is a Google Drive app that is untested, but the main Google integration app doesn’t have this warning after all.

“Enable untested app” is simply the phrase chosen for the app store. It doesn’t actually mean anything beyond the fact that the app’s declared compatibility hasn’t been bumped from 23 to 24 in a text file.

If you’re worried, the only place to learn anything is the app repo. If the app is truly broken, it will not enable, period. The warning is simply a warning. Keep backups of your Nextcloud and you will be fine.
