Very experimental mirrored Nextcloud cluster with logical master-master replication using PostgreSQL and the pglogical extension

Hello everyone, glad to join the Nextcloud community! My name is Simon and I am running a very experimental Nextcloud setup: multiple servers in a cluster with logical master-master replication using PostgreSQL and the pglogical extension. Good news: the beta installation seems more alive than not. Bad news: I have some difficulties that might be resolved with the power of the community (or not). In the end, my questions may really be about yet another way to set up Nextcloud mirroring.

Background (you can skip this part if you want).

I work at a small network provider, so we have 100+ data centers, low-latency channels and other bla-bla stuff, with the opportunity to deploy Nextcloud servers as close to end users (my colleagues in the company) as possible and to organize fast communication between the servers in the cluster. I understand that for some admins such prerequisites are already an obstacle, but I believe the project could be adapted to other infrastructure designs.

My friendship with Nextcloud started around 2 months ago (huge respect to the developers, btw). I am not a guru engineer, but I do whatever I can and, unfortunately, I don't have engineer friends or colleagues I can consult, so that's why I am here.

My goal
I want (and, as I said above, it almost works like this) a fully functional Nextcloud cluster in 6+ countries with a full copy of the database and data in each. I want to manage one Nextcloud instance and sync changes to the others. I want to upload data from any source to the nearest Nextcloud server and sync the same data to the others. I want to test something (a new app or version, for example) on one dev Nextcloud instance and apply stable changes to the others, and so on.

Some info about the current beta installation
Infra:

  1. 6 (for example) virtual machines with private and public networks, running a custom Linux adapted to company standards.
  2. Docker: Nextcloud, Nginx, Redis and omgwtfssl (just for the test period) on each instance.
  3. PostgreSQL 16 with the pglogical extension on localhost (not in Docker yet).
  4. The lsyncd service for data sync (I guess it could be any other solution, such as Syncthing).

The current pipeline in GitLab is rather large and not the subject of this post, but in general the deploy works like this. At the first stage I set up a dev Nextcloud instance like a normal, well-known Docker installation, but connected to PostgreSQL on localhost. I configure users, settings, design and so on. Then I take an SQL dump and sync it, together with the Nextcloud data folder, to some temporary storage. At the second stage I deploy the Nextcloud cluster all over the world: I rsync the data folder and restore the database from the dump. At this stage I also configure the pglogical subscriptions so that transactions are synced and conflicts are resolved. I also fix the sequences in the database and give each server its own pool of integer and bigint values to avoid conflicts. In the end I have 6+ working copies of Nextcloud across the world, and any change on any of them is applied to the others within 1-10 seconds.
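The sequence-pool step can be sketched roughly as follows. This is only an illustration in Python of one way to split a sequence's range between nodes, not the actual pipeline code: the function names, the `reserved` parameter and the sequence name `public.example_id_seq` are all hypothetical. The generated statements use the standard PostgreSQL `ALTER SEQUENCE ... MINVALUE ... MAXVALUE ... RESTART WITH ...` form.

```python
INT32_MAX = 2**31 - 1   # upper bound of a PostgreSQL "integer" sequence
INT64_MAX = 2**63 - 1   # upper bound of a PostgreSQL "bigint" sequence


def sequence_pools(node_count, type_max, reserved=0):
    """Split (reserved, type_max] into node_count equal, non-overlapping pools.

    reserved is the highest id already used by the restored dump, so that no
    pool hands out a value that already exists in the seeded data.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    size = (type_max - reserved) // node_count
    if size < 1:
        raise ValueError("not enough values left to give every node a pool")
    return [
        (reserved + i * size + 1, reserved + (i + 1) * size)
        for i in range(node_count)
    ]


def alter_sequence_sql(sequence, low, high):
    """Build the statement that pins one sequence to one node's pool."""
    return (
        f"ALTER SEQUENCE {sequence} "
        f"MINVALUE {low} MAXVALUE {high} RESTART WITH {low};"
    )


# Hypothetical example: 6 nodes sharing an integer-typed sequence.
pools = sequence_pools(6, INT32_MAX)
for node, (low, high) in enumerate(pools, start=1):
    print(f"-- node {node}")
    print(alter_sequence_sql("public.example_id_seq", low, high))
```

Passing `INT32_MAX` for integer sequences and `INT64_MAX` for bigint ones keeps every pool inside the range of the sequence's own data type, which matters for apps whose id columns are plain integers.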

Current problems =) The most interesting part!
I have no doubt that there will be more, but I want to deal with each one in order. At the moment (not counting already open bugs, as far as I understand), I see the following types of errors in the logs.

  1. spreed | RuntimeException | There can only be one Talk backend
    Error during app service registration: There can only be one Talk backend
    I saw a post somewhere on this forum about an open bug related to this.

  2. no app in context | NotFoundException | File with id “67” has not been found.
    Exception thrown: OCP\Files\NotFoundException
    It seems to appear only once, right after installation. For now I ignore it, but it would be cool to understand how to fix it.

  3. core | Value type is set to zero (0) in database.
    This is fine only during the upgrade process from 28 to 29.
    I ran some tests that might have broken something; the error started only yesterday, and again I see it only once, right after the first deploy.

  4. core | AppConfigUnknownKeyException unknown config key
    Error while running background job OCA\ServerInfo\Jobs\UpdateStorageStats (id: 31, arguments: null)
    Something I haven't investigated yet, but I saw an open issue about it on GitHub.

  5. The “Notes” app was replaced with “Quick notes”: after I fix the sequences to use a custom pool, even one that stays inside the integer min/max range (up to 2147483647), Notes always throws an error saying the value is outside the integer range. Maybe a bug?
    This is part of the sequence's row from the database:
    | integer | 1 | 1 | 2147483647 | 1 | f | 1 |
    I do like Notes, and it would be great if it worked like any other app in Nextcloud (only Notes has this issue).

  6. The most annoying error, which I want to fix most of all.
    PHP | file_get_contents(/var/www/html/data/appdata_ocju64rhtz9y/preview/c/5/1/c/e/4/1/13/64-43.jpg): Failed to open stream: No such file or directory at /var/www/html/lib/private/Files/Storage/Local.php#331

Regarding number 6, I have to explain. As far as I understand, the appdata_ cache is something very important for Nextcloud itself and for users. Without it Nextcloud won't work(?). In addition to the files themselves, Nextcloud stores information about this cache in the database, in public.oc_filecache. When I use one Nextcloud server in one country, my user generates appdata and filecache entries. Once everything is synced to the other Nextcloud servers, I see an error there. As I understand it, this is because those servers generate similar appdata (with the same paths) and public.oc_filecache rows. A good example: when I open photos, Nextcloud generates previews for them; when I try to open the same photos on another Nextcloud server, a "file not found" error appears in the log, even though the file exists both in the database and on the volume (because it was synced), and the previews don't work. On Linux, stat {some file} shows exactly the same info, and the databases show exactly the same info with the same IDs and so on.
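One way to narrow this down might be to compare the paths recorded in oc_filecache with what a process on each node can actually read. The sketch below is an assumption-heavy illustration in Python: it presumes the relative paths have already been exported from the database into a list, and the function name and the demo paths are hypothetical. To be meaningful it would need to run as the same user as PHP inside the Nextcloud container.

```python
import os
import tempfile


def unreadable_cache_paths(data_dir, cache_paths):
    """Return the cache paths that are not readable regular files under data_dir."""
    missing = []
    for rel_path in cache_paths:
        full_path = os.path.join(data_dir, rel_path)
        # A path counts as present only if it is a regular file AND readable
        # by the user running this check.
        if not (os.path.isfile(full_path) and os.access(full_path, os.R_OK)):
            missing.append(rel_path)
    return missing


# Demo with a throwaway directory standing in for the Nextcloud data folder.
with tempfile.TemporaryDirectory() as data_dir:
    preview = os.path.join("appdata_example", "preview", "64-43.jpg")
    os.makedirs(os.path.join(data_dir, os.path.dirname(preview)))
    with open(os.path.join(data_dir, preview), "wb") as handle:
        handle.write(b"fake image bytes")
    result = unreadable_cache_paths(
        data_dir, [preview, "appdata_example/preview/missing.jpg"]
    )

print(result)
```

Running the same check on two nodes right after a preview is generated would show whether the database row arrives before the file does, or whether the file is present but not readable from inside the container.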

So I am not sure how to deal with it. I tried excluding appdata from rsync and configuring pglogical so that the local transaction wins in a conflict, but that crashes the build with an internal server error. I am thinking about some row filtering to exclude transactions with an appdata_* path, but my knowledge here is rather limited and I haven't managed to work out how to do this yet (and I don't know whether it will help).

Anyway, as far as I can see, everything works fine even with the errors in the log: settings, users and data appear in every part of the cluster and have worked well without real errors so far.

So, very sorry for so many words, but maybe someone can give me some insight into how to deal with my errors? Thanks in advance! :slightly_smiling_face:

As I understand, I have to attach the system report (by design, all servers have the same report).
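For reference, pglogical's `replication_set_add_table` function accepts a `row_filter` argument, so the row-filtering idea could look roughly like the sketch below. It only builds the SQL text in Python. The helper names are hypothetical, the replication set name `default` and the filter expression are assumptions, and whether leaving appdata rows out of replication is safe for Nextcloud is untested here.

```python
def sql_literal(value):
    """Quote a string as a SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def add_table_with_filter_sql(set_name, table, row_filter):
    """Build a call that adds a table to a replication set with a row filter."""
    return (
        "SELECT pglogical.replication_set_add_table("
        f"set_name := {sql_literal(set_name)}, "
        f"relation := {sql_literal(table)}, "
        "synchronize_data := false, "
        f"row_filter := {sql_literal(row_filter)});"
    )


# Assumed filter: replicate only rows whose path does not start with "appdata_".
APPDATA_FILTER = "left(path, 8) <> 'appdata_'"

statement = add_table_with_filter_sql(
    "default", "public.oc_filecache", APPDATA_FILTER
)
print(statement)
```

If the table is already a member of the set, it would presumably have to be removed first (pglogical provides `replication_set_remove_table` for that) and then re-added with the filter, on every node that acts as a provider.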

## Server configuration detail

**Operating system:** Linux 4.18.0-553.8.1.el8_10.x86_64 #1 SMP Tue Jul 2 07:26:33 EDT 2024 x86_64

**Webserver:** nginx/1.27.0 (fpm-fcgi)

**Database:** pgsql PostgreSQL 16.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22), 64-bit

**PHP version:** 8.2.21

Modules loaded: Core, date, libxml, openssl, pcre, sqlite3, zlib, ctype, curl, dom, fileinfo, filter, hash, iconv, json, mbstring, SPL, session, PDO, pdo_sqlite, standard, posix, random, readline, Reflection, Phar, SimpleXML, tokenizer, xml, xmlreader, xmlwriter, mysqlnd, cgi-fcgi, apcu, bcmath, exif, ftp, gd, gmp, imagick, intl, ldap, memcached, pcntl, pdo_mysql, pdo_pgsql, redis, sodium, sysvsem, zip, Zend OPcache

**Nextcloud version:** 29.0.3 - 29.0.3.4

**Updated from an older Nextcloud/ownCloud or fresh install:** 

**Where did you install Nextcloud from:** unknown

<details><summary>Signing status</summary>

[]
</details>

<details><summary>List of activated apps</summary>

Enabled:

  • activity: 2.21.1
  • calendar: 4.7.10
  • circles: 29.0.0-dev
  • cloud_federation_api: 1.12.0
  • comments: 1.19.0
  • contacts: 6.0.0
  • contactsinteraction: 1.10.0
  • dashboard: 7.9.0
  • dav: 1.30.1
  • deck: 1.13.1
  • drawio: 3.0.2
  • federatedfilesharing: 1.19.0
  • federation: 1.19.0
  • files: 2.1.0
  • files_downloadlimit: 2.0.0
  • files_pdfviewer: 2.10.0
  • files_reminders: 1.2.0
  • files_sharing: 1.21.0
  • files_trashbin: 1.19.0
  • files_versions: 1.22.0
  • firstrunwizard: 2.18.0
  • forms: 4.2.4
  • groupfolders: 17.0.1
  • logreader: 2.14.0
  • lookup_server_connector: 1.17.0
  • maps: 1.4.0
  • nextcloud_announcements: 1.18.0
  • notifications: 2.17.0
  • oauth2: 1.17.0
  • password_policy: 1.19.0
  • photos: 2.5.0
  • privacy: 1.13.0
  • provisioning_api: 1.19.0
  • quicknotes: 0.8.23
  • recommendations: 2.1.0
  • related_resources: 1.4.0
  • serverinfo: 1.19.0
  • settings: 1.12.0
  • sharebymail: 1.19.0
  • spreed: 19.0.4
  • support: 1.12.0
  • survey_client: 1.17.0
  • systemtags: 1.19.0
  • tasks: 0.16.0
  • text: 3.10.1
  • theming: 2.4.0
  • twofactor_backupcodes: 1.18.0
  • twofactor_totp: 11.0.0-dev
  • updatenotification: 1.19.1
  • user_status: 1.9.0
  • viewer: 2.3.0
  • weather_status: 1.9.0
  • workflowengine: 2.11.0

Disabled:

  • admin_audit
  • bruteforcesettings
  • encryption
  • files_external
  • notes
  • previewgenerator
  • suspicious_login
  • user_ldap
</details>

<details><summary>Configuration (config/config.php)</summary>

{
"overwriteprotocol": "https",
"memcache.local": "\OC\Memcache\APCu",
"apps_paths": [
{
"path": "/var/www/html/apps",
"url": "/apps",
"writable": false
},
{
"path": "/var/www/html/custom_apps",
"url": "/custom_apps",
"writable": true
}
],
"memcache.distributed": "\OC\Memcache\Redis",
"memcache.locking": "\OC\Memcache\Redis",
"redis": {
"host": "REMOVED SENSITIVE VALUE",
"password": "REMOVED SENSITIVE VALUE",
"port": 6379
},
"trusted_proxies": "REMOVED SENSITIVE VALUE",
"mail_smtpmode": "smtp",
"mail_smtphost": "REMOVED SENSITIVE VALUE",
"mail_smtpport": "465",
"mail_smtpsecure": "ssl",
"mail_smtpauth": true,
"mail_smtpauthtype": "LOGIN",
"mail_smtpname": "REMOVED SENSITIVE VALUE",
"mail_from_address": "REMOVED SENSITIVE VALUE",
"mail_domain": "REMOVED SENSITIVE VALUE",
"mail_smtppassword": "REMOVED SENSITIVE VALUE",
"upgrade.disable-web": true,
"passwordsalt": "REMOVED SENSITIVE VALUE",
"secret": "REMOVED SENSITIVE VALUE",
"trusted_domains": [
"localhost",
"127.0.0.1",
"here private domain",
"here public domain",
"and many other"
],
"datadirectory": "REMOVED SENSITIVE VALUE",
"dbtype": "pgsql",
"version": "29.0.3.4",
"overwrite.cli.url": "https://nc.some.net",
"dbname": "REMOVED SENSITIVE VALUE",
"dbhost": "REMOVED SENSITIVE VALUE",
"dbport": "",
"dbtableprefix": "oc_",
"dbuser": "REMOVED SENSITIVE VALUE",
"dbpassword": "REMOVED SENSITIVE VALUE",
"installed": true,
"instanceid": "REMOVED SENSITIVE VALUE",
"allow_local_remote_servers": true,
"sharing.federation.allowSelfSignedCertificates": true,
"default_phone_region": "NL",
"maintenance_window_start": 100,
"enable_previews": "false",
"maintenance": false
}

</details>

**Cron Configuration:** Array
(
    [backgroundjobs_mode] => cron
    [lastcron] => 1720674005
)


**External storages:** files_external is disabled

**Encryption:** no

**User-backends:** 
 * OC\User\Database


**Talk configuration:** 

STUN servers
 * no custom server configured

TURN servers
 * no custom server configured

Signaling servers (mode: default):
 * SIP dialin is disabled
 * SIP dialout is disabled
 * no custom server configured

Recording servers:
 * Recording is enabled
 * Recording consent is set to "default"
 * no recording server configured


**Browser:** Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36

Hi @simonz.say, welcome to the community :handshake:

Thank you for sharing such an interesting design and demanding topic… but I want to warn you that this forum is intended for SoHo users, and the majority of the user base has no clue what you are talking about.

There are huge installations like MagentaCloud from Deutsche Telekom and Bundescloud… I think this was the trigger to create Nextcloud Global Scale and how it works. I think they can easily help you resolve the issues you hit.

Without going deeper into your design and problems, I would recommend that you get in touch with the Nextcloud company and negotiate professional consulting. Maybe you can even become a partner, with advantages on both sides: you get better visibility through the Nextcloud partner program, and Nextcloud supports you in building a distributed infrastructure the right way.
