OK, I admit I'm a bit of a cowboy and do try to make things hard for myself at times. Still, if you can prioritise lending a hand over judging and preaching, I have what is on the face of it a rather simple problem, but one which demands one step more expertise or documentation than I have or have found. Which is the point at which I turn to a forum plea or GitHub issue, so here I am.
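Basically it's a common problem, and looks like this:
[Screenshot: the Nextcloud Collabora Online settings page reporting "Could not establish connection to the Collabora Online server"]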
Not rocket science. A simple enough error and one reported over and over. Yet, having followed many guides already and read many reports, I have not nailed this one. Let me explain; you can even try some of it yourself. The server in question looks fine and seems to work fine.
$ systemctl status loolwsd
● loolwsd.service - LibreOffice Online WebSocket Daemon
     Loaded: loaded (/lib/systemd/system/loolwsd.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2020-09-08 20:51:52 AEST; 15min ago
   Main PID: 1487893 (loolwsd)
      Tasks: 7 (limit: 18988)
     Memory: 106.7M
     CGroup: /system.slice/loolwsd.service
             ├─1487893 /usr/bin/loolwsd --version --o:sys_template_path=/opt/lool/systemplate --o:child_root_path=/opt/lool/child-ro>
             ├─1487915 /usr/bin/loolforkit --losubpath=lo --systemplate=/opt/lool/systemplate --lotemplate=/opt/collaboraoff>
             └─1487917 /usr/bin/loolforkit --losubpath=lo --systemplate=/opt/lool/systemplate --lotemplate=/opt/collaboraoff>
Sep 08 20:52:05 nephele loolwsd[1487893]: wsd-1487893-1487918 2020-09-08 10:52:05.572178 [ accept_poll ] DBG StreamSocket ctor #21|>
Sep 08 20:52:05 nephele loolwsd[1487893]: wsd-1487893-1487918 2020-09-08 10:52:05.572261 [ accept_poll ] DBG Accepted socket has fa>
Sep 08 20:52:05 nephele loolwsd[1487893]: wsd-1487893-1487918 2020-09-08 10:52:05.572295 [ accept_poll ] DBG Accepted client #21| n>
Sep 08 20:52:05 nephele loolwsd[1487893]: wsd-1487893-1487918 2020-09-08 10:52:05.572323 [ accept_poll ] DBG Inserting socket #21 i>
Sep 08 20:52:05 nephele loolwsd[1487893]: wsd-1487893-1487918 2020-09-08 10:52:05.572352 [ accept_poll ] DBG #21 Thread affinity se>
Sep 08 20:52:05 nephele loolwsd[1487893]: wsd-1487893-1487919 2020-09-08 10:52:05.572485 [ websrv_poll ] DBG #21 Thread affinity se>
Sep 08 20:52:05 nephele loolwsd[1487893]: wsd-1487893-1487919 2020-09-08 10:52:05.572676 [ websrv_poll ] INF #21: Client HTTP Reque>
Sep 08 20:52:05 nephele loolwsd[1487893]: wsd-1487893-1487919 2020-09-08 10:52:05.572716 [ websrv_poll ] INF Handling request: /loo>
Sep 08 20:52:05 nephele loolwsd[1487893]: wsd-1487893-1487919 2020-09-08 10:52:05.572758 [ websrv_poll ] INF Admin request: /lool/a>
Sep 08 20:52:05 nephele loolwsd[1487893]: wsd-1487893-1487919 2020-09-08 10:52:05.572784 [ websrv_poll ] INF Admin::handleInitialRe>
It responds fine and you can check this yourself:
https://cadmus.thumbs.place/hosting/discovery
https://cadmus.thumbs.place/hosting/capabilities
https://cadmus.thumbs.place/loleaflet/dist/admin/admin.html
Even this:
https://cadmus.thumbs.place/lool/adminws
while hanging in the browser, reports credibly fine activity in the collabora debug log when loaded:
Sep 08 20:52:05 nephele loolwsd[1487893]: wsd-1487893-1487919 2020-09-08 10:52:05.572716 [ websrv_poll ] INF Handling request: /lool/adminws| wsd/LOOLWSD.cpp:2291
Sep 08 20:52:05 nephele loolwsd[1487893]: wsd-1487893-1487919 2020-09-08 10:52:05.572758 [ websrv_poll ] INF Admin request: /lool/adminws| wsd/LOOLWSD.cpp:2330
Sep 08 20:52:05 nephele loolwsd[1487893]: wsd-1487893-1487919 2020-09-08 10:52:05.572784 [ websrv_poll ] INF Admin::handleInitialRequest bad request| wsd/Admin.cpp:376
The bad request, I presume, is because this interface operates as a web socket and wants some more arguments.
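That presumption fits how a WebSocket connection starts. Per RFC 6455 the client sends an ordinary GET that also carries `Upgrade: websocket`, `Connection: Upgrade`, a random `Sec-WebSocket-Key` and `Sec-WebSocket-Version: 13`, and a plain browser page load carries none of those. Here is a minimal Python sketch of such a handshake request. The function names are made up for illustration, and whether loolwsd wants anything beyond these standard headers is an assumption I have not verified.

```python
import base64
import hashlib
import os
from typing import Optional, Tuple

# Fixed GUID that RFC 6455 defines for deriving Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(key: str) -> str:
    """Value a compliant server should return in Sec-WebSocket-Accept."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_request(host: str, path: str,
                          key: Optional[str] = None) -> Tuple[str, str]:
    """Return (raw HTTP request text, key used) for a WebSocket handshake."""
    if key is None:
        # The key is 16 random bytes, base64 encoded.
        key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines), key
```

Sending the returned text over a TLS socket to the proxy and getting `HTTP/1.1 101 Switching Protocols` back, with a `Sec-WebSocket-Accept` equal to `expected_accept(key)`, would suggest the upgrade survives the proxy. Because `Upgrade` and `Connection` are hop-by-hop headers, a reverse proxy does not pass them on unless told to, which is what the `proxy_set_header` lines in the Nginx samples are for.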
Now I know I'm taking some risks here because, wait for it, this is not behind Apache, nor Nginx. In fact it matters not what it's behind, but all the guides provide Apache and Nginx server configs and none that I have found actually explain WTF they do, or said another way, what Collabora needs. So I'm left guessing a little.
For the most part it's a simple reverse proxy of requests to port 9980, but in two cases, often described as the Admin Console websocket (tested above) and the Main websocket, the samples seem to add two headers to the request, like:
Connection: Upgrade
Upgrade: websocket
if I read the Nginx configs right, though the Apache configs seem not to worry about it. And this is kind of shady territory as I'm not really sure what they are for, how necessary they are, and for what. I do my best to emulate that and have tried a few variants, but they are hard to test, as I would need collabora to log the full request with the headers it received in order to confirm what it sees.
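One way around collabora not logging headers would be to point the proxy at a throwaway listener for a moment and let that print what arrives. Below is a minimal sketch using only the Python standard library. The port 9981 and the names `HeaderDump`, `format_request` and `serve` are hypothetical, and it only shows what the proxy forwards, not what collabora would make of it.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def format_request(method, path, headers):
    """Render a request line plus (name, value) header pairs as text."""
    lines = [f"{method} {path}"]
    lines += [f"  {name}: {value}" for name, value in headers]
    return "\n".join(lines)


class HeaderDump(BaseHTTPRequestHandler):
    """Prints every GET request's headers and answers with a short body."""

    def do_GET(self):
        print(format_request(self.command, self.path, self.headers.items()))
        body = b"header dump\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9981):
    """Listen on localhost only; stop with Ctrl-C. Port is an assumption."""
    HTTPServer(("127.0.0.1", port), HeaderDump).serve_forever()
```

Calling `serve()` and temporarily proxying `/lool/adminws` to port 9981 instead of 9980 should show whether the `Upgrade` and `Connection` headers actually make it through the proxy.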
For now though I'm blown away that I can even use the WOPI discovery URL to guess at a URL like:
https://cadmus.thumbs.place/loleaflet/ed4f732/loleaflet.html?Test.ods
And while I'm no pro and clearly have that WOPI URL mangled (I am only guessing at what comes after the ? in a WOPI URL), the stunning thing is that even that URL opens a clear LibreOffice sheet menu system, with one error message about bad WOPI parameters.
Now slowly you're getting my drift. Everything I can test so far works just fine. Still, Nextcloud is unhappy. So the question is, "What does Nextcloud need and how to diagnose that?" To find out I tried this:
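For reference, here is a sketch of how the part after the ? is normally built, as far as I understand WOPI: the discovery XML gives a `urlsrc` per file extension, and the host appends a URL-encoded `WOPISrc` pointing back at its own file endpoint plus an `access_token`. The sample XML below is a cut-down illustration and not the real discovery output, and the Nextcloud path and token in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Illustrative, cut-down discovery document (an assumption, not real output).
SAMPLE_DISCOVERY = """<wopi-discovery>
  <net-zone name="external-http">
    <app name="application/vnd.oasis.opendocument.spreadsheet">
      <action ext="ods" name="edit"
              urlsrc="https://cadmus.thumbs.place/loleaflet/ed4f732/loleaflet.html?"/>
    </app>
  </net-zone>
</wopi-discovery>"""


def urlsrc_for(discovery_xml, ext):
    """Return the urlsrc of the first action matching the extension, or None."""
    root = ET.fromstring(discovery_xml)
    for action in root.iter("action"):
        if action.get("ext") == ext:
            return action.get("urlsrc")
    return None


def editor_url(urlsrc, wopi_src, access_token):
    """Append the URL-encoded WOPISrc and access_token to the urlsrc."""
    return (f"{urlsrc}WOPISrc={quote(wopi_src, safe='')}"
            f"&access_token={quote(access_token, safe='')}")
```

With a made-up file endpoint, `editor_url(urlsrc_for(SAMPLE_DISCOVERY, "ods"), "https://mynextcloud.tld/wopi/files/123", "sometoken")` gives a URL of the expected shape. A bare `?Test.ods` has neither parameter, which would explain the error about bad WOPI parameters.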
journalctl -fu loolwsd
With the log level on collabora set to debug (trace just floods the log with pulses). Anyhow, this watches the log and I can watch it as I do various requests in the browser. In this case I watch it while clicking the Save button on the Collabora Online setup I shared a screenshot of above … thinking that clicking Save does something that returns with that response: Could not establish connection to the Collabora Online server.
Only, clicking Save produces precisely zero response in the log, neither the collabora log above nor the Nextcloud log, suggesting that whatever it's doing does not even get as far as collabora. So I check my web server access and error logs, watching them as I click Save; also no sign of life.
So, we pull out the big guns and sniff the network traffic that clicking Save produces. Turns out it's a POST to:
https://mynextcloud.tld/index.php/apps/richdocuments/ajax/admin.php
albeit a cryptic kind of post, with no data posted, and only two headers that seem to carry any consequence: a Cookie and a requesttoken.
Either way, the POST returns a 500 error and the browser console even graces us with some feedback:
Error: Request failed with status code 500
exports createError.js:16
exports settle.js:17
onreadystatechange xhr.js:61
exports xhr.js:36
exports xhr.js:12
exports dispatchRequest.js:52
promise callback*c.prototype.request Axios.js:61
e Axios.js:86
exports bind.js:9
n AdminSettings.vue:445
c runtime.js:45
_invoke runtime.js:274
t runtime.js:97
j admin.js:450
i admin.js:450
I admin.js:450
I admin.js:450
updateSettings AdminSettings.vue:442
t AdminSettings.vue:431
c runtime.js:45
_invoke runtime.js:274
t runtime.js:97
j admin.js:450
i admin.js:450
I admin.js:450
I admin.js:450
updateServer AdminSettings.vue:428
submit AdminSettings.vue:1
VueJS 3
AdminSettings.vue:437
t AdminSettings.vue:437
c runtime.js:45
_invoke runtime.js:274
t runtime.js:97
j admin.js:450
a admin.js:450
(Async: promise callback)
j admin.js:450
i admin.js:450
I admin.js:450
I admin.js:450
updateServer AdminSettings.vue:428
submit AdminSettings.vue:1
VueJS 3
He
n
_wrapper
I wish there was some greater clue. I kind of wish the message Nextcloud provided was a tad more useful.
As I'm not using Apache or Nginx, my intuition suggests the issue lies in the request preparation and specifically in what these Nginx configs do:
# main websocket
location ~ ^/lool/(.*)/ws$ {
    proxy_pass http://localhost:9980;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "Upgrade";
    proxy_set_header Host $http_host;
    proxy_read_timeout 36000s;
}
# Admin Console websocket
location ^~ /lool/adminws {
    proxy_pass http://localhost:9980;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "Upgrade";
    proxy_set_header Host $http_host;
    proxy_read_timeout 36000s;
}
These are recommended in all the guides, yet none I have found explain what it is they do and, moreover, what it is Collabora and/or Nextcloud need or expect.
I have what seems a fully functional collabora server that Nextcloud is snubbing, and it's not being clear why. Forsooth. Any guidance is appreciated. I'm slowly getting stuck, though I can still debug the PHP /apps/richdocuments/ajax/admin.php that is producing the 500 error. I'm hoping there's a lower effort path ahead with some support than debugging the Nextcloud app … hmmm.
My crystal ball says the PHP is sending a request to the collabora server and not getting the response it desires, but without debugging the PHP I can't see what request it's sending or what expectation it has that is unmet.
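One cheap way to test that theory without touching the PHP is to make the same kind of request from the Nextcloud machine itself, since DNS, firewall or certificate trouble there would not show up in a desktop browser. The sketch below assumes, without my having confirmed it in the richdocuments source, that the server-side check boils down to fetching /hosting/discovery from the configured URL. The function names are made up.

```python
import urllib.error
import urllib.request


def discovery_url(base):
    """Build the discovery endpoint from the configured server URL."""
    return base.rstrip("/") + "/hosting/discovery"


def probe(base, timeout=10):
    """Fetch the discovery endpoint and describe the outcome as text."""
    url = discovery_url(base)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"{url}: HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return f"{url}: HTTP error {exc.code}"
    except urllib.error.URLError as exc:
        # DNS failure, refused connection, TLS verification failure, etc.
        return f"{url}: connection problem: {exc.reason}"
    except OSError as exc:
        # Anything lower level, such as a read timeout.
        return f"{url}: I/O problem: {exc}"
```

Running `print(probe("https://cadmus.thumbs.place"))` on the Nextcloud host, ideally as the web server user, should show whether that machine can reach and trust the collabora server at all. A connection problem here would also fit the silence in the collabora log when clicking Save.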