Occasional very slow requests when using Talk

kgourlay · August 12, 2025, 5:32am

Nextcloud version (eg, 24.0.1): 31.0.7
Talk Server version (eg, 14.0.2): 21.1.2
Custom Signaling server configured: no
Custom TURN server configured: no
Custom STUN server configured: no

In case the web version of Nextcloud Talk is involved:
Operating system (eg, Windows/Ubuntu/…): Rocky Linux 4.18.0-553.47.1.el8_10.x86_64
Browser name and version (eg, Chrome v101): Safari 18.6 (20621.3.11.11.3)

The issue you are facing:

I am occasionally seeing very slow responses from Apache/php-fpm when using the Talk app. This is a relatively new installation, and I’m still learning the best way to configure the server. At first I was getting 503 Service Unavailable errors. Increasing the timeout for connections to php-fpm got rid of that error, but the server logs show that some requests still take a very long time (20 to 30 seconds).

I’m at a loss as to how to debug this. I don’t see any errors in nextcloud.log, and resource usage on the server looks unremarkable: load average and memory usage are normal, and nothing appears to be hitting a limit. The only thing out of the ordinary is the time these requests take to return. Below is a sample of the only messages I’m seeing (courtesy of shield, in my Apache access log):

[Mon Aug 11 19:55:50.873223 2025] [shield:notice] [pid 3125101:tid 3125108] [remote ***:51947] Request round-trip 30220 ms exceeds limit 5000 ms. Scoring 15 URI: /ocs/v2.php/apps/spreed/api/v1/chat/*** (proto: HTTP/2.0)
[Mon Aug 11 19:55:52.653632 2025] [shield:notice] [pid 3125101:tid 3125118] [remote ***:51947] Request round-trip 30297 ms exceeds limit 5000 ms. Scoring 15 URI: /ocs/v2.php/apps/spreed/api/v1/chat/*** (proto: HTTP/2.0)
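
A quick way to see which endpoints these notices point at, and how long the requests take, is to tally them per URI. The following is a minimal sketch in Python. The regular expression is an assumption based only on the two sample lines above, and `summarize` is a hypothetical helper, not part of shield or Nextcloud.

```python
import re
from collections import defaultdict

# Pattern inferred from the two sample notices; other shield messages may differ.
NOTICE = re.compile(
    r"Request round-trip (?P<ms>\d+) ms exceeds limit (?P<limit>\d+) ms\. "
    r"Scoring \d+ URI: (?P<uri>\S+)"
)


def summarize(lines):
    """Return {uri: (count, fastest_ms, slowest_ms)} for matching log lines."""
    by_uri = defaultdict(list)
    for line in lines:
        match = NOTICE.search(line)
        if match:
            by_uri[match.group("uri")].append(int(match.group("ms")))
    return {uri: (len(ms), min(ms), max(ms)) for uri, ms in by_uri.items()}


sample = [
    "[Mon Aug 11 19:55:50.873223 2025] [shield:notice] [pid 3125101:tid 3125108] "
    "[remote ***:51947] Request round-trip 30220 ms exceeds limit 5000 ms. "
    "Scoring 15 URI: /ocs/v2.php/apps/spreed/api/v1/chat/*** (proto: HTTP/2.0)",
    "[Mon Aug 11 19:55:52.653632 2025] [shield:notice] [pid 3125101:tid 3125118] "
    "[remote ***:51947] Request round-trip 30297 ms exceeds limit 5000 ms. "
    "Scoring 15 URI: /ocs/v2.php/apps/spreed/api/v1/chat/*** (proto: HTTP/2.0)",
]

for uri, (count, fastest, slowest) in summarize(sample).items():
    print(f"{uri}: {count} requests, {fastest}-{slowest} ms")
```

If every slow request lands on the same endpoint with nearly the same duration, that suggests a deliberate wait rather than a randomly slow script.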

Is this the first time you’ve seen this error? (Y/N): No

Steps to replicate it:

Log in to the Nextcloud web app or the Nextcloud Talk iPad app.
Navigate to a Talk conversation.

I appreciate any suggestions on where to look for errors or what settings to try changing.

SysKeeper · August 12, 2025, 5:36am

Hey, so do you observe an actual issue while using Nextcloud/Talk, or is your concern only the long requests? Please note that Talk uses long polling to retrieve new messages, so each chat request stays open for up to 30 s while it waits for new messages. That is intended behaviour.
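
To illustrate the idea, here is a minimal sketch of long polling in Python. It is not Talk's implementation: `long_poll` and `timed_poll` are hypothetical names, and an in-memory queue stands in for the chat backend. The request blocks until a message arrives or the window expires, so an idle conversation produces requests that last the full window.

```python
import queue
import threading
import time


def long_poll(messages: queue.Queue, timeout: float) -> list:
    """Block until a message arrives or `timeout` seconds pass."""
    try:
        first = messages.get(timeout=timeout)
    except queue.Empty:
        return []  # nothing new: the request still lasted about `timeout` seconds
    batch = [first]
    # Collect anything else already queued, without waiting any longer.
    while True:
        try:
            batch.append(messages.get_nowait())
        except queue.Empty:
            return batch


def timed_poll(messages: queue.Queue, timeout: float):
    """Run one poll and return (messages, elapsed_seconds)."""
    start = time.monotonic()
    result = long_poll(messages, timeout)
    return result, time.monotonic() - start


if __name__ == "__main__":
    inbox = queue.Queue()
    # Idle conversation: the poll runs for the whole window (1 s here, not 30 s).
    print(timed_poll(inbox, timeout=1.0))
    # A message arriving after 0.2 s ends the poll early.
    threading.Timer(0.2, inbox.put, args=["hello"]).start()
    print(timed_poll(inbox, timeout=1.0))
```

With a 30 s window, a quiet chat would show up in the logs as a steady stream of requests of roughly 30 s each, which is consistent with the round-trip times reported above.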

kgourlay · August 12, 2025, 12:43pm

Apologies, maybe there is no actual issue anymore. I originally noticed this behavior because of occasional 503 Service Unavailable errors in the web app. Often these would break the web app and require a page reload. Within the Talk app, they sometimes manifested as a temporary red error box in the corner indicating something couldn’t be loaded.

From the apache error log:

[Mon Aug 11 15:39:52.654554 2025] [proxy:error] [pid 3061878:tid 3061895] (70007)The timeout specified has expired: AH00941: FCGI: failed to acquire connection for (***:8000)

I assumed that if Apache was timing out waiting for a response, something was going wrong with the script. After I increased the connection timeout in an effort to debug the problem (the <Proxy acquire=n> value), I no longer see 503 errors, but I still took the long script execution time to be a problem. If increasing the timeout was an appropriate fix, then maybe everything is fine now.

I am not an expert on long polling, but I have used scripts that employ long polling in the past without running into this issue. Is it possible that Apache is confusing a long-polling wait with a nonresponsive script? What is an appropriate connection timeout for php-fpm?
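
One possible reading, offered as an assumption rather than a diagnosis: in mod_proxy, `acquire` is the maximum time to wait for a free connection in the backend connection pool, so AH00941 may mean that all connections to php-fpm were busy, not that a script hung. Requests that are held open for a long time occupy a connection for their whole duration. The sketch below estimates how many connections are busy on average; `slots_needed` and all the numbers are hypothetical and not measured on this server.

```python
import math


def slots_needed(long_poll_clients: int, short_rps: float, short_seconds: float) -> int:
    """Rough estimate of backend connections in use at once.

    Assumes each idle long-polling client keeps one request open continuously,
    and that short requests occupy rate x duration connections on average
    (Little's law).
    """
    return long_poll_clients + math.ceil(short_rps * short_seconds)


# Hypothetical load: 10 open conversations plus 20 short requests per second
# that take 0.25 s each.
print(slots_needed(long_poll_clients=10, short_rps=20, short_seconds=0.25))  # 15
```

If an estimate like this comes out near or above the size of the proxy connection pool or the php-fpm worker limit, raising those limits may help more than lengthening the wait for a free connection.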