Talk issues - Freezing video/connection issues

Nextcloud version (eg, 24.0.1): 24.0.3
Talk Server version (eg, 14.0.2): 14.0.3
Custom Signaling server configured: yes - 33dc5a554b496bd253bedbecc6755de514a64867
Custom TURN server configured: yes - coturn 4.5.1.1
Custom STUN server configured: yes - coturn 4.5.1.1

In case the web version of Nextcloud Talk is involved:
Operating system (eg, Windows/Ubuntu/…): Ubuntu 20.04
Browser name and version (eg, Chrome v101): Chromium 103/Firefox 102

The issue you are facing:

During a call, a participant's video can freeze randomly; this happens more often in calls with more than 4 participants. The affected participants can still speak but cannot send video unless they rejoin. Sometimes they only drop out of grid view, or cannot enter the room at all.

Is this the first time you’ve seen this error? (Y/N): Has gotten worse the past ± 10 days, I cannot correlate this with any specific change.

Steps to replicate it:

  1. Start a call with more than 4 people
  2. Observe that some participants' video freezes

There is nothing to be found in the logs, and I have not been able to replicate the issue myself; I just see it happen in meetings. I suspect these are users on poor network connections, however they use other video call apps without issue.

One thing I have noticed: our TURN server is no longer using as much data. Talk or the clients seem to have decided never to send the feeds through it any more. I have run all the tests on it, and I can force its use with the relay command in the browser to test the server connection from within Talk. Then I see the expected data usage. We monitor the historic use in Zabbix, which is how I can see this data usage drop off since June.
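Whether a connection is relayed shows up in the type of the ICE candidates involved: `host` and `srflx` candidates are direct paths, `relay` candidates go through TURN. A minimal Python sketch for counting candidate types from candidate lines copied out of the browser (for example from chrome://webrtc-internals/) could look like this. The function names and the sample lines are assumptions for illustration only, not output from this setup.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Matches the "typ <type>" token of an ICE candidate line.
_TYP = re.compile(r"\btyp\s+(host|srflx|prflx|relay)\b")


def candidate_type(line: str) -> Optional[str]:
    """Return the ICE candidate type of one candidate line, or None."""
    match = _TYP.search(line)
    return match.group(1) if match else None


def count_types(lines: Iterable[str]) -> Counter:
    """Count how many candidates of each type appear in the given lines."""
    return Counter(t for t in map(candidate_type, lines) if t is not None)


if __name__ == "__main__":
    # Hypothetical sample lines using documentation IP ranges.
    sample = [
        "candidate:1 1 udp 2122260223 192.0.2.10 54321 typ host",
        "candidate:2 1 udp 1686052607 198.51.100.7 40000 typ srflx raddr 192.0.2.10 rport 54321",
        "candidate:3 1 udp 41885439 203.0.113.5 50000 typ relay raddr 198.51.100.7 rport 40000",
    ]
    print(count_types(sample))
```

If the selected candidate pair never contains a `relay` candidate, low traffic on the TURN server is what one would expect to see.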

We run Coturn on port 443 without SSL, and we have the struktur AG / Spreed HPB.

listening-port=443
fingerprint
use-auth-secret
static-auth-secret=oursecrethere
realm=turn.example.com
total-quota=0
bps-capacity=0
stale-nonce
no-multicast-peers
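
With `use-auth-secret` and `static-auth-secret`, coturn expects time-limited credentials derived from the shared secret rather than a fixed user/password pair: the username is an expiry timestamp (optionally followed by `:` and a label) and the password is the base64-encoded HMAC-SHA1 of that username. For anyone who wants to test such a server by hand, a minimal Python sketch of the derivation could look like this. The function name `turn_credentials`, the one-hour default lifetime and the `user` label are assumptions for illustration; `oursecrethere` is the placeholder from the config above.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional, Tuple


def turn_credentials(secret: str, ttl: int = 3600, user: str = "",
                     now: Optional[float] = None) -> Tuple[str, str]:
    """Derive temporary TURN credentials from a shared secret.

    username = "<expiry unix timestamp>" or "<expiry>:<user>"
    password = base64(HMAC-SHA1(secret, username))
    """
    current = time.time() if now is None else now
    expiry = int(current + ttl)
    username = f"{expiry}:{user}" if user else str(expiry)
    digest = hmac.new(secret.encode(), username.encode(), hashlib.sha1).digest()
    return username, base64.b64encode(digest).decode()


if __name__ == "__main__":
    # Placeholder secret taken from the config excerpt above.
    name, password = turn_credentials("oursecrethere")
    print(name, password)
```

The resulting pair can be entered into a TURN test tool together with the server address; it stops working once the timestamp in the username has passed.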

Considering we are doing all the right things (?), I am at a loss and hope someone here has some insight into this. Thanks in advance.

Edit: I have also disabled IPv6 on the TURN server, as it could be an issue(?). We have no AAAA record though.

Do you really need coturn AND the high performance server? Are you behind a restrictive proxy / firewall?

Yes, we talked with the HPB supplier about this. I was expecting them to provide TURN/STUN as well, but they don’t. These are separate required services, and many of our users are behind restrictive proxies / firewalls.

both components address different topics:

  • TURN server provides relay in case there is no direct connectivity between endpoints
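  • HPB performs “selective forwarding” of the media: each client sends only one media stream, to the server, which forwards it to the other participants (as opposed to the default “mesh” mode, where every client sends its stream to every other client, which obviously doesn’t scale well)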

for this reason it’s not wrong to use them both (TURN is almost always required). OP’s statement that the issue happens more often in calls with 4+ participants points in the direction that HPB might be useful…
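
To put rough numbers on the scaling difference, here is a small Python sketch that counts media streams per client in both topologies. It is a simplification under stated assumptions: one combined stream per participant, no simulcast and no screen shares. The function name `streams_per_client` is made up for this example.

```python
from typing import Tuple


def streams_per_client(participants: int, topology: str) -> Tuple[int, int]:
    """Return (upstream, downstream) stream counts for a single client.

    mesh: the client sends to and receives from every other participant.
    sfu:  the client sends once to the server and receives one stream
          per other participant from the server.
    """
    others = max(participants - 1, 0)
    if topology == "mesh":
        return others, others
    if topology == "sfu":
        return (1 if others else 0), others
    raise ValueError(f"unknown topology: {topology!r}")


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, "mesh", streams_per_client(n, "mesh"),
              "sfu", streams_per_client(n, "sfu"))
```

The upstream side is where mesh hurts most, since it grows with every participant while home uplinks are usually the weakest link.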

In general, WebRTC works in such a way that clients use peer-to-peer connections whenever possible. For this reason, if multiple clients can reach each other directly, it’s expected that they communicate directly rather than through TURN.

are you aware of what happened then? any infrastructure change, or maybe changed user habits (e.g. more office days after a home-office phase)…

do you have HPB accessible from the internet? this might be the reason why clients don’t need/use the TURN server.

@mvv_vmd your issue is definitely hard to troubleshoot. If my assumptions about your architecture are right, I would expect the bottleneck to be on the HPB system, either a compute or a network shortage… but you never know for sure without really hard troubleshooting on every end (e.g. with browser console, fiddler and maybe HPB debug logs) and isolating the root cause: whether the client stops sending data, the server stops receiving the stream, or the server stops re-sending the stream to other participants…

Thanks. Other than it being summer, we have no explanation for the network traffic on our TURN server getting so low. I used to set up a test call between my mobile carrier network and the office network with my phone and laptop, and it would go through the TURN server. This does not happen anymore. The call does work, so it is hard to tell if this is a good or a bad thing.

I’ll check with the HPB host if they see any issue on their end.

I found I do have this issue:

Switched to stun.nextcloud.com:443, which tests good. This could help; again, hard to tell.

Do you have a dump from chrome://webrtc-internals/ for diagnostics, or errors from the browser dev console?
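
Once such a dump exists, one way to narrow down where a freeze starts is to look at a cumulative frame counter over time (the WebRTC stats expose `framesEncoded` on the sending side and `framesDecoded` on the receiving side) and find the intervals where it stops increasing. A minimal Python sketch, assuming the samples have already been extracted by hand into `(seconds, counter)` pairs; the function name `find_stalls`, the input format and the 3-second threshold are all assumptions for illustration:

```python
from typing import List, Sequence, Tuple


def find_stalls(samples: Sequence[Tuple[float, int]],
                min_duration: float = 3.0) -> List[Tuple[float, float]]:
    """Return (start, end) intervals where the frame counter did not grow.

    samples: (timestamp in seconds, cumulative frame count), sorted by time.
    Only stalls lasting at least min_duration seconds are reported.
    """
    stalls: List[Tuple[float, float]] = []
    stall_start = None
    for (t_prev, f_prev), (_t_cur, f_cur) in zip(samples, samples[1:]):
        if f_cur <= f_prev:
            # Counter did not advance: a stall begins or continues.
            if stall_start is None:
                stall_start = t_prev
        else:
            # Counter advanced again: close any open stall at t_prev.
            if stall_start is not None and t_prev - stall_start >= min_duration:
                stalls.append((stall_start, t_prev))
            stall_start = None
    # A stall that is still open at the last sample.
    if stall_start is not None and samples[-1][0] - stall_start >= min_duration:
        stalls.append((stall_start, samples[-1][0]))
    return stalls


if __name__ == "__main__":
    # Hypothetical samples: video stalls between second 2 and second 5.
    demo = [(0, 0), (1, 30), (2, 60), (3, 60), (4, 60), (5, 60), (6, 90)]
    print(find_stalls(demo))
```

Running this on the sender's `framesEncoded` and on a receiver's `framesDecoded` for the same call would hint at whether the sending client or something further along the path stopped first.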

For a really well-working TURN server you should have a large dedicated bandwidth. I remember my tests with my own coturn vServer at Hetzner: with more than three or four concurrent users I saw similar phenomena. But that does not explain why it worked before if there were no hardware changes.

And of course, do you have answers to this question from your network and/or firewall admins?

I’ve been troubleshooting using chrome://webrtc-internals/; I did not know about that one. It is all looking good. I have now configured IPv6 on the TURN server, as I noticed some errors concerning that during tests. I think I need to try and get a call to break before I can do more. There is a chance I have fixed this issue by now though. I’ll update once I find out more.

I am the network/firewall admin, so that is all good. We have plenty of bandwidth, gigabit on all WAN and LAN links, lots to spare.

I have kind of hit a wall with this one; issues with video and/or screen shares freezing are still common. Most of the time it affects one participant whose network connection is slightly bad. Jitsi does work for these people.

I would love to hear if someone else has these issues, or had them but found a resolution.