Talk issues - Freezing video/connection issues

mvv_vmd · September 6, 2022, 7:44am

Nextcloud version (eg, 24.0.1): 24.0.3
Talk Server version (eg, 14.0.2): 14.0.3
Custom Signaling server configured: yes - 33dc5a554b496bd253bedbecc6755de514a64867
Custom TURN server configured: yes - coturn 4.5.1.1
Custom STUN server configured: yes - coturn 4.5.1.1

In case the web version of Nextcloud Talk is involved:
Operating system (eg, Windows/Ubuntu/…): Ubuntu 20.04
Browser name and version (eg, Chrome v101): Chromium 103/Firefox 102

The issue you are facing:

When in a call, a participant's video can freeze randomly; this happens more in calls with more than 4 participants. They can still speak but cannot send video unless they rejoin. They might also just drop out of grid view, or be unable to enter the room at all.

Is this the first time you’ve seen this error? (Y/N): Has gotten worse the past ± 10 days, I cannot correlate this with any specific change.

Steps to replicate it:

Start a call with more than 4 people
Observe that some participants' video freezes

There is nothing to be found in the logs, and I have not been able to replicate the issue myself; I just see it happen in meetings. I suspect these are users on poor network connections, however they use other video call apps without issue.

One thing I have noticed: our TURN server is no longer using as much data. Talk or the clients seem to decide never to send the feeds through it any more. I have run all the tests on it, and I can force its use with the relay command in the browser to test the server connection from within Talk. Then I see the expected data usage. We monitor the historic use in Zabbix, which is how I can see that data usage has dropped off since June.

We run Coturn on port 443 without SSL, and we have the struktur AG / Spreed HPB.

listening-port=443
fingerprint
use-auth-secret
static-auth-secret=oursecrethere
realm=turn.example.com
total-quota=0
bps-capacity=0
stale-nonce
no-multicast-peers
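
With use-auth-secret set as in the config above, coturn does not check fixed user/password pairs but time-limited credentials derived from the shared secret. A minimal Python sketch of that derivation, assuming the usual TURN REST API scheme (the username carries an expiry timestamp, the password is the base64 of an HMAC-SHA1 over the username). The function name, the "test" user id and the one-hour lifetime are made up for illustration and are not taken from Talk:

```python
import base64
import hashlib
import hmac
import time


def turn_rest_credentials(secret, user_id="test", ttl=3600, now=None):
    """Derive time-limited TURN credentials from a shared secret.

    Assumption: the TURN REST API scheme that coturn applies when
    use-auth-secret is enabled:
      username = "<expiry unix time>:<user id>"
      password = base64(HMAC-SHA1(secret, username))
    """
    issued = int(time.time()) if now is None else int(now)
    username = f"{issued + ttl}:{user_id}"
    digest = hmac.new(secret.encode(), username.encode(), hashlib.sha1).digest()
    return username, base64.b64encode(digest).decode()


if __name__ == "__main__":
    # "oursecrethere" is the placeholder secret from the config above.
    user, password = turn_rest_credentials("oursecrethere")
    print(user, password)
```

The resulting pair can be entered by hand into a TURN test page to check the server independently of Talk. If such a test only works with credentials generated this way, the secret and the clock on the TURN host are probably fine.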

Considering we are doing all the right things (?), I am at a loss and hope someone here has some insight into this. Thanks in advance.

Edit: I have also disabled IPv6 on the TURN server, as it could be an issue(?). We have no AAAA record though.

ralfi · September 6, 2022, 9:48am

Do you really need coturn AND the high performance server? Are you behind a restrictive proxy / firewall?

mvv_vmd · September 6, 2022, 9:55am

Yes, we talked with the HPB supplier about this. I was expecting them to do TURN/STUN as well, but they don’t. These are separate required services, and many of our users are behind restrictive proxies / firewalls.

wwe · September 6, 2022, 8:00pm

both components address different topics:

TURN server provides a relay in case there is no direct connectivity between endpoints
HPB performs “selective forwarding” of the signal: only one media stream is sent and received by each client (as opposed to the default “mesh” sending, which obviously doesn’t scale well)

https://www.spreed.eu/de/contact-nextcloud-talk-high-performance-backend/

for this reason it’s not wrong to use them both (TURN is almost always required) - OP’s statement that the issue happens more in calls with 4+ participants points in the direction that HPB might be useful…

In general, WebRTC works in a way that clients use peer-to-peer connections whenever possible. For this reason, if multiple clients can reach each other directly, it’s expected that they communicate directly rather than through TURN.

Are you aware of what happened back then? Any infrastructure change, or maybe changed user habits (e.g. more office days after a home-office phase)…

Do you have the HPB accessible from the internet? This might be the reason why clients don’t need/use the TURN server.

@mvv_vmd your issue is definitely hard to troubleshoot. If my assumptions about your architecture are right, I would expect the bottleneck to be on the HPB system - either a compute or a network shortage… but you never know for sure without really hard troubleshooting on every end (e.g. with browser console, fiddler and maybe HPB debug logs) and isolating the root cause - whether the client stops sending data, the server stops receiving the stream, or the server stops re-sending the stream to other participants…
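
To narrow down which side stops, one option is to compare two stats snapshots taken a few seconds apart and see whose frame counter has stopped advancing. A rough Python sketch, assuming each snapshot has been reduced by hand to a dict of stream id to stats fields; that snapshot layout and the function name are assumptions for illustration, while framesDecoded and framesEncoded are field names from the standard WebRTC stats:

```python
def stalled_video_streams(earlier, later, counter="framesDecoded"):
    """Return ids of video streams whose frame counter did not advance.

    Assumption: each snapshot maps a stream id to a dict of stats fields,
    a simplified stand-in for inbound-rtp / outbound-rtp stats entries.
    """
    stalled = []
    for stream_id, new in later.items():
        old = earlier.get(stream_id)
        # Skip streams that are new in the later snapshot and non-video streams.
        if old is None or new.get("kind") != "video":
            continue
        if new.get(counter, 0) <= old.get(counter, 0):
            stalled.append(stream_id)
    return stalled


if __name__ == "__main__":
    # Hypothetical receiver-side values, a few seconds apart.
    t0 = {"in-1": {"kind": "video", "framesDecoded": 900},
          "in-2": {"kind": "video", "framesDecoded": 450}}
    t1 = {"in-1": {"kind": "video", "framesDecoded": 1200},
          "in-2": {"kind": "video", "framesDecoded": 450}}
    print(stalled_video_streams(t0, t1))
```

Running the same check with counter="framesEncoded" on the frozen participant's outgoing stream helps separate "the client stopped sending" from "the stream got lost further along".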

mvv_vmd · September 7, 2022, 7:23am

Thanks. Other than it being summer, we have no explanation for the network traffic on our TURN server getting so low. I used to set up a test call between my mobile carrier network and the office network with my phone and laptop, and it would go through the TURN server. This does not happen anymore. The call does work, so it is hard to tell if this is a good or a bad thing.
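
One way to check which paths a test call even has available is to count the ICE candidate types the browser gathered; "relay" candidates are the ones allocated on the TURN server. A small Python sketch that classifies candidate lines copied from the SDP or from chrome://webrtc-internals. The function names are made up, and the sample lines use documentation IP ranges rather than real data:

```python
from collections import Counter


def candidate_type(line):
    """Return the ICE candidate type (host, srflx, prflx, relay) or None."""
    text = line.strip()
    if text.startswith("a="):  # SDP attribute form: a=candidate:...
        text = text[2:]
    fields = text.split()
    if not fields or not fields[0].startswith("candidate:"):
        return None
    try:
        # The type is the token that follows the literal "typ" keyword.
        return fields[fields.index("typ") + 1]
    except (ValueError, IndexError):
        return None


def summarize(lines):
    """Count candidates per type, skipping lines that are not candidates."""
    return Counter(t for t in map(candidate_type, lines) if t is not None)


if __name__ == "__main__":
    # Made-up candidates for illustration only.
    sample = [
        "candidate:1 1 udp 2122260223 192.0.2.10 54321 typ host",
        "candidate:2 1 udp 1686052607 198.51.100.7 54321 typ srflx raddr 192.0.2.10 rport 54321",
        "candidate:3 1 udp 41885439 203.0.113.5 49200 typ relay raddr 198.51.100.7 rport 54321",
    ]
    print(dict(summarize(sample)))
```

If relay candidates are gathered but the call still runs over a host or srflx pair, the TURN server is reachable and simply not needed for that call. If no relay candidates show up at all, the TURN setup itself is the first thing to look at.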

I’ll check with the HPB host if they see any issue on their end.

I found I do have this issue:

github.com/coturn/coturn

coturn stun configuration works in firefox but fails in chrome

opened 01:42PM - 02 Sep 22 UTC

RavikumarTulugu

question

I am testing the webrtc sample application [https://webrtc.github.io/samples/s…rc/content/peerconnection/trickle-ice/](url) The application works fine on firefox with out any issue but on chrome it throws following icecandidate error. Any special setting needs to be enabled to work with chrome ? url: stun:www.xyz.com:3478 address: 0.0.0.x port: 59201 host_candidate: 0.0.0.x:59201 error_text: STUN binding request timed out. error_code: 701 chrome error logs : Note: errors from onicecandidateerror above are not neccessarily fatal. For example an IPv6 DNS lookup may fail but relay candidates can still be gathered via IPv4. The server stun:www.playme.one:3478 returned an error with code=701: STUN binding request timed out. The server stun:www.playme.one:3478 returned an error with code=701: STUN binding request timed out. Please advise. Thanks & regards,

Switched to stun.nextcloud.com:443 which tests good. Could help, again hard to tell.
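
For anyone wanting to test a STUN server outside the browser, the 701 timeout above can be reproduced with a bare Binding request. A minimal Python sketch following the STUN message layout from RFC 5389 (20-byte header with magic cookie, XOR-MAPPED-ADDRESS in the reply). It only covers UDP over IPv4, and all function names are made up for this example:

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return (packet, transaction_id) for a Binding request without attributes."""
    tid = transaction_id if transaction_id is not None else os.urandom(12)
    if len(tid) != 12:
        raise ValueError("transaction id must be 12 bytes")
    # Header: message type, message length (0, no attributes), magic cookie.
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + tid, tid


def parse_xor_mapped_address(response, transaction_id):
    """Extract (ip, port) from an IPv4 XOR-MAPPED-ADDRESS, or None if absent."""
    if len(response) < 20:
        return None
    msg_type, msg_len, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    if response[8:20] != transaction_id:
        return None
    pos, end = 20, min(len(response), 20 + msg_len)
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[pos:pos + 4])
        value = response[pos + 4:pos + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            # Port is XORed with the top 16 bits of the cookie, address with all 32.
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            addr = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", addr)), port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


def stun_probe(host, port, timeout=3.0):
    """Send one Binding request over UDP; return the mapped address or None."""
    packet, tid = build_binding_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (host, port))
            data, _ = sock.recvfrom(2048)
        except OSError:  # includes timeouts and name resolution failures
            return None
    return parse_xor_mapped_address(data, tid)
```

Calling stun_probe with the host and port of the server under test returns the public address the server sees, or None. A None result can also mean UDP is filtered somewhere on the path, so it is a hint rather than proof that the server is misconfigured.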

ralfi · September 7, 2022, 7:44am

Do you have a dump from chrome://webrtc-internals/ for diagnostics, or errors from the browser dev console?

For a really well-working TURN server you should have a lot of dedicated bandwidth. I remember my tests with my own coturn vServer at Hetzner: with more than three or four concurrent users I had similar phenomena. But that does not explain why it worked before if there were no hardware changes.

ralfi · September 7, 2022, 7:45am

And of course, do you have answers to this question from your network and/or firewall admins?

mvv_vmd · September 7, 2022, 10:21am

I’ve been troubleshooting using chrome://webrtc-internals/, I did not know about that one. It is all looking good. I have configured IPv6 on the TURN server now, as I noticed some errors concerning that during tests. I think I need to try and get a call to break before I can do more. There is a chance I have fixed this issue by now though. I’ll update once I find out more.

I am the network/firewall admin, so that is all good. We have plenty of bandwidth, gigabit on all WAN and LAN links, lots to spare.

mvv_vmd · November 23, 2022, 10:02am

Kind of hit a wall with this one: issues with video and/or screen shares freezing are still common. Most of the time it affects one participant whose network connection is slightly bad. Jitsi does work for these people.

I would love to hear if someone else has these issues, or had them but found a resolution.

viperwolf89 · February 28, 2024, 11:27am

Hi, this bug is still present in nextcloud version 28.0.2 and Talk version 18.0.3

Personally I am using coturn 4.5.2 with an SSL/TLS certificate

The calls work very well even connecting via a company proxy.

When you share your screen for the first time everyone sees the sharing correctly…
however, as soon as sharing is disabled, some people continue to see the last frame transmitted as if sharing were frozen.
If the other person starts a new presentation the old frame is not overwritten so the person with the bug cannot see the new share.

The only way is to exit and re-enter the call

If there is something I can do to resolve it can you tell me what?

Thank you

Edit: I tried manually installing Coturn-4.6.2 ‘Gorst’ and the bug is still there.
It seems that some clients can’t receive the “End share”; sometimes the screen shows a transparent window instead of the default user image (at the end of sharing).