Forum performance: still a problem?

Takes forever to open a new thread in Firefox, especially when opening links from search. A bit better with Chrome, but I don’t use it.

Any chance you could provide the details from the main start post? We can then try to see if there are patterns with other users who have problems. Thank you! :slight_smile:

Use a few Discourse forums and this one seems fine.

Just to write down some numbers: according to Firefox F12 it takes 2.56s for initial page load of help.nextcloud.com in “private” window with “no cache” ticked.

I don’t see any specific resource taking a very long time… some missing fonts (at the end).

The slowest request is “/” with 633ms total and 543ms of “waiting”.

A quick check of the other requests shows the same pattern: almost every request has a pretty long “waiting” phase, most of them 100-200ms.

Judge for yourself whether this is slow or not. My feeling is that the site is slower than others, but it’s nothing I can’t live with.

Tested from Switzerland (Swisscom VDSL 100/150MBit/s, Firefox, Windows 10, Core i7 11th Gen).

I saw this thread title come up in an email, it’s the email that taunts me as I’d like to click on the threads and read them, but I know it takes too long so I don’t. So hey, I tried it again. 3 minutes to load your page, plus nearly 30 seconds to login.

So no, it’s not great, though I do think it was longer before, maybe 5 minutes. I posted it somewhere previously.

This is from NZ on a gig fibre connection. No other site has ever had this problem in my experience, even going back to the dial-up days.

I should add, I also got stuck in Australia when Covid first hit, and the performance was equally bad there. That shows how long this problem has been sitting around without a solution, as it had literally been years before that when I installed Nextcloud, and I have never known it to be any faster. I truly hope (no, pray) that something in this information leads to a solution, as really this is incredibly embarrassing.

traceroute to help.nextcloud.com (95.217.53.146), 64 hops max, 52 byte packets
 1  gw.mygateway.co.nz (192.168.43.3)  0.576 ms  0.324 ms  0.338 ms
 2  * * *
 3  * * *
 4  122.56.119.216 (122.56.119.216)  14.620 ms  3.373 ms  3.179 ms
 5  xe11-0-6.sjbr3.global-gateway.net.nz (122.56.119.14)  126.541 ms
    xe-7-0-3.sjbr3.global-gateway.net.nz (203.96.120.134)  127.546 ms
    ae10-10.tkbr12.global-gateway.net.nz (202.50.232.29)  3.517 ms
 6  ae0.pabr5.global-gateway.net.nz (203.96.120.74)  130.829 ms
    xe8-0-9.lebr7.global-gateway.net.nz (122.56.127.22)  132.215 ms
    ae0.pabr5.global-gateway.net.nz (203.96.120.74)  130.912 ms
 7  ae3-10.sjbr3.global-gateway.net.nz (122.56.127.25)  130.129 ms
    palo-b1-link.ip.twelve99.net (62.115.145.204)  135.451 ms  135.305 ms
 8  ae0.pabr5.global-gateway.net.nz (203.96.120.74)  134.373 ms
    nyk-bb2-link.ip.twelve99.net (62.115.122.37)  193.000 ms
    ae0.pabr5.global-gateway.net.nz (203.96.120.74)  135.400 ms
 9  palo-b1-link.ip.twelve99.net (62.115.145.204)  143.853 ms  138.291 ms  140.961 ms
10  s-bb2-link.ip.twelve99.net (62.115.139.172)  283.021 ms
    hbg-bb3-link.ip.twelve99.net (80.91.249.11)  290.575 ms  290.062 ms
11  kbn-bb2-link.ip.twelve99.net (80.91.254.90)  283.339 ms
    kbn-bb1-link.ip.twelve99.net (62.115.134.76)  293.153 ms
    kbn-bb2-link.ip.twelve99.net (80.91.254.90)  285.331 ms
12  hetzner-svc076536-ic365572.ip.twelve99-cust.net (62.115.52.255)  282.089 ms  281.488 ms  301.094 ms
13  hls-b3-link.ip.twelve99.net (62.115.122.35)  290.478 ms  291.951 ms  290.785 ms
14  kbn-b1-link.ip.twelve99.net (62.115.143.7)  294.737 ms
    ex9k2.dc2.hel1.hetzner.com (213.239.224.142)  285.726 ms
    hetzner-svc076536-ic365572.ip.twelve99-cust.net (62.115.52.255)  286.247 ms
15  s5.nextcloud.com (95.216.247.163)  285.666 ms
    kbn-bb2-link.ip.twelve99.net (62.115.138.112)  289.376 ms
    core31.hel1.hetzner.com (213.239.203.205)  286.553 ms
16  * ex9k2.dc2.hel.hetzner.com (213.239.224.146)  295.374 ms
    ex9k2.dc2.hel1.hetzner.com (213.239.224.142)  289.824 ms
17  help.nextcloud.com (95.217.53.146)  282.863 ms
    hls-b3-link.ip.twelve99.net (62.115.122.35)  304.550 ms  301.067 ms

Thank you for asking! Yes, I do still have some performance issues with the forum. Seems like first load with a cold browser cache is the main or only issue. After the initial load the forum behaves more like a single page application (only partially loading pages) and is generally fast. The Firefox developer tools indicate a download of something or other is hanging and blocking progress towards a usable, interactive page. Sometimes it takes more than a minute before I can do anything (I’ll call that my “minutes-long issue” below).

  • location: Seattle, WA USA
  • vpn: no
  • network: gigabit fiber
  • browser: Firefox

Firefox asset download details: OpenSansRegular.ttf is blocked but requests for /fonts/OpenSans-Regular.ttf?v=0.0.9 succeed (HTTP 200).

Ah, OpenSansRegular.ttf is blocked by Firefox because it is hosted on a different domain. Maybe that is blocking other assets from loading as well? Check out these errors from the Firefox console:

Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at https://nextcloud.com/wp-content/themes/nextcloud-theme/dist/fonts/OpenSansRegular.ttf. (Reason: CORS request did not succeed). Status code: (null).
downloadable font: download failed (font-family: "Open Sans" style:normal weight:400 stretch:100 src index:2): bad URI or cross-site access not allowed source: https://nextcloud.com/wp-content/themes/nextcloud-theme/dist/fonts/OpenSansRegular.ttf 

I can reproduce this CORS failure in the Chromium browser.

https://www.webpagetest.org/result/220911_BiDcQK_879/ shows a timing of <5sec for an interactive page, so it doesn’t really count as a good independent repro of the minutes-long issue outside my home network, but it does show several blocking assets and a 4xx HTTP error response code for OpenSansRegular.ttf.

https://www.webpagetest.org/result/220911_BiDcQK_879/3/experiments/ has some really interesting suggestions, including a couple related to the OpenSansRegular.ttf issue above.

Here’s output of tracepath -n help.nextcloud.com:

 1?: [LOCALHOST]                      pmtu 1492
 1:  192.168.1.1                                           0.361ms
 1:  192.168.1.1                                           0.295ms
 2:  63.231.10.75                                          2.515ms
 3:  63.226.198.81                                         2.426ms
 4:  no reply
 5:  4.69.161.98                                         178.492ms asymm 17
 6:  212.133.6.2                                         179.173ms asymm 17
 7:  213.239.224.38                                      178.922ms asymm 17
 8:  213.239.224.142                                     178.807ms asymm 17
 9:  95.216.247.163                                      178.887ms 
10:  95.217.53.146                                       178.928ms reached
     Resume: pmtu 1492 hops 10 back 10 

mtr -n help.nextcloud.com tries harder, and it shows that “no reply” hop is from a router with IPv4 address 4.68.38.173. It shows 90% packet loss from that router. I’m guessing that router is in Finland based on the similarity of the IP address of the next router. Here’s a snapshot of mtr output:

                                  My traceroute  [v0.93]
myhost (192.168.1.101)                                            2022-09-11T12:33:08-0700
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                                  Packets               Pings
 Host                                           Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 192.168.1.1                                  0.0%    12    0.3   0.3   0.3   0.3   0.0
 2. 63.231.10.75                                 0.0%    12    4.1   4.0   2.3  18.3   4.5
 3. 63.226.198.81                                0.0%    12    2.5   2.7   2.2   4.5   0.6
 4. 4.68.38.173                                 90.9%    12    3.0   3.0   3.0   3.0   0.0
 5. 4.69.161.98                                  0.0%    12  178.3 180.7 177.9 192.3   4.8
 6. 212.133.6.2                                  0.0%    12  178.8 179.7 178.5 184.2   2.0
 7. 213.239.224.38                               0.0%    12  178.8 179.9 178.6 186.3   2.3
 8. 213.239.224.142                              0.0%    11  178.7 178.7 178.4 179.0   0.2
 9. 95.216.247.163                               0.0%    11  178.6 178.6 178.4 178.9   0.2
10. 95.217.53.146                                0.0%    11  178.8 195.5 178.6 278.0  37.1

Browser details: currently I’m running Firefox 64-bit v104.0 (just whatever is the latest available in the deb/snap repositories for Ubuntu 20.04 and 22.04 LTS) on an Ubuntu desktop or laptop. I can repeat my minutes-long issue in the Chromium browser.

Network details: I’m paying my ISP for 1gbps up and down. Latency is generally very good: 2ms average pings to google dot com. Low jitter with these pings, too. I use a Pi-hole, and I notice it blocks stats.nextcloud.com, so piwik.js is not downloaded by my browser. This doesn’t seem to cause any issues.

I did a couple tests that are maybe more reproducible/actionable.

Some of the Javascript assets are taking a very long time to load, and this seems to be blocking content display and interactivity. Credit to others smarter and faster than me for this cool curl trick.

To repro, first create curl-format.txt, containing:

     time_namelookup:  %{time_namelookup}s\n
        time_connect:  %{time_connect}s\n
     time_appconnect:  %{time_appconnect}s\n
    time_pretransfer:  %{time_pretransfer}s\n
       time_redirect:  %{time_redirect}s\n
  time_starttransfer:  %{time_starttransfer}s\n
                       ----------\n
          time_total:  %{time_total}s\n

Then try downloading some assets. You can get the URLs from below or from the Network tab in your browser’s developer tools. I don’t bother including full output below because all the specific timings are <1sec except time_total. I only include my time_total values below.

# 30sec time_total on my computer. 279KB file size.
curl -w "@curl-format.txt" -o /dev/null -s https://help.nextcloud.com/assets/locales/en-47d629005ed8e32df84f3bb24160792376d2ae05d9748d4d45cefabb291b55dc.js

# 1min 40sec time_total on my computer. 1.1MB file size.
curl -w "@curl-format.txt" -o /dev/null -s https://help.nextcloud.com/assets/vendor-ba7d09ad35bd9aedfe313624355092d06c2c3d59f4ef9a41061b57ae59de9c0c.js

# 4min 47sec time_total on my computer. 2.9MB file size.
curl -w "@curl-format.txt" -o /dev/null -s https://help.nextcloud.com/assets/discourse-446b4a42b2d69071509174374b80bf7c104e9e68e9d6488bb7658c8d63741434.js

Download speeds for these assets average about 10 kibibytes/sec. That seems really slow. Are these served with a CDN? That would surely help. How about compressing them too? I downloaded and ran gzip on the .js files and saw average space savings of 75%.
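
For anyone who wants to check that average, here is a rough sketch of the arithmetic in shell. The `rate_kib` helper name is made up for illustration, and the byte counts assume the file sizes quoted above are decimal KB/MB, which may not match exactly what the server reports.

```shell
# Hypothetical helper: average rate in KiB/s from a size in bytes and a time in seconds.
rate_kib() {
  awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", bytes / secs / 1024 }'
}

rate_kib 279000 30     # en-*.js: 279KB in 30sec
rate_kib 1100000 100   # vendor-*.js: 1.1MB in 1min 40sec
rate_kib 2900000 287   # discourse-*.js: 2.9MB in 4min 47sec
```

Under those assumptions all three assets land within about 1 KiB/s of the 10 KiB/s figure.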

The Javascript assets do appear to be minified, so that’s good. Maybe they could/should be combined as well? I thought the best practice was to transpile, combine, minify, and compress all Javascript assets, do as much of this as possible beforehand, and use a CDN to serve all static assets.

That’s not a problem, the router just doesn’t prioritize answering pings. It would be a problem if there were similar or higher losses after that hop. The traceroute the other way round looks similar and goes through Level3 (Lumen):

 3  core23.fsn1.hetzner.com (213.239.229.57)  0.556 ms core24.fsn1.hetzner.com (213.239.229.61)  0.555 ms core23.fsn1.hetzner.com (213.239.229.57)  0.541 ms
 4  213-239-203-118.clients.your-server.de (213.239.203.118)  1.271 ms 213-239-203-114.clients.your-server.de (213.239.203.114)  0.337 ms juniper4.dc1.fsn1.hetzner.com (213.239.203.130)  0.336 ms
 5  ae55.edge5.Berlin1.Level3.net (62.67.27.9)  8.125 ms ae54.edge5.Berlin1.Level3.net (62.67.27.5)  7.819 ms ae55.edge5.Berlin1.Level3.net (62.67.27.9)  8.431 ms
 6  ae2.3603.ear4.Seattle1.level3.net (4.69.202.238)  168.112 ms  167.926 ms  170.345 ms
 7  4.68.38.170 (4.68.38.170)  168.848 ms 4.68.38.174 (4.68.38.174)  169.039 ms  169.010 ms
 8  tukw-dsl-gw75.tukw.qwest.net (63.231.10.75)  169.555 ms  169.394 ms  169.155 ms

I have fast timings as well, even the time_total is below 1s (France, fiber connection Orange/Free).

If that is the main problem, do you see similar speeds for test files on speed.hetzner.com? Normally Hetzner has bought transit from Level3 (Lumen): https://www.hetzner.com/unternehmen/rechenzentrum

Thank you for your elaborate reply and gathering all this information for us. Where are you geographically located? And are you using any vpn software?

Fascinating, thanks!

I’m happy to try downloading test files. Will you give me URLs to try?

Cool, happy to help.

I covered location & vpn in my first post above: Forum performance: still a problem? - #15 by meonkeys

sorry it is: https://speed.hetzner.de/

Oooops missed that! Thanks again :slight_smile:

I think most of the traceroutes show that the problem lies more with access to Hetzner through the different backbones (ASNs) than with the server.

To reduce the problem, maybe the amount of data the forum sends could be reduced.
But I think that is not really possible because of the forum software (https://www.discourse.org).
The result is not really bad:

Test of this thread website:
Website Speed Test | Pingdom Tools
help.nextcloud.com...Virginia USA - EC2 - WebPageTest Result

I am going to discuss your suggestion with our sysadmin. Hopefully we can do something about it. Thank you :blue_heart:

Thanks. I tried downloading the 100MB.bin test file. Took 24.164201s, so about 4.14 megabytes/sec or 33.1 megabits/sec. That’s plenty fast… about 400x faster for me than downloading the discourse Javascript source.
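
To show where those numbers come from, here is a rough sketch of the arithmetic. It assumes 100MB.bin is exactly 100,000,000 bytes (the real file size is not verified here) and uses the roughly 10 KiB/s seen in the earlier curl tests as the baseline. `speed_summary` is a made-up helper name.

```shell
# Hypothetical helper: summarise a download given bytes, seconds, and a baseline rate in bytes/sec.
speed_summary() {
  awk -v bytes="$1" -v secs="$2" -v base="$3" 'BEGIN {
    rate = bytes / secs
    printf "%.2f MB/s, %.1f Mbit/s, about %.0fx the baseline\n", rate / 1000000, rate * 8 / 1000000, rate / base
  }'
}

# 100MB.bin assumed to be 100,000,000 bytes; baseline is 10 KiB/s = 10240 bytes/sec.
speed_summary 100000000 24.164201 10240
```

If the test file is actually 100 MiB rather than 100 MB, the rates come out a few percent higher, which does not change the conclusion.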

I saw similar download speeds for the 1GB.bin and 10GB.bin test files, although I stopped those test downloads before they completed.

Oh. Then I take back my opinion from above. Traceroute seems to be not very meaningful after all. :wink: Or maybe there are people who have problems with traceroute and the forum.

My traceroute. Maybe fast because of connection over DE-CIX.

  6    15 ms    14 ms    14 ms  decix-gw.hetzner.com [80.81.192.164]
  7    17 ms    16 ms    17 ms  core9.fra.hetzner.com [213.239.252.18]
  8    37 ms    37 ms    56 ms  core31.hel1.hetzner.com [213.239.224.165]
  9    37 ms    36 ms    36 ms  ex9k2.dc2.hel1.hetzner.com [213.239.224.142]
 10    37 ms    36 ms    36 ms  s5.nextcloud.com [95.216.247.163]
 11    39 ms    37 ms    37 ms  help.nextcloud.com [95.217.53.146]

@meonkeys
Your traceroute seems to use:

63.231.10.75
https://ipinfo.io/AS209
CenturyLink Communications, LLC

4.69.161.98
AS3356 Level 3 Parent, LLC details - IPinfo.io
Level 3 Parent, LLC

213.239.224.38
AS24940 Hetzner Online GmbH details - IPinfo.io
Hetzner Online GmbH

Longer pings seem to begin at Level 3 Parent, LLC.

Ok, that’s a bit strange. It’s not impossible that Hetzner routes the traffic differently for their test downloads. If the Nextcloud server itself were the cause, it should affect everybody, not just those from overseas. However, most of the regular users probably have a cached version of the static files…

It’s not useless but it has limitations. There is documentation on how to analyse network issues:
https://docs.hetzner.com/robot/dedicated-server/troubleshooting/network-diagnosis-and-report-to-hetzner/
However, it doesn’t seem so bad here that we actually have packet loss. What we do have is the routing information, so if we find that users coming via a certain peering provider have problems, that can be a starting point. Within their own networks, providers manage their traffic, meaning they can prioritize and/or filter certain traffic.

If you manage to bypass the problem by using a VPN (or Tor), it shows that the issue is network-related rather than with the server.

Great idea. I tried the same curl timings while using a VPN and also while using my phone as a mobile Wi-Fi hotspot. In both cases the download speed was about 10x-15x faster (at least 100 kibibytes/sec). So yeah, that is definitely pointing at something strange while I’m trying to get to the forum from inside my home network. Very odd though… I haven’t noticed this kind of selective slowdown, even, for example, from other Discourse forums hosted overseas from me.

Using either the VPN or mobile hotspot WAN connections the Nextcloud forum loads for me (starting from cold cache) in let’s say a much more reasonable time… <1min. This was easily reproducible.

Here’s tracepath help.nextcloud.com while I’m using a VPN (hosted somewhere east of the pond):

 1?: [LOCALHOST]                      pmtu 1500
 1:  _gateway                                            203.953ms 
 1:  _gateway                                            204.712ms 
 2:  redacted                                            246.456ms 
 3:  redacted                                            204.738ms 
 4:  ae1-6.RT.RAD.HKI.FI.retn.net                        254.188ms asymm  9 
 5:  GW-Hetzner.retn.net                                 213.104ms asymm  9 
 6:  core31.hel1.hetzner.com                             306.527ms asymm  9 
 7:  ex9k2.dc2.hel.hetzner.com                           305.978ms asymm  9 
 8:  s5.nextcloud.com                                    306.854ms 
 9:  help.nextcloud.com                                  306.561ms reached
     Resume: pmtu 1500 hops 9 back 9

and here’s mtr help.nextcloud.com while I’m on a VPN:

                                          Packets               Pings
 Host                                   Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. _gateway                             0.0%    31  163.4 177.5 162.1 283.7  30.4
 2. redacted                             0.0%    31  171.6 180.4 163.5 235.9  17.5
 3. redacted                             0.0%    31  176.3 177.4 161.2 247.1  21.9
 4. ae2-5.RT.RAD.HKI.FI.retn.net         0.0%    31  226.9 213.6 192.8 301.5  30.3
 5. GW-Hetzner.retn.net                  0.0%    30  193.5 205.5 191.8 292.3  24.6
 6. core31.hel1.hetzner.com              0.0%    30  219.4 209.0 193.3 289.3  22.3
 7. ex9k2.dc2.hel1.hetzner.com           0.0%    30  208.9 217.3 193.3 284.1  30.5
 8. s5.nextcloud.com                     3.3%    30  194.4 209.5 192.6 369.5  35.6
 9. help.nextcloud.com                   3.3%    30  194.5 222.8 193.4 359.0  42.0

Side note: I see IPv6 addresses when I use a mobile hotspot. I am excluding those traces because this post is getting pretty long.

Looks like compression of Javascript is enabled: I see Content-encoding: br in the response headers. Brotli compression, apparently. And I also see smaller numbers for “Transferred” vs. “Size” in the Firefox developer tools “Network” tab.
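
The same check can be done from the command line. A rough sketch: `content_encoding` is a made-up helper that pulls the Content-Encoding value out of response headers on stdin, and the sample headers below are illustrative, not a real capture from the forum.

```shell
# Hypothetical helper: print the Content-Encoding value from HTTP response headers on stdin.
content_encoding() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "content-encoding" { print $2 }'
}

# Illustrative headers only, not a real capture.
printf 'HTTP/2 200\r\ncontent-type: application/javascript\r\ncontent-encoding: br\r\n\r\n' | content_encoding
```

Against the live site it would be something like `curl -s -o /dev/null -D - -H 'Accept-Encoding: br, gzip' <asset URL> | content_encoding`. Plain curl does not send an Accept-Encoding header unless asked (for example with `--compressed`), so the earlier curl timings may well have been for uncompressed transfers.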

re: caching - I am definitely seeing benefits from cached Javascript assets vs. when I specifically test with a cold browser cache or using curl. It seems to be once a day or so that these Javascript assets are re-downloaded by my browser though (even when I have a login/cookies from the day before). That seems wrong… the Discourse app probably doesn’t change that often so I shouldn’t make new/cache-busting Javascript requests. The cache-related HTTP response headers seem reasonable (fresh for a year, not modified since July, etc.).

I’m not sure if either of these would be expected to make much of a difference, but I’m just curious:

  1. Is a CDN being used?
  2. What about combining Javascript assets?

this forum is consistently the slowest site I use… loading can take 10-30 seconds… no other sites/forums take so long… it feels like the whole database/forum is self-hosted on a Raspberry Pi!

anyway an mtr/tracert doesn’t show it as that bad… but it took a good 2 minutes to load tonight…
really you need to change hosting provider, whatever the issue! it’s almost like loading porn back in the 56k modem days trying to load this forum sometimes

(usually there is a noticeable 15-20sec delay that I’ve just learnt to live with)