Nextcloud on WSL2 with Windows 10: Losing Network Connectivity

I’ve installed Nextcloud on WSL2 in Windows 10, running on top of the Ubuntu 20.04 LTS distro with a LAMP stack (Apache, MariaDB, PHP 7.something).

Everything works just fine, in all regards, and my server passes all the checks in the “administration” settings.

But, after about a week of running just fine, network connectivity to the WSL2 instance just drops.

Requests don’t get through to the server. And, as a security certificate wasn’t renewed by Certbot one time, requests don’t appear to be getting out of WSL either.

When I’ve looked at the server itself, everything seems fine. It’s just that WSL is arbitrarily losing its network connection after a while.

If I manually reset it all - turn it all off and back on again - then it comes back up and runs just fine again… for about a week, and then it loses the connectivity again. It keeps doing this and it’s very annoying, as Nextcloud itself is working brilliantly.

It’s like Windows is putting WSL to sleep, or paging it out, or forcibly cutting the network connection, or something.

But it runs just fine for about a week, then it stops responding. Indeed, I’ve got a cronjob running for Certbot - to renew the SSL certificate - and that also failed to run (not sure if this is because the server couldn’t connect outwards either, or because “cron” didn’t run as WSL was paged out, or ?!?).

Has anyone had issues like this trying to run Nextcloud on WSL2 with Windows 10? And how the hell do you fix it?

I think the problem is more to do with WSL2 and how it handles networking - as Ubuntu and Nextcloud and certbot all run just fine, without issue, for many days until the connection loss happens. If I restart it all, then it comes back up just fine and, again, runs for many days without issue, until it happens again.

A server that keeps losing its Internet connection is a bit useless, and I’ve set this up for someone else - hence the need to run it on Windows, due to other Windows software being needed - so I can’t easily just keep manually resetting it every time.

WSL is intended to run user applications, not system services. You should not run Nextcloud in WSL even for testing purposes.

If you need to run it in Windows, use a virtual machine instead.

Thanks.

I had suspected this might be the case - that WSL2 is not “server grade” quality - but there were articles online about how to install and run Nextcloud on WSL, and I guess I took that as confirmation that it was possible. But, well, from actually trying it, clearly not.

(Plus, WSL2 - unlike WSL1, which operates more like “WINE in reverse” - is apparently a virtual machine that runs a genuine Linux kernel within it, using Hyper-V and CPU virtualisation and all that. So I thought that WSL2 was effectively a virtual machine, of sorts.)

Any recommendations for the virtual machine?

I’ve used VirtualBox for such things before. Is that a good choice?

While it may be possible to run NC in WSL, it’s also unsupported, and WSL is not meant to run services. The people writing guides for it are misguided.

VirtualBox will work, and so will Hyper-V. Take your pick.

I will go with VirtualBox, I feel.

But for the sake of completeness, if someone else is reading this thread, then I did actually discover the underlying cause of the problems with WSL2. And it probably only further emphasises why it’s NOT the choice to make for this sort of thing, as it’d affect running ANY long-term services within it and not just Nextcloud.

The Linux kernel likes to use unused RAM as a file cache, to speed up disk operations (as does Windows too).

But the problem is that both WSL2 and Windows share all your RAM by default, so they’re both trying to use up unused RAM in this way - except that, as it currently stands, Windows can’t free up the file cache used by WSL2. It sees this as used memory.

Perhaps you can see where this is going. As time goes on, the WSL2 instance uses more and more RAM as its file cache, because from its perspective it has access to all the system’s RAM. With Windows and the Linux kernel within WSL2 both competing for this unused RAM, they eventually fill it up completely. Twice over.

So both Windows and the Linux instance are using more RAM than the system has, and everything gets paged out.
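If you want to watch this happening from inside the WSL2 instance, the kernel reports how much RAM is currently held as file cache in /proc/meminfo. Here's a rough sketch in Python that pulls out that share. The function names (parse_meminfo, cache_share, report) are just ones I made up for illustration, and it assumes the usual "Name:   12345 kB" layout of that file.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: number}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are
    plain counts with no unit.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def cache_share(meminfo):
    """Fraction of MemTotal currently held as page cache plus buffers."""
    cached = meminfo.get("Cached", 0) + meminfo.get("Buffers", 0)
    return cached / meminfo["MemTotal"]


def report(path="/proc/meminfo"):
    """Read the live file (Linux only) and describe the cache share."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return f"file cache: {cache_share(info):.0%} of MemTotal"
```

Calling print(report()) inside the Ubuntu instance every so often should show the cache share creeping up over the days, if this is indeed what's going on.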

So, it’s not explicitly that the network connection gets cut, but that the whole WSL2 instance eventually gets paged out of RAM (as Windows is the host system and gets priority), and paging it all back in again - remembering it is now the size of your entire RAM - takes so long that everything times out.

Thus, to make any long-running app or service work in WSL2, you MUST create a “.wslconfig” file for WSL2 that restricts memory use. Honestly, this should happen by default, but it doesn’t and you have to set it manually. Set the maximum memory for the WSL2 instance to something that, when combined with Windows’ own memory use, won’t inevitably over-fill your RAM and force large parts of both Windows and the WSL2 instance to end up effectively permanently paged out.
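For anyone who hasn't seen one, the file is tiny: a [wsl2] section with a memory= line, saved as .wslconfig in your Windows user profile folder. Here's a minimal Python sketch that builds that text and writes it out. The helper names are mine, and the 4 GB figure in the usage note is only a placeholder, not a recommendation.

```python
from pathlib import Path


def render_wslconfig(memory_gb):
    """Return the text of a .wslconfig that caps WSL2 memory.

    memory_gb must be a positive whole number of gigabytes.
    """
    if not isinstance(memory_gb, int) or memory_gb <= 0:
        raise ValueError("memory_gb must be a positive whole number")
    return f"[wsl2]\nmemory={memory_gb}GB\n"


def write_wslconfig(text, path=None):
    """Write the config; by default to .wslconfig in the user's home folder.

    Run this on the Windows side, where the home folder is the user
    profile directory. Note that it overwrites any existing file.
    """
    target = Path(path) if path is not None else Path.home() / ".wslconfig"
    target.write_text(text)
    return target
```

So write_wslconfig(render_wslconfig(4)) would leave you with a file containing just "[wsl2]" and "memory=4GB". As far as I know, WSL only picks the change up after it has been fully stopped (wsl --shutdown from a Windows prompt) and started again.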

But, yeah, the fact that this is even a thing with WSL2 right now - hopefully, they’ll eventually patch this - does just emphasise that, no, it’s not designed nor ready for anything but development.

In its default state, right now, it is in fact inevitable that this will eventually happen, as the Linux kernel will eventually attempt to use all your unused RAM as a file cache. Nextcloud reaches this point faster than most, as it’s dealing with a lot of files, but it’s an inherent design flaw: anything running in WSL2 that touches files and runs for a long time will eventually reach this point, because the WSL2 instance sees all your system RAM as “available” and, in its own little virtual machine, has no idea that Windows is also there, needing that same RAM. Between them - Windows and the WSL2 instance - they will eventually over-subscribe your RAM and things will be paged out.

The workaround is to manually restrict WSL2 with the “.wslconfig” file (though that’s an inexact science, as you’re just guessing how much is safe to allocate for both to co-exist without over-filling your RAM - and if the Windows side of things uses too much RAM then it can still happen).
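To make the guesswork concrete, the arithmetic is just "total RAM minus whatever you want to keep back for Windows". A small sketch, where the function name and the reserve figure are my own assumptions and not anything official:

```python
def suggest_wsl_memory_gb(total_gb, windows_reserve_gb, minimum_gb=1):
    """Suggest a WSL2 memory cap in whole GB.

    total_gb:            physical RAM in the machine
    windows_reserve_gb:  a guess at what Windows and its apps need
    minimum_gb:          refuse to suggest less than this
    """
    limit = int(total_gb - windows_reserve_gb)  # round down to whole GB
    if limit < minimum_gb:
        raise ValueError("not enough RAM left over for the WSL2 instance")
    return limit


# Example: a 16 GB machine, keeping 8 GB back for Windows.
print(suggest_wsl_memory_gb(16, 8))  # prints 8
```

The reserve is the inexact part: if the Windows software on that machine grows beyond what you kept back, the two sides can still over-fill the RAM between them.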

But, basically, this is a bug - an inherent design flaw - in WSL2 and, yes, confirms that, currently, it shouldn’t be used in production for, frankly, anything.

So, yeah, I’m going with VirtualBox (which inherently has a limit on RAM use for virtual machines, when you create them).

But I thought, in case anyone’s reading this forum, searching for “WSL2”, having also been deceived by articles online telling you how to install Nextcloud on WSL2: this is the low-down on why that’s a bad idea and you really shouldn’t be doing that.