How to use a caching reverse proxy with NextCloud: problems with NC setting "cache-control"

I’ve got NextCloud successfully set up at home using the VM image from www.hanssonit.se. I’ve added my NAS as external SMB storage. Works great! For now, I’m mostly interested in file sharing and syncing.

I already had nginx set up on a VPS, proxying connections down a VPN tunnel back to another http service at home. So I decided to do the same with NC at home. It wasn’t tough to add this. Just tested it, works great!

What I really want to do is to have my VPS nginx cache a lot of my file content so I don’t have to keep pulling it thru the slow upload side of my home’s Internet connection. After the initial cache miss, the files will be in nginx’s cache. I’ve got caching set up in nginx and it is sort of working. The problem I’m having is that NC appears to set headers like:

cache-control: no-store, no-cache, must-revalidate
expires: Thu, 19 Nov 1981 08:52:00 GMT

For some responses I see:

cache-control: private

The expires date in the first example (19 Nov 1981) is an indicator that PHP’s session_cache_limiter() is being used. So I assume this is an intentional choice. I can understand the privacy stance on this, but it seems like many NextCloud installs would benefit from a trusted caching reverse proxy running on a different machine.

I can ignore the cache-control and expires headers with nginx config, but I’m concerned something will break, or that if I update the file contents, nginx won’t know to go ask NC if there is updated content.

Also, it looks like when doing a direct download of a file, NC adds a downloadStartSecret=xxxxx query param, which by default makes nginx think it is different content. Again, I know how to get nginx to ignore this, but I’m concerned about side effects.
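
One way to make nginx treat downloads that differ only in downloadStartSecret as the same cached object is to build the cache key from the path alone. This is a minimal sketch, not a tested NC setup. It assumes the file content is served from /remote.php/webdav/ (as found later in this thread), that a cache zone named nc_cache is defined elsewhere with proxy_cache_path, and that no other query parameter changes the response for these paths:

```nginx
location /remote.php/webdav/ {
    proxy_pass https://x.x.x.x/remote.php/webdav/;

    proxy_cache nc_cache;                        # zone name is an assumption
    proxy_ignore_headers Cache-Control Expires;  # otherwise NC's headers prevent caching
    proxy_cache_valid 200 90d;

    # nginx's default key is $scheme$proxy_host$request_uri, which includes
    # the query string. $uri is the path without arguments, so
    # ?downloadStartSecret=... no longer produces a separate cache entry.
    proxy_cache_key $scheme$proxy_host$uri;
}
```

The side effects to keep in mind: every query parameter is dropped from the key, not just downloadStartSecret, and a cache hit is answered by nginx without NC ever seeing the request, so NC’s own access checks are not re-run for files that are already cached.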

Is there support in NC for making static (or mostly static content like my files) more cacheable?

This will be difficult because of the way NC generates CSS and JavaScript. You could cache image files and such, but it probably won’t help much.

I tried this before, maybe it was nextcloud 14, with Cloudflare as reverse proxy. It was faster but some things didn’t work right.

You could perhaps look at Varnish as a cache. It is used a lot with WordPress and such.

Thanks for the reply.
I’m mostly interested in caching the “big” things, which are my own (and my users’) files inside NextCloud. I wouldn’t do it with a service like CloudFlare, but I want to do it with a caching proxy I control.

How would Varnish help in this situation? Better caching control than nginx? I have nginx caching working, it’s just that NextCloud is serving up my file content with very restrictive (for my use case) cache-control and expires headers.

Varnish, AFAIK, is meant to cache dynamic sites and can be explicitly configured to handle quirks like the headers.

I think in 2020 caching makes NO sense.

Huh. I’ve thought thru the problem quite a bit and in my exact case with my exact constraints, I thought it was a potential solution. I’ll lay them out here and maybe you’ll have a better idea.

  • This is at home, for family and friends, and for me to tinker with. Not mission critical.
  • Data lives on a NAS at home. I’d like to be able to access it at home via CIFS as well as in NextCloud. So it’s mounted as external storage in NC. It’s going to be a few TB of data.
  • I care a bit about privacy, so I’d prefer the data be at home, or if on a VPS or a 3rd-party storage provider, then encrypted at rest.
  • Home Internet connection is 5Mbps up. Going up to 20Mbps up is possible, but the extra cost is more than the monthly cost of the VPS I already have.
  • I have a VPS already for various things. It has good bandwidth and it’s already acting as a reverse proxy for something else at home.
  • Because it’s at least a couple of TB of data, block storage for that attached to a VPS is expensive.

So, my first solution was to use NextCloud on the VPS, then use an S3-compatible object store service (Wasabi). NextCloud was set up to use Wasabi as its storage. This seemed to work, but I wanted it encrypted on Wasabi, so I turned on NC server-wide encryption. That fails. See https://github.com/nextcloud/server/issues/11826

So, then I thought I would host the data at home, but then everyone using this has to get files thru my 5Mbps upload. The usage pattern is likely to be clustered (someone uploads a file, several of us download or view it various times over the next few weeks), and while some files are small, others are 50MB PDFs.

The repeated download of files felt like a perfect fit for a cache. I’m familiar with nginx, so setting it up to cache for a really long time was easy for me. I have this all working now, but then I found NextCloud is using PHP’s session_cache_limiter() to prevent caching of even the actual file content it serves. My cache ends up caching only small bits of the NC UI and small static assets. I bet NC is doing it for privacy reasons, but in my case I control the cache, and I posted here to see if there was an easy way to change this in NC, hoping someone else had been thru this.

In an effort to make a little progress on this, I did a bit of digging, and it looks like all the things I want to cache are being served from /remote.php/webdav/. What I mean is, if I’m browsing files in the files app and eventually select a real file, both the preview and download links eventually hit /remote.php/webdav/path/to/file. In a quick test, if I have this in my nginx config, then I’m getting the behavior I expect.

location /remote.php/webdav/ {
    proxy_pass https://x.x.x.x/remote.php/webdav/;

    # Caching
    proxy_cache nc_cache;
    add_header X-Cache-Status $upstream_cache_status; # debug
    proxy_cache_valid 200 90d;
    # ignore these headers to get around NC setting them on all files
    proxy_ignore_headers Cache-Control Expires;
}
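
For anyone copying the location block above: it relies on a cache zone called nc_cache being defined in the http context, and with a 90d validity nginx will not ask NC about changed files until the entry expires. The following is a hedged sketch of one possible variant, not the configuration used in this thread. The cache path, sizes and the 1m validity are illustrative assumptions, and it assumes NC’s WebDAV endpoint returns an ETag or Last-Modified header and answers conditional requests with 304, which should be verified before relying on it:

```nginx
# http {} context: defines the zone that "proxy_cache nc_cache;" refers to.
# Path and sizes are assumptions. inactive= matters: its default is 10m, and
# entries not requested within that time are deleted regardless of validity.
proxy_cache_path /var/cache/nginx/nc_cache levels=1:2 keys_zone=nc_cache:10m
                 max_size=50g inactive=90d use_temp_path=off;

server {
    # listen, server_name and TLS settings are omitted from this sketch

    location /remote.php/webdav/ {
        proxy_pass https://x.x.x.x/remote.php/webdav/;

        proxy_cache nc_cache;
        proxy_ignore_headers Cache-Control Expires;

        # Treat a cached file as fresh only briefly. Once it has expired,
        # revalidate it with If-Modified-Since / If-None-Match instead of
        # downloading the whole body again.
        proxy_cache_valid 200 1m;
        proxy_cache_revalidate on;

        # Let only one request at a time populate a missing cache entry.
        proxy_cache_lock on;

        add_header X-Cache-Status $upstream_cache_status; # debug
    }
}
```

If the assumption about conditional requests holds, an unchanged file costs only a small 304 exchange over the slow uplink (X-Cache-Status shows REVALIDATED), while a changed file is fetched in full once and then served from the VPS again.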

Do you have statistics on how often files are loaded? Look in the activities.

I’m just getting NextCloud set up, so I don’t have any useful stats from it. I’m basing my guesses about future behavior with NextCloud on the current patterns: a set of files is uploaded and shared, several people view or download them, possibly making a new file or two, which several people then view or download. That all happens for a week or two, then things move on to another set, with the occasional re-viewing of those files. Some of these are 50MB files, which take ~1m30s each over 5Mbps.

I’m aware that this isn’t absolutely required. NC will still work without the cache, it’s not critical to a business, and I may be over-optimizing. I was just surprised I couldn’t do this easily.

With the VPS on the internet and the NAS on the home network behind a VPN, I would perhaps install an additional Nextcloud on the VPS (with its own DNS name). With “Federated Cloud Sharing” you can copy less sensitive, often-used data from the NAS to your VPS and share it directly from the VPS.

Interesting, I wasn’t aware of federation. Thanks!

If the caching workaround I posted above doesn’t work long term for some reason, I’ll likely do exactly this, and then, as files become less active, migrate longer-term things down to the home NC instance.

Okay so I needed information on how to do this. My scenario:

I am tasked with setting up a “Dropbox” for a collaborative gamedev Unity project. We are spread all across the world, and we have hundreds of GB of assets to share with each other and keep in sync under constant editing by different people, so git/svn/perforce is no good: people forget to commit too often. A shared Dropbox is needed, but because of the space requirement it would cost too much money. That leaves ownCloud or Nextcloud. But a VPS with a 1TB HDD costs too much. Even a VPS with enough RAM to not crash Apache when Nextcloud installs addons costs too much, so we have to set up a swap file for Nextcloud to work reliably. That is okay. BUT what if I just leave my PC on all day, opt out of carrier-grade NAT, set up port forwarding and use dynamic DNS? Then the issue is that the others will suffer poor performance. But what if frequently accessed things could be cached? A reverse proxy?

Another way, if you don’t mind that Nextcloud will be running on a slow VPS and cannot run AI modules well, is to use a block/file-caching network filesystem and remotely mount your home directory. I’m posting this for anyone else who searches for this Nextcloud cache scenario (a very common one), since the idea may be relevant to you. See FS-Cache with NFS: “10.3. Using the Cache With NFS” in the Red Hat Enterprise Linux 6 documentation on the Red Hat Customer Portal. If you run Windows at home, just use VirtualBox or VMware Player with a shared folder, run Linux in the VM, set up the firewall, and use a bridged connection so you can port forward from your home router to the VM.