Duplicated entries in the oc_filecache table, one without the last '/' (causing bad behaviour)

Hello

After upgrading from ownCloud 8 -> 9 -> 10 to Nextcloud 12.0.0 with apparently no problems, I’m facing strange behaviour in the oc_filecache table.

Facts:

  • Clients complain that they cannot delete files.

  • Affected files appear duplicated in the web view, and they are shown with their full path greyed out (as if they came from different shares, although in fact they are either not shared at all or shared only once by the owner).

  • In the web view, the Details section claims the non-deletable files were created by the complaining user a while ago (which is false), while the timestamp shown inline at the rightmost part of the file entry says they were modified days, weeks or months earlier (which is actually true).

  • oc_filecache entries related to these files are duplicated.

  • Both entries differ from each other in one thing:

    • The right one has:
      path = files/username/actual_path_to_file/filename

    • The wrong one has:
      path = files/username/actual_path_to_filefilename
      (Note the missing ‘/’ before filename.)

  • Bad entries can point to files in ‘files’, ‘files_versions’ or ‘files_trashbin’, it doesn’t matter which.

  • The issue starts the moment any client app’s sync process begins, but it seems to be random: not every sync process triggers it.

Small tool to monitor:

I built a simple SQL statement to detect all of these wrong filecache entries:

SELECT * FROM oc_filecache WHERE ( path NOT LIKE CONCAT('%/', name) ) AND path <> name

The second condition (after the AND) excludes entries like “files” itself and other single-segment paths, which don’t seem to be affected.
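
If it helps anyone monitoring the same thing, here is a minimal sketch of the query grouped per storage, so the growth can be watched over time. It assumes a MySQL/MariaDB backend, the default ‘oc_’ table prefix and a database called ‘nextcloud’ (all of that is my setup, adjust names and credentials to yours):

#!/bin/bash
# Sketch only: count suspicious oc_filecache rows per storage so their growth can be watched.
# Assumptions: MySQL/MariaDB backend, default "oc_" table prefix, database named "nextcloud".
mysql -u root -p nextcloud -e "
  SELECT s.id AS storage_id, COUNT(*) AS bad_entries
  FROM oc_filecache f
  JOIN oc_storages s ON s.numeric_id = f.storage
  WHERE f.path NOT LIKE CONCAT('%/', f.name)
    AND f.path <> f.name
  GROUP BY s.id
  ORDER BY bad_entries DESC;"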

I noticed that the list of bad entries grows without control. After a few days of operation we can have more than 8,000 bad entries.

Fix trials:

  • Setting the PHP memory limit to 256M (up from 128M), after reading about some similar problems; it had no positive effect on the behaviour.

  • files:cleanup (it does nothing: occ detects no orphaned file entries, even though there are actually more than 8,000 wrong ones in the table)

  • files:scan username (or --all)
    This also does nothing to the filecache table, but it has an annoying side effect: after the scan, the affected files are updated on the client side every second and create conflict copies, stoppable only by urgent manual deletion of the bad entries using SQL. (The exact occ invocations I use are sketched below.)
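
For reference, this is roughly how I run the occ commands above. The paths and the web server user are assumptions from my Ubuntu setup (Nextcloud in /var/www/nextcloud, Apache running as www-data), so adjust them to your installation:

cd /var/www/nextcloud
# Remove filecache entries for storages that no longer exist (it detected nothing in my case).
sudo -u www-data php occ files:cleanup
# Rescan a single user's files, or every user with --all.
sudo -u www-data php occ files:scan username
sudo -u www-data php occ files:scan --all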

Up to now, the only thing that works, though only for a short while, is:

DELETE FROM oc_filecache WHERE ( path NOT LIKE CONCAT('%/', name) ) AND path <> name

This gives the admin a break for a coffee or a rest, but it is a very ugly solution…

I think there is some issue that creates the wrong entries (missing the last ‘/’) in oc_filecache.

The server is Ubuntu Server 16.04 LTS using its stock packages (PHP 7.0 and so on), running in a Proxmox VM environment with up to 8 GB of RAM assigned to the Nextcloud VM; no other service is installed on this VM.


This doesn’t look good. How did you configure NC, with mod_php and apache2?
@nickvergessen @icewind

Hi,
Thanks for your reply.

This is my apache configuration:

Server version: Apache/2.4.18 (Ubuntu)
Server built: 2017-06-26T11:58:04

Loaded Modules:
core_module (static)
so_module (static)
watchdog_module (static)
http_module (static)
log_config_module (static)
logio_module (static)
version_module (static)
unixd_module (static)
access_compat_module (shared)
alias_module (shared)
auth_basic_module (shared)
authn_core_module (shared)
authn_file_module (shared)
authz_core_module (shared)
authz_host_module (shared)
authz_user_module (shared)
autoindex_module (shared)
deflate_module (shared)
dir_module (shared)
env_module (shared)
filter_module (shared)
gnutls_module (shared)
mime_module (shared)
mpm_prefork_module (shared)
negotiation_module (shared)
php7_module (shared)
setenvif_module (shared)
socache_shmcb_module (shared)
ssl_module (shared)
status_module (shared)

I don’t know whether you can get anything useful from it…
However, I’ll post some more info shortly; I have news. :slight_smile:

Here’s my news…
Yesterday and today I spent some hours monitoring the oc_filecache table and tidying it up occasionally.

During a low-activity period, I managed to DELETE all buggy entries and rescan all users (files:scan --all).

A lot of new buggy entries appeared within minutes, but I deleted them again and rescanned only the affected users. I kept doing this for about 30 minutes, and 3 hours later the server didn’t have a single buggy entry!!!
Then I went to sleep :sleeping:.

This morning I re-checked and found only 98 buggy entries (fine!), so I did it again: DELETE and ’files:scan username1 username2 ...' (only 3 or 4 users were affected).
I am concluding that neither the deletion nor the scan fixes things by itself. Both must be run, and in that order (a script version of the procedure is sketched below).
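
For anyone wanting to automate it, here is a rough sketch of the ‘delete, then scan’ procedure as a script. It assumes a MySQL/MariaDB backend with credentials readable from /root/.my.cnf, a database called ‘nextcloud’, Nextcloud in /var/www/nextcloud and the web server running as www-data; treat it as an ugly workaround, not a fix:

#!/bin/bash
# Sketch of the manual "delete, then scan" procedure.
# Assumptions: MySQL/MariaDB credentials in /root/.my.cnf, database "nextcloud",
# Nextcloud installed in /var/www/nextcloud, web server user www-data.
set -euo pipefail

NC_DIR=/var/www/nextcloud
DB=nextcloud

# 1) Delete the buggy entries (same condition as the monitoring query above).
mysql "$DB" -e "
  DELETE FROM oc_filecache
  WHERE path NOT LIKE CONCAT('%/', name)
    AND path <> name;"

# 2) Rescan so the correct entries are rebuilt from the files on disk.
#    Scanning only the affected users is faster; --all is the blunt option.
sudo -u www-data php "$NC_DIR"/occ files:scan --all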

Now it’s 6 hours later, and only 3 buggy entries from a single user have appeared. Time for a small cleaning.

I don’t really know whether they will appear again, but at least the rate is now far below that of the first days (and so are the user calls :smile:). I’ll tell you tomorrow :wink:

I cannot be sure, but I can imagine a possible cause:

Some clients were not syncing well just before and after the OC -> NC upgrade (we have some bad network connections at a few locations at the end of the known world :D) and they kept stale or buggy file metadata (from ownCloud?).
When they try to sync, they push this stuff to the server, and it gets accepted because the files do exist (I can imagine several ways in which the process checks files, misses the missing ‘/’ and treats both entries as valid).
These entries then come back to the clients as bad (duplicated) entries when they resynchronize, so they abort the sync loop and restart it over and over again, keeping all the bad info.

Once I clean it up (delete + scan), the information sent to the client side is clean, so the clients manage to complete a sync loop and refresh all their metadata.

((Question: do clients store metadata about files, or do they work over the files themselves on the fly every time?))

Well, I have bad news…
The bug is still there. I’ve found that not only is the ‘/’ missing, but occasionally some extra characters at the beginning of the filename are missing too (in the ‘path’ field of oc_filecache).

There you can see trashbin entries for a file: one belonging to the owner of the file, and the second and third belonging to the share recipient who deleted the item. The second is duplicated, but the copy is missing some characters at the beginning of the filename (exactly, ‘/Pl’ is missing).

The view on the web (Deleted Items, for this case) is as follows:

I figure that the ‘undefined’ path corresponds to the broken table entry.

This is happening only for a small set of users, but they use both the ownCloud and the Nextcloud client applications.

Any hint or patch will be appreciated.

Regards.

Some new feedback on this; if anyone has a hint on how to solve or track down the issue, it will be appreciated:

The problem arises when any user renames a folder, either in their synced folders or in the web application. Then everything inside the renamed folder gets buggy duplicated entries in oc_filecache (both entries carry the new name, not a shadow of the old one): one correct and the other missing some characters. A query to spot these duplicated pairs is sketched below.
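
In case it helps anyone reproduce it, here is a sketch of a query I use to list the duplicated pairs. It assumes a MySQL/MariaDB backend, a database called ‘nextcloud’, and that the corrupted copy keeps the same parent and name as the good row (which is what I see here):

# Sketch: list filecache rows whose (parent, name) pair occurs more than once,
# which is how the post-rename duplicates show up for me.
# Assumptions: MySQL/MariaDB backend, database "nextcloud", duplicates share parent and name.
mysql -u root -p nextcloud -e "
  SELECT f.fileid, f.storage, f.parent, f.name, f.path
  FROM oc_filecache f
  JOIN (
    SELECT parent, name
    FROM oc_filecache
    GROUP BY parent, name
    HAVING COUNT(*) > 1
  ) d ON d.parent = f.parent AND d.name = f.name
  ORDER BY f.parent, f.name;"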

Regards.

The developers didn’t reply. Please open an issue on https://github.com/nextcloud/server/issues; such topics can be handled and tracked more easily there. Without deeper knowledge of the code, it will be difficult to tell you what is wrong.

OK. I will try to gather some more information, try to reproduce the problem precisely, and then open an issue.
Thanks!


I’m also getting the exact same problem … any news? (NextCloud Server 12.0.0)
@Sergio_Garcia_Galan did you open a ticket for that?

Hello
After months of having a “nightly cleanup script” run every day to clean up these odd entries, we finally updated to Nextcloud 13.0.4 (directly from 12.0.0) and have had no issues related to this topic since then :smiley:. So I think that 12.0.0 had a small bug that showed up under heavy concurrent use (not under normal personal use).
I should mention that the faulty server was the one at our company, with more than 100 ‘human users’ (geographically dispersed) and about 50 ‘project users’ sharing a lot of folders among them. Meanwhile, my personal server (2-3 users) running the same version didn’t have any problem at all.
So if anyone is experiencing this issue today, just upgrade.
Regards!!!

@Shak Sorry for not answering before. Have a look at my last post in this thread.