Suspiciously bloated MariaDB database

I have a Nextcloud instance running on an Amazon EC2 instance, with files hosted on an Amazon S3 bucket. There are 2 users on the instance, with roughly 3 million files (260 GB).

I accidentally noticed today that the MariaDB database now weighs 3.6 GB. Last I checked (maybe a month ago), it was around 50 MB, and not much has changed since then. top says MariaDB is now eating up 15% of the RAM and 10-12% of the CPU (for context: Amazon EC2 charges extra for CPU load above 20%, so this situation is currently costing me money).

I didn’t find much info in the docs regarding database cleanup. What could be the problem here, and how could I fix it?

Thanks.


Five hours later, the DB weighs 3.8 GB, even though I ran occ files:cleanup.

Maybe as a first step, check which tables are getting that big: https://mariadb.com/kb/en/information-schema-tables-table/#view-tables-in-order-of-size
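
For reference, the query on that page boils down to something like this (it only reads from information_schema, so it should be safe to run):

SELECT table_schema AS `DB`,
       table_name AS `Table`,
       ROUND((data_length + index_length) / 1024 / 1024, 2) AS `Size (MB)`
FROM information_schema.TABLES
ORDER BY (data_length + index_length) DESC;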

Thanks. Here is the result:

+--------------------+--------------------------------------------+-----------+
| DB                 | Table                                      | Size (MB) |
+--------------------+--------------------------------------------+-----------+
| nextcloud          | oc_filecache                               |   2443.95 |
| nextcloud          | oc_systemtag_object_mapping                |   1352.52 |
| nextcloud          | oc_activity                                |     37.63 |
| nextcloud          | oc_calendarobjects                         |     17.03 |
| nextcloud          | oc_calendarobjects_props                   |     15.11 |
| nextcloud          | oc_duplicatefinder_finfo                   |      7.02 |
| nextcloud          | oc_filecache_extended                      |      5.38 |
| nextcloud          | oc_mail_messages                           |      4.33 |
| nextcloud          | oc_calendarchanges                         |      3.03 |
| nextcloud          | oc_cards_properties                        |      2.64 |
| nextcloud          | oc_files_antivirus                         |      2.52 |
| nextcloud          | oc_mail_recipients                         |      2.23 |
| nextcloud          | oc_cards                                   |      1.67 |
| nextcloud          | oc_news_items                              |      1.63 |
| nextcloud          | oc_file_metadata                           |      1.52 |
| nextcloud          | oc_authtoken                               |      1.08 |

Take a look at which entries get added to the filecache table. From the sound of it, new rows are being added constantly.
Also, can you please add information about your server and configuration?

The server is an Ubuntu 22.04 Amazon AWS image, with a basic LAMP stack and php8.1-fpm.

After running sudo mysqlcheck -o --all-databases (optimize all databases), I still get a 3.3 GB DB:

+--------------------+--------------------------------------------+-----------+
| DB                 | Table                                      | Size (MB) |
+--------------------+--------------------------------------------+-----------+
| nextcloud          | oc_filecache                               |   2386.22 |
| nextcloud          | oc_systemtag_object_mapping                |    877.98 |
| nextcloud          | oc_activity                                |     43.63 |
| nextcloud          | oc_calendarobjects_props                   |      7.16 |
| nextcloud          | oc_filecache_extended                      |      5.55 |
| nextcloud          | oc_calendarobjects                         |      4.31 |
| nextcloud          | oc_cards                                   |      1.66 |
| nextcloud          | oc_file_metadata                           |      1.52 |
| nextcloud          | oc_files_antivirus                         |      1.52 |
| nextcloud          | oc_calendarchanges                         |      1.25 |
| nextcloud          | oc_cards_properties                        |      1.17 |

It seems to be related to How does external storage work (local caching, trash, storage space)?

From what I understand, there is no garbage collection when hosting files on S3. Maybe a manual SQL command run from a cron task could take care of it?

But as far as I understand, the DB is growing even with no files changed or added, right? So try to find out which entries get added to the DB to get a clue about what’s happening.
Which NC version are you using?

OK, I found this bug report, which seems quite similar: File cache table excessively large (and does not shrink after data removal) / Nextcloud should defragment file cache table · Issue #7312 · nextcloud/server · GitHub

But as far as I understand, the DB is growing even with no files changed or added, right? So try to find out which entries get added to the DB to get a clue about what’s happening.

The DB is indeed growing, but I’m not sure what to query in MySQL to see the content of the table.

Which NC version are you using?

25.0.3 aka latest stable.

You could do something like

SELECT * FROM oc_filecache ORDER BY fileid DESC LIMIT 100

and see if you get a clue whether the same file(s) are added over and over again.
Also interesting in your case is the system tag mapping table. Did you add something that automatically tags files?

Thanks, much appreciated.

Did you add something that automatically tags files?

I have a file drop and a workflow that adds a “client” tag to files dropped by people who don’t have an account on the NC instance. Then a retention rule automatically deletes files tagged with “client” 365 days after upload. But I have like 8 of those files so far, so it doesn’t make sense to have such a bloated table.

The oc_filecache table seems to be mostly populated with file previews and file versions, notably .obsidian/workspace.json, which is the config file of Obsidian. There are quite a lot of versions of this one.

So maybe we can figure something out this way:

SELECT COUNT(*) as "count", path FROM oc_filecache GROUP BY path ORDER BY count DESC LIMIT 100

I have found that I have hundreds of thousands of DB entries for previews in nextcloud/data/appdata_xxxx/preview/*, but the folder doesn’t actually exist on disk (not sure where the actual previews I see in the UI are stored, then…). At some point I had the Preview Generator app installed; that might be related. For now, I’m running:

DELETE FROM oc_filecache 
WHERE path LIKE "appdata_xxxxxx/preview/%" 
ORDER BY fileid DESC LIMIT 10000;

Making progress but probably not in the right direction.

So, I had a total of 2.652 million preview entries in the oc_filecache table. Since I didn’t find them on the server disk, I deleted them, resulting in a DB weighing 992 MB instead of 3.6 GB.

But I found out later that these previews, not found on the server disk, are actually hosted on the S3 bucket. So cleaning up the DB just orphaned them, and since S3 is not a filesystem, there is no way to delete them now because the link is lost.

I have to restore my DB backup. Back to square one and 3.8 GB.

Ok, I think I got the hang of this. Let’s write it down so it can help others.

Intro: S3 buckets as primary storage for Nextcloud

Simple Storage Service (S3) comes from Amazon Web Services (but is now offered by a variety of other providers) and is used to host massive amounts of files, with the ability to select different access tiers for scalability and cost management. It can easily be encrypted server-side, replicated in the cloud, and versioned for safety, and switching between redundant S3 buckets is as easy as changing an ID/secret string in config.php.

All in all, they are pretty great because they enable cheap data hosting while keeping the data away from the front-end server (hosting Linux/Apache/Nextcloud), where SSD space is more costly. This allows me to keep a 30 GB volume for the server part, which gets fully backed up once a day as a snapshot (OS, programs, Apache and Nextcloud config, and database at once), and these snapshots can be used to spawn a new compute instance or to roll back the running instance in a matter of seconds.

But…

S3 buckets are not filesystems. And that’s where it gets messy with Nextcloud.

Using sudo -u www-data php /var/www/nextcloud/occ files:cleanup or sudo -u www-data php /var/www/nextcloud/occ files:scan-app-data has no effect on external containers (that is, anything not hosted under the local /var/www/nextcloud). Meaning:

  1. there is no garbage collection whatsoever on S3 buckets:
    • file versions are supposed to be kept as long as the user’s used storage space is below 50% of their quota. With S3 buckets, since you pay for what you use, you might be tempted to leave the quota unlimited, but then versions are kept forever.
    • Deleting files removes neither the database entries for their thumbnails nor the actual thumbnail files on the S3 bucket (issue reported in 2020).
  2. you can’t manage the garbage collection yourself because:
    • as mentioned above, S3 is not a filesystem: it’s a collection of files named like urn:oid:xxxx, where xxxx is an incremental number known only to your Nextcloud database.
    • you can’t empty your DB manually either, because the only link between the files living on your devices and their S3 object IDs is in the Nextcloud database and would be lost forever.
    • the oc_filecache database table references the thumbnail paths as nextcloud/data/appdata_xxxx/preview/*, which is super misleading because this folder exists neither in the local /var/www/nextcloud directory nor on the S3 bucket, while the thumbnails are very much hosted on the S3 bucket, as the storage column of the oc_filecache table shows (see the query after this list).
  3. emptying the trashbin doesn’t even work on S3 buckets: issue reported in 2020.
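
To see that last point for yourself, you can join the file cache with the storages table and check which backend the preview entries actually point to (a quick sketch; storage IDs and names will differ on your setup):

SELECT fc.fileid, fc.path, st.id AS storage_backend
FROM oc_filecache fc
JOIN oc_storages st ON st.numeric_id = fc.storage
WHERE fc.path LIKE 'appdata_%/preview/%'
LIMIT 10;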

Solving my problem

Problem 1: unoptimized DB

For some reason, Nextcloud doesn’t routinely optimize its database. After running sudo mysqlcheck -o --all-databases, my DB shrank from 3.8 GB to 2.5 GB, but it took ages.

To run it every day at 1 am, add a cron job as the root user: run sudo crontab -e and add the following line: 0 1 * * * mysqlcheck -o --all-databases.

Problem 2: no garbage collection after deleting apps (plugins)

I found tables for apps I had removed a couple of months ago, like Duplicate Finder and Nextcloud Mail, even though I had deleted all my accounts in Nextcloud Mail before removing the app. Running DROP TABLE oc_useless_table; on all the leftover tables recovered 20 MB.
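
For example, this is roughly the approach for the two apps above (the table names are the ones from my size listing earlier; double-check against a backup before dropping anything):

-- Spot leftover tables by their app prefix
SHOW TABLES LIKE 'oc_duplicatefinder%';
SHOW TABLES LIKE 'oc_mail%';

-- Then drop the ones confirmed unused, e.g.
DROP TABLE oc_duplicatefinder_finfo;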

Problem 3: no garbage collection for removed external storages

I added my website’s server as FTP external storage to see if that was useful, but it was slower than mounting FTP and DAV directly in my OS file browser, so I removed it. But all the indexed files and their versions were still in the DB.

  1. List known storages: SELECT * from oc_storages;
+------------+------------------------------------+-----------+--------------+
| numeric_id | id                                 | available | last_checked |
+------------+------------------------------------+-----------+--------------+
|          4 | object::store:amazon::xxxx         |         1 |         NULL |
|          6 | object::user:xxx                   |         1 |         NULL |
|          7 | local::/var/www/nextcloud/data/    |         1 |         NULL |
|          8 | home::xxx                          |         1 |         NULL |
|         10 | home::xxx                          |         1 |         NULL |
|         12 | ftp::user@server.com/              |         1 |   1675297283 |
+------------+------------------------------------+-----------+--------------+

Note that removed storages can still have a 1 in the available column, so I don’t know what that column actually tracks. Find the numeric_id of the storage you don’t use anymore; here it’s 12.

  2. Clear the file cache entries from this removed storage: DELETE FROM oc_filecache WHERE storage=12;

  3. Run an SQL optimize on oc_filecache: OPTIMIZE TABLE oc_filecache;

Problem 4: tags are poorly implemented

I use only 1 tag, on 6 files out of the 2.6 million files hosted on my Nextcloud instance. The only reason I’m even using that tag is for file retention rules.

This seems to be enough to fill the oc_systemtag_object_mapping table with 2.6 million entries, weighing close to a GB for no reason.
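
Before dropping anything, you can check how the mappings are distributed per tag (a sketch; for file tags the objecttype column is normally 'files'):

SELECT systemtagid, objecttype, COUNT(*) AS mappings
FROM oc_systemtag_object_mapping
GROUP BY systemtagid, objecttype
ORDER BY mappings DESC;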

I removed the file retention rule (I will manage that by hand), the file retention app, and the collaborative tags app, and deleted the whole offending table: DROP TABLE oc_systemtag_object_mapping;.

Nextcloud will either re-create the table or prevent it from being deleted, so we need to optimize it again after re-creation: OPTIMIZE TABLE oc_systemtag_object_mapping;

Database is now down to 1.6 GB.

Problem 5: versions are kept forever

Because file version garbage collection relies on an account storage quota, it never kicks in here with unlimited quotas.

In /var/www/nextcloud/config/config.php, I added 'versions_retention_obligation' => '14,30', to have versions retained for at least 14 days and at most 30 days, as described in the docs.

Then run sudo -u www-data php /var/www/nextcloud/occ versions:expire. It barely changes the database size, though.
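
To get a rough idea of how many version entries are still sitting in the file cache, something like this can help (a sketch; it assumes versions are cached under files_versions/, which is the usual layout):

SELECT COUNT(*) AS version_entries
FROM oc_filecache
WHERE path LIKE 'files_versions/%';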

Problem 6: orphaned thumbnails are not deleted

This guy wrote a garbage collection script that properly links S3 IDs with files to check for orphaned previews (whose parent file has been removed):

  1. Get it: wget https://raw.githubusercontent.com/charleypaulus/nextcloudtools/main/delete-old-previews.sh
  2. Install rclone: sudo apt install rclone on Ubuntu
  3. Back up your Nextcloud database
  4. Run it: sudo /bin/bash delete-old-previews.sh
  5. Note the number of orphaned thumbnails and exit with Ctrl+D, because the S3 bucket is hardcoded in the script and it will fail.

I have 655 orphaned files in there, so I will not bother adapting the script for now.

Conclusion

I’m not impressed with Nextcloud’s scalability. My instance is 4 months old, has 2 users and 4 connected devices, and it has already gathered a lot of dust. From what I see here, the software can’t scale, and I wouldn’t want to be the sysadmin of an organization where users use the full range of its features.

What worries me is that most of the issues I faced with S3 were reported 2 years ago. They could be solved by a bunch of SQL scripts in the right places. Yet Nextcloud is developing niche features like Talk (when we already have Jitsi and Matrix) or Mastodon-like stuff instead of fixing basic file handling on external storage. And by “basic file handling” I mean deleting DB entries and caches when a file is deleted on a filesystem; I’m not asking for a neutrino vacuum cleaner.

But more importantly, none of these issues is properly documented in the user docs, so everything looks OK at install time and the bad surprises arise later. The documentation says S3 is supported, so you think it’s properly supported, and that’s not the case.

I decreased the size of the DB from 3.8 GB to 1.6 GB (which is still a lot), and brought the CPU consumption of MariaDB from 15% back to 0.5-1%. This took me 3 days and the help of @SysKeeper (thanks a lot).


Hi Aurélien,

Since then, I made this pull request, which will provide a built-in occ command (preview:delete) to delete previews (with options to delete all previews, orphaned previews only, or previews of files of a given mimetype).


Great news!

I’m reopening this one year later because the problem reappeared, badly enough to justify investing resources into a fix.

TL;DR: the S3 bucket grew to 400 GB while Nextcloud was reporting merely 312 GB of use. The database grew to 1.9 GB, of which 1.7 GB was the oc_filecache table alone.

I wrote a Python script to count the S3 objects:

import toml
import boto3

config = toml.load("config.toml")

# Open server sessions
session = boto3.Session(aws_access_key_id = config["s3"]["key"],
                        aws_secret_access_key = config["s3"]["secret"],
                        region_name = config["s3"]["region"])
s3 = session.resource('s3')
mybucket = s3.Bucket(config["s3"]["bucket"])
s3_files = [object.key for object in mybucket.objects.all()]
print(f"Detected {len(s3_files)} files from S3 bucket")

The bucket reported around 127k objects.

But then SELECT * FROM oc_filecache returned 2.7 million rows?!
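
To see what those rows actually were, a breakdown by mimetype helps (a sketch; it joins the numeric mimetype IDs back to their names):

SELECT mt.mimetype, COUNT(*) AS nb_rows
FROM oc_filecache fc
JOIN oc_mimetypes mt ON mt.id = fc.mimetype
GROUP BY mt.mimetype
ORDER BY nb_rows DESC;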

Upon further investigation, I found that the DB had lots of entries for empty folders named like appdata_XXXXXXXXXXX/preview/b/4/f/1, hosted on the S3 bucket. The following query cleared 2545737 rows, reducing the DB size from 1.9 GB to 288 MB:

DELETE FROM `oc_filecache` WHERE `size` = 0 AND `mimetype` = 2 AND `storage` = 4

where:

  • mimetype = 2 targets folders (see with SELECT * from oc_mimetypes),
  • size = 0 targets anything empty (or declared as such),
  • storage = 4 targets the S3 bucket (see yours with SELECT * from oc_storages)
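
If you want to try this on your own instance, it’s probably safer to count the matching rows and eyeball a few of them before deleting (same IDs as above; adapt mimetype and storage to yours):

-- How many rows would the DELETE touch?
SELECT COUNT(*) FROM `oc_filecache` WHERE `size` = 0 AND `mimetype` = 2 AND `storage` = 4;

-- Sanity-check a few of them
SELECT `fileid`, `path` FROM `oc_filecache` WHERE `size` = 0 AND `mimetype` = 2 AND `storage` = 4 LIMIT 20;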

After this, I tried to match DB vs. bucket IDs to find out what was actually on the S3 bucket. First, using SELECT fileid FROM oc_filecache WHERE mimetype > 2 (discarding folders and the httpd mimetype, whatever that is):

Detected 125037 files from S3 bucket
Detected 124649 files from Nextcloud database
Detected 1358 orphaned files from S3 bucket (1.09 %) [no id match in DB]
Detected 970 orphaned files from DB (0.78 %) [no id match in S3]

Then again, discarding only folders (SELECT fileid FROM oc_filecache WHERE mimetype != 2):

Detected 125037 files from S3 bucket
Detected 124649 files from Nextcloud database
Detected 1358 orphaned files from S3 bucket (1.09 %) [no id match in DB]
Detected 970 orphaned files from DB (0.78 %) [no id match in S3]

Then again, not discarding anything (SELECT fileid FROM oc_filecache):

Detected 125037 files from S3 bucket
Detected 188312 files from Nextcloud database
Detected 1186 orphaned files from S3 bucket (0.95 %)  [no id match in DB]
Detected 64461 orphaned files from DB (34.23 %) [no id match in S3]

So it would seem that:

  • the S3 bucket apparently stores 174 folders out of 63663 as objects ??? Whatever there is to store in a folder object… and why some but not all ???
  • some folders marked empty in the DB, and therefore removed, left orphaned S3 objects behind, meaning they were probably either not empty or not folders.

It’s getting more and more confusing.

Tried again on a backup. The following Python script:

import toml
import mysql.connector
import boto3

config = toml.load("config.toml")

# Open server sessions
session = boto3.Session(aws_access_key_id = config["s3"]["key"],
                        aws_secret_access_key = config["s3"]["secret"],
                        region_name = config["s3"]["region"])
s3 = session.resource('s3')
mybucket = s3.Bucket(config["s3"]["bucket"])

mydb = mysql.connector.connect(
  host = config["mysql"]["dbhost"],
  user = config["mysql"]["dbuser"],
  password = config["mysql"]["dbpassword"],
  database = config["mysql"]["dbname"]
)

def sql_query(query, db):
  # Run a query and return the first column of each row as a set
  db.reconnect()
  cursor = db.cursor()
  cursor.execute(query)
  return {row[0] for row in cursor.fetchall()}

# List previews (id, names, and parent image id) attached to a full image that doesn't exist anymore
# From https://github.com/charleypaulus/nextcloudtools/blob/main/delete-old-previews.sh
previews = sql_query("select tP.fileid, tP.name, tP.fullImgId from (select fileid, name, convert(substring_index(substring_index(path, '/', -2), '/', 1), UNSIGNED) as fullImgId from oc_filecache where path like '%/preview/%.jpg') as tP left join (select fileid from oc_filecache) as tA on tP.fullImgId = tA.fileid where tA.fileid is NULL order by tP.fullImgId", mydb)
print(f"Detected {len(previews)} orphaned file previews from Nextcloud database")

s3_files = {int(object.key.split(":")[-1]) for object in mybucket.objects.all()}
files_n_folders = sql_query("select fileid from oc_filecache", mydb)
just_files = sql_query("select fileid from oc_filecache where mimetype > 2", mydb)
just_folders = files_n_folders.difference(just_files)
empty_folders = sql_query("select fileid from oc_filecache where mimetype = 2 and size = 0", mydb)

print(f"Detected {len(s3_files)} files from S3 bucket")
print(f"Detected {len(files_n_folders)} entries (files + folders) oc_filecache")
print(f"Detected {len(just_files)} files from oc_filecache")
print(f"Detected {len(empty_folders)} empty folders from oc_files")

s3_orphans = s3_files.difference(files_n_folders)
print(f"Detected {len(s3_orphans)} orphaned IDs from S3 [no match in oc_filecache]")

db_orphans = just_files.difference(s3_files)
print(f"Detected {len(db_orphans)} orphaned file IDs from DB [no match in S3]")

empty_not_empty = empty_folders.intersection(s3_files)
print(f"Detected {len(empty_not_empty)} empty folders from oc_files that have an S3 match")

yields:

Detected 0 orphaned file previews from Nextcloud database
Detected 125040 files from S3 bucket
Detected 2735513 entries (files + folders) from oc_filecache
Detected 126568 files from oc_filecache
Detected 2547956 empty folders from oc_filecache
Detected 38 orphaned IDs from S3 [no match in oc_filecache]
Detected 2011 orphaned file IDs from DB [no match in S3]
Detected 296 empty folders from oc_filecache that have an S3 ID match

The 296 empty folders (from oc_filecache) that matched an S3 object are, again, appdata_XXXXXXXXXXX/preview/-style folders. Looking at the timestamps, they seem to match background runs of https://apps.nextcloud.com/apps/previewgenerator, which might indicate that app is a bad idea with S3 primary storage. I also have some .git/objects folders in there.