Dirty table reads -> excessive logging - reproducible

Intro

After the last update from 28.0.6 to 29.0.1, I noticed that my logfile had grown quite large: more than 50 MB, which is an odd size for my instance.

I am the only user and rotate the log yearly; so far, it has never been larger than 2.5 MB.

(Trying to) analyze the issue

Using awk, I managed to identify 6 (only six!) lines that made up almost the entire 50 megabytes. Three lines were more than 10 MB long (line length 108739898) and another three were around 5444091 characters long.

This prints the lengths of the longest lines in your log, along with their line numbers:
awk '{print length " - line: " NR}' nextcloud.log | sort -rn | head
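
To take a quick look at one of those lines without opening the whole file, you can print just its first 220 characters. The 1234 below is only a placeholder; replace it with one of the line numbers reported by awk:
sed -n '1234p' nextcloud.log | cut -c1-220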

Timeline and possible cause (Please chime in)

Extract of updater.log

2024-05-31 09:24:23 updater cli is executed
2024-05-31 09:27:31 update of code successful.
2024-05-31 09:27:31 updater run in non-interactive mode - occ upgrade is started
2024-05-31 09:28:01 updater run in non-interactive mode - disabling maintenance mode
2024-05-31 09:28:07 maintenance mode is disabled - return code: 0

So far, so good. We’ve now established the exact times of the upgrade and can correlate them with nextcloud.log.

The first very long line in the logfile was logged at 2024-05-31 09:27:47, so it happens during the upgrade process. Obviously, it wouldn’t make sense to post even a single complete line here (larger than 10 MB, remember?). I’ll go with the first 220 characters of each line.

Extract of nextcloud.log (only the very long lines)

{"reqId":"OdkfrSYEW9oLOZbz1rcO","level":0,"time":"2024-05-31T09:27:47+00:00","remoteAddr":"","user":"--","app":"no app in context","method":"","url":"--","message":"dirty table reads: SELECT `path` FROM `*PREFIX*filecache`
{"reqId":"OdkfrSYEW9oLOZbz1rcO","level":0,"time":"2024-05-31T09:27:47+00:00","remoteAddr":"","user":"--","app":"no app in context","method":"","url":"--","message":"dirty table reads: SELECT `fileid` FROM `*PREFIX*filecache
{"reqId":"OdkfrSYEW9oLOZbz1rcO","level":0,"time":"2024-05-31T09:27:47+00:00","remoteAddr":"","user":"--","app":"no app in context","method":"","url":"--","message":"dirty table reads: SELECT `path` FROM `*PREFIX*filecache`
{"reqId":"OdkfrSYEW9oLOZbz1rcO","level":0,"time":"2024-05-31T09:27:47+00:00","remoteAddr":"","user":"--","app":"no app in context","method":"","url":"--","message":"dirty table reads: SELECT `filecache`.`fileid`, `storage`,
{"reqId":"OdkfrSYEW9oLOZbz1rcO","level":0,"time":"2024-05-31T09:27:47+00:00","remoteAddr":"","user":"--","app":"no app in context","method":"","url":"--","message":"dirty table reads: SELECT `filecache`.`fileid`, `storage`,
{"reqId":"OdkfrSYEW9oLOZbz1rcO","level":0,"time":"2024-05-31T09:27:47+00:00","remoteAddr":"","user":"--","app":"no app in context","method":"","url":"--","message":"dirty table reads: SELECT `filecache`.`fileid`, `storage`,
  • All six lines have the same reqId.
  • All six lines are about “dirty table reads”.

Reproducible

I removed the overly long lines from my logfile; afterwards, it was down to 2.3 MB.
I just updated from 29.0.1 to 29.0.2, and my logfile is now a whopping 70 MB. I checked, and again there were the same very, very long lines.
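
In case anyone wants to do the same cleanup: a simple length filter does the job. The 100000-character threshold is only an example, and writing to a copy first (rather than overwriting the live log) is the safer route:
awk 'length < 100000' nextcloud.log > nextcloud.clean.log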

Conclusion

None so far - this is a work in progress. I’ll update my findings, hoping that someone can shed some light on this. Perhaps others may find this useful.

Switch to → a lower loglevel ← ! Loglevel “0” is the most verbose level.

You can switch to 3; that is enough!
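
For reference, the loglevel can be changed with occ, for example:
occ config:system:set loglevel --value=3 --type=integer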


Much and good luck,
ernolf

Thank you; I keep my loglevel set to 3. Loglevel 0 is set by the updater only during the upgrade, and that’s a good thing IMHO.

And it’s not the issue :slight_smile: - the overly long lines are.

But all the log entries you posted above are exclusively loglevel 0 :thinking:


ernolf

Correct - that’s because it happens during the upgrade.
And, as explained above, the updater sets the loglevel to 0 during its run.

OK. Now even I understood it :face_with_peeking_eye:


Those dirty table reads appear to occur only on the filecache and storage tables.

This is what I would do if I got those messages:

Reindexing all files from scratch:

occ files:scan --all

… and if that does not solve the issue, I would once more give this a try:

occ maintenance:repair --include-expensive

… but I do not have much hope for that one, because all repair jobs are normally executed after an update…


Much and good luck,
ernolf

:joy: no worries, thank you for your input!

I did both, files:scan and maintenance:repair --include-expensive (with the latter not showing anything of interest). I will verify when the next update comes and update my post accordingly.


It happened again. This time the maximum line length in the logfile is 11311844, and the file is 121 MB instead of the usual 70k to 100k.

I would suggest digging into the log line, and if it doesn’t relate to a “good” transaction (e.g. extending the DB schema), please open a GitHub issue with the required details… it is pointless to keep posting “it happened again” as long as there is no news, e.g. more details about what happens and how.

I was advised to try some things and promised to report back. Which I did.

Unfortunately, without seeing the rest of one of those dirty table reads: [...] entries, we’re flying blind. The stack trace is contained in the rest of it.

What does the bulk of the line appear to contain? Is most of the huge line length the SQL query itself? I can’t imagine the stack trace (excluding the SQL query) is actually that long. Maybe you can find the trace and just post it?
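
If jq is available, one way to see which part of such an entry is so large would be something like the following; it prints the length of every top-level JSON field of a given log line, and 1234 is again only a placeholder for the line number of one of the long entries:
sed -n '1234p' nextcloud.log | jq 'to_entries | map({key: .key, len: (.value | tostring | length)})'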


The trace is 121 megabytes, so I can’t post it. I checked, and it is indeed a single trace. I had to change the extension to be able to upload it; it’s a 7z file, not an odt.
trace.7z.odt (880.9 KB)