How to remove files in Talk folder after specified time

Our company uses the Talk app heavily. We are now running low on free disk space, and I found that a large share of the used space is in the Talk folders. These files are usually needed only for a short time, because important documents are archived anyway.

Is there a way to automatically remove the files in the Talk folders of all users one month after their last change?

That is very easy to do with basic Linux tools. A simple script run once a day by cron should do the work:

/usr/local/bin/nc-cleanup-talkdir.sh

#!/bin/bash
#
# Copyright 2023 [ernolf] Raphael Gradenwitz
#                      raphael.gradenwitz(at)googlemail.com
# simple script to clean the Talk dir from files older than
# $TIME_LIMIT days
#

### !!
# Set the time limit for deleting files (in days)
TIME_LIMIT=28  # 4 weeks

### !!
## change this dir to your needs:
ncdir='/var/www/nextcloud'

# error handling
error() { echo "$*" >&2; exit 1; }

# verbose-echo
[ "$1" = "-v" ] && verbose=: || verbose=false
v_echo() { $verbose && echo "$*"; return 0; }

# check if jq exists
# (no backticks inside the message: they would try to execute jq)
if ! command -v jq >/dev/null 2>&1; then
    error "'jq' binary not found. Please install it first."
fi

v_echo "Begin cleaning at `date`"

# pick out ht-user (owner of config.php)
htuid="$(stat -c '%U' "$ncdir/config/config.php")"

# the occ command call is saved under $occ_call,
# adapted to the user and environment
occ_call="php -f $ncdir/occ"
v_echo "  default occ invocation: \"$occ_call\""
if [ "`id -un`" != "$htuid" ]; then
    occ_call="sudo -u $htuid $occ_call"
    v_echo "   - but you're not '$htuid', therefore it must be invoked with sudo"
    v_echo "  effective occ invocation: \"$occ_call\""
fi

scan_dir() {
    $verbose && quiet="" || quiet="--quiet"
    $occ_call files:scan --no-interaction $quiet --path="$1"
}

# basic data
dbname="$($occ_call config:system:get dbname)"
dbprfx="$($occ_call config:system:get dbtableprefix)"
dbuser="$($occ_call config:system:get dbuser)"
dbpw="$($occ_call config:system:get dbpassword)"
datadir="$($occ_call config:system:get datadirectory)"
dbstrg="mysql -u $dbuser -p$dbpw --disable-auto-rehash --default-character-set=utf8mb4 $dbname -srNe"

for uid in $($occ_call user:list --output=json | jq -r 'keys_unsorted[] | @base64'); do
    uid="$(echo "$uid" | base64 --decode)"
    v_echo "Processing User \"$uid\":"

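    # only an explicit "true" counts as guest; empty output (e.g. no guests app) does not
    if [ "$($occ_call guest:list --output=json 2>/dev/null | jq --arg email "$uid" 'map(.email == $email) | any' 2>/dev/null)" = "true" ]; then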
        v_echo " - skipping guest-account"
    else
        selectstrg="SELECT configvalue FROM ${dbprfx}preferences WHERE userid = '$uid' AND appid = 'spreed' AND configkey = 'attachment_folder'"
        talkfolder=$($dbstrg "$selectstrg")
        talkdir="${datadir:=$ncdir/data}/$uid/files${talkfolder:=/Talk}"
        if [ -d "$talkdir" ]; then
            v_echo " - Talk directory found: \"$talkdir\""
            FILES_DELETED=false
            while IFS= read -r file; do
                v_echo " - Processing file: $file"
                rm "$file"
                FILES_DELETED=true
            done < <(find "$talkdir" -maxdepth 1 -type f -mtime +"$TIME_LIMIT")
            if $FILES_DELETED; then
                scan_dir "/$uid/files$talkfolder"
            else
                v_echo " - Nothing changed"
            fi
            unset talkfolder
        else
            v_echo " - No Talk directory, skipping account"
        fi
    fi
done

v_echo "Finished cleaning at `date`"

(Last changed 04.05.2023; 09:05 CEST)
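
To get a feeling for which files the find call in the script selects, it can be tried out in a throwaway directory first. This is only a sketch: it assumes GNU touch and GNU find (for `-d` and `-printf`), and the file names are made up. Note that `-mtime +28` ignores fractional days, so a file has to be at least 29 full days old to match, and `-maxdepth 1` leaves files in subfolders alone.

```shell
#!/bin/bash
# Demo in a temporary directory; nothing in Nextcloud is touched.
TIME_LIMIT=28
demo_dir="$(mktemp -d)"

# Made-up sample files with different modification times
touch -d '40 days ago' "$demo_dir/old-report.pdf"
touch -d '28 days ago' "$demo_dir/edge-case.pdf"
touch "$demo_dir/fresh-notes.txt"
mkdir "$demo_dir/subfolder"
touch -d '40 days ago' "$demo_dir/subfolder/nested-old.txt"

# Same selection as in the script, but listing names instead of deleting
matches="$(find "$demo_dir" -maxdepth 1 -type f -mtime +"$TIME_LIMIT" -printf '%f\n' | sort)"
echo "$matches"   # prints: old-report.pdf

rm -r "$demo_dir"
```

To preview a real Talk folder, the same find line can be run with that folder's path in place of `$demo_dir`; it only lists files and deletes nothing.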

Now make it executable:

chmod +x /usr/local/bin/nc-cleanup-talkdir.sh

It requires jq (apt-get install jq).

You can run the script either as root or as your webserver user.

You can create a cron entry like:

0 3 * * * /usr/local/bin/nc-cleanup-talkdir.sh

Or, if you prefer to use /etc/crontab:

0 3 * * * www-data /usr/local/bin/nc-cleanup-talkdir.sh

(change username accordingly)

If you need a log output, call it this way:

/usr/local/bin/nc-cleanup-talkdir.sh -v >> /path/to/logfile

This should be enough for your IT department to implement a solid solution.
I make no guarantees or warranties of any kind.

Even though I have created this post with the greatest possible care, I know with certainty that I (as usual) made at least small mistakes. If you find any inaccuracies please point them out to me, I will correct them immediately if possible or your comment will be the correction.

Happy hacking


EDIT:

Since the Talk folder is configurable per user, I had to change the script: it now runs an SQL query to check whether an individual attachment folder is configured. As a result, the script now only works on instances with MySQL/MariaDB. Adapting the query call for Postgres is not much effort, but I do not use Postgres, so you would have to make that change yourself.
If your database configuration is more complicated (a separate host and the like), something may need to be adjusted.
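
For Postgres, or for a database on another host, the query call might be adapted roughly like this. It is a sketch that has not been run against a real Postgres instance and is not part of the script above: the helper names `sql_escape`, `talk_folder_sql` and `db_query` are made up for illustration, `dbtype` and `dbhost` are assumed to be read with `config:system:get` like the other values, and a `dbhost` that carries a port or a socket path would need extra handling.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of the original script.
# In the real script these would come from "$occ_call config:system:get <key>";
# the defaults below are placeholders.
dbtype="${dbtype:-pgsql}"
dbhost="${dbhost:-localhost}"
dbname="${dbname:-nextcloud}"
dbuser="${dbuser:-nextcloud}"
dbpw="${dbpw:-secret}"
dbprfx="${dbprfx:-oc_}"

# Double single quotes so a uid like o'brien cannot break the SQL string.
sql_escape() {
    local q="'"
    printf '%s' "${1//$q/$q$q}"
}

# Build the same SELECT the script uses, for a given user id.
talk_folder_sql() {
    printf "SELECT configvalue FROM %spreferences WHERE userid = '%s' AND appid = 'spreed' AND configkey = 'attachment_folder'" \
        "$dbprfx" "$(sql_escape "$1")"
}

# Run a query with the client that matches dbtype; prints the bare value.
db_query() {
    case "$dbtype" in
        mysql)
            mysql -h "$dbhost" -u "$dbuser" -p"$dbpw" \
                --default-character-set=utf8mb4 "$dbname" -srNe "$1" ;;
        pgsql)
            # -A unaligned output, -t tuples only, -X skip ~/.psqlrc
            PGPASSWORD="$dbpw" psql -X -A -t -h "$dbhost" -U "$dbuser" \
                -d "$dbname" -c "$1" ;;
        *)
            echo "unsupported dbtype: $dbtype" >&2
            return 1 ;;
    esac
}

# Show the generated statement; no database is contacted here.
talk_folder_sql "o'brien"; echo
```

Inside the loop, `talkfolder=$(db_query "$(talk_folder_sql "$uid")")` would then take the place of the `$dbstrg` call.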

The values for dbname, dbprfx, dbuser, dbpw and datadir are now auto-detected like this:

dbname="$($occ_call config:system:get dbname)"
dbprfx="$($occ_call config:system:get dbtableprefix)"
dbuser="$($occ_call config:system:get dbuser)"
dbpw="$($occ_call config:system:get dbpassword)"
datadir="$($occ_call config:system:get datadirectory)"

This is extremely slow, because each value requires a separate occ call. If you like, you can hardcode the values in your script to speed it up a little.
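
One way to hardcode the values while keeping auto-detection as a fallback is sketched below. The `HC_*` variables and the `setting` helper are hypothetical names, and the values shown are placeholders, not real defaults.

```shell
#!/bin/bash
# Sketch: HC_* names and values are placeholders invented for this example.
occ_call="${occ_call:-php -f /var/www/nextcloud/occ}"

# Fill in to skip the slow occ lookup; leave empty to auto-detect.
HC_dbname='nextcloud'
HC_dbprfx='oc_'
HC_dbuser='nextcloud'
HC_datadir='/var/www/nextcloud/data'

# setting <hardcoded value> <config key>:
# print the hardcoded value if it is set, otherwise ask occ for it.
setting() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    else
        $occ_call config:system:get "$2"
    fi
}

dbname="$(setting "$HC_dbname" dbname)"
dbprfx="$(setting "$HC_dbprfx" dbtableprefix)"
dbuser="$(setting "$HC_dbuser" dbuser)"
datadir="$(setting "$HC_datadir" datadirectory)"
# dbpw is omitted here; it can be handled the same way.

echo "dbname=$dbname prefix=$dbprfx user=$dbuser datadir=$datadir"
```

With all four values filled in, no occ call is made for them; an empty `HC_*` value falls back to the lookup used in the script.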

Your script assumes that the files are always stored in a folder called “Talk”. However, IIRC the name of the folder is configurable for every user.

Thank you! That is a good one, I did not think that far.

I’ll make an update taking that into account.

Fixed! (updated post above)
Thank you very much for the valuable hint.
