MIME type checker script for files with missing or wrong file extensions, for "Flow external script"

Have you ever encountered the problem of not being able to view image files on Nextcloud due to missing file extensions? I sure did! Whenever I used the Nextcloud Android app to automatically upload my “WhatsApp Media” folder, a large number of image files would end up without file extensions, making them impossible to view. These files had cryptic names that made them difficult to identify, and I only realized they were images after examining them more closely.

I figured that I might not be the only one facing this issue, so I came up with a solution. I wrote a script that checks the MIME types of files within the Nextcloud data directory and verifies their file extensions. If an extension is incorrect or missing, the script corrects or appends it. This way, the files are automatically recognized and renamed, making them visible right from the start. If you are facing a similar issue, my script might just be the solution you need!
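The core idea can be sketched in a few lines of bash. This is only an illustration and not the code of nc-mimecheck: the function names `detect_mime` and `suggest_name` are made up for this example, only two MIME types are covered, and the sketch always appends an extension instead of replacing a wrong one. Detection uses `file -b --mime-type`, one of the two detectors the script supports.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; the function names are hypothetical and
# not taken from nc-mimecheck.

# Detect the MIME type from the file content (requires the `file` utility).
detect_mime() {
    file -b --mime-type -- "$1"
}

# suggest_name <file name> <mimetype>
# Prints the name the file should have. The name is printed unchanged if
# the extension already fits or the MIME type is not in this tiny table.
suggest_name() {
    local name=$1 mime=$2 ext='' regex=''
    case $mime in
        image/jpeg) ext=.jpg; regex='^.*\.(jpe?g)$' ;;
        image/png)  ext=.png; regex='^.*\.(png)$' ;;
    esac
    if [[ -z $ext ]]; then
        printf '%s\n' "$name"           # unknown type: leave the name alone
        return 0
    fi
    shopt -s nocasematch                # compare extensions case-insensitively
    if [[ $name =~ $regex ]]; then
        printf '%s\n' "$name"           # extension already fits
    else
        printf '%s\n' "$name$ext"       # append the expected extension
    fi
    shopt -u nocasematch
}
```

For a JPEG file named `IMG-20240228-WA0001`, calling `suggest_name "IMG-20240228-WA0001" "$(detect_mime IMG-20240228-WA0001)"` prints `IMG-20240228-WA0001.jpg`.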

It integrates with Nextcloud’s Command Line API, occ, and is intended to be used with the “Flow external script” feature.

To use this script with “Flow external script,” create rules in the Nextcloud admin settings:
http[s]://%your-nextcloud%/settings/admin/workflow


Example rules for JPEG and PNG files

Here’s the help output for the script:

  nc-mimecheck - version 2024-02-28 13:33 (latest version)

  Usage:
    nc-mimecheck [ -r ] [ --detector=detector ] [ -d | --dryrun ] [ -v | --verbose ] [ -vv | --debug ] [--] <user_id or path/to/dir/or/file>
    nc-mimecheck [ -h | --help ]
    nc-mimecheck [ -l | --listmimes=(i|v|a|t) ]
    nc-mimecheck integrity_check

       The "path/to/dir/or/file" argument can be either an absolute or
       relative path within the Nextcloud data directory. This script
       only checks files of enabled users and does not yet support group
       folders.

  -> ! Please note: when the path is preceded by other arguments, -- is
  -> ! absolutely necessary to mark the end of the options!

Options:
  -r                         Recursive.

      --detector=detector    Either "mimetype" (default) or "file"
                             `mimetype` uses`mimetype -b $file`
                             `file` uses`file -b --mime-type $file`

  -h, --help                 Prints this help information and exits.
                             All other arguments are ignored when using
                             this option.

  -l, --listmimes="type"     Lists supported MIME types, where
                             "type" can either be "all" or one (or a
                             combination) off: i (image), v (video),
                             a (audio) or t (text)
                             Defaults to all
                             All other arguments are ignored when using
                             this option.

  -d, --dryrun               Shows what the script would do without
                             making any actual changes.

  -q, --quiet                Runs the script in quiet mode with no
                             echoes.

  -v, --verbose              Provides verbose output.

  -vv, --debug               Provides debug output.

  integrity_check            verify the integrity of this script with signature

Examples:
  Process a single file "user/files/path/to/file" in quiet mode:
  nc-mimecheck -q -- "user/files/path/to/file"
  (this is the behaviour if invoked by "Flow external script")

  Process an absolute path "/path2/nextcloud/data/user/files/path":
  nc-mimecheck "/path2/nextcloud/data/user/files/path"

  Process a relative path "user/files/path/to/dir" recursively:
  nc-mimecheck -rd -- "user/files/path/to/dir"

  Analyze a directory recursively with dry run and verbose output:
  nc-mimecheck -vrd -- "/path2/nextcloud/data/user/files/path/to/dir"


  This script checks the MIME types of files within Nextcloud's data
  directory and verifies their file extensions. If the extensions are
  incorrect, it will make the necessary changes or append them. It
  integrates with Nextcloud's Command Line API, `occ`, and is intended
  to be used with the "Flow external script" feature.

  To use this script with "Flow external script," create rules in the
  Nextcloud admin settings:
          http[s]://%your-nextcloud%/settings/admin/workflow

  Example rule for jpeg-Files:
    When [File created]
      and [File MIME type] [is] [Custom mimetype] [image/jpeg]
      and [File name] [does not match] [/^.*\.(jpe?g)$/i]
   >_ Pass files to external scripts for processing outside of Nextcloud
            [/usr/local/bin/nc-mimecheck -q %n]

And here is the output of the -l option, which lists the supported MIME types:

with i for images:

ernolf@mybox:~# nc-mimecheck -l i

 -  supported image mimetypes:
    +---------------------------+-----------+--------------------+
    | Mimetype                  | Extension | Regex              |
    +---------------------------+-----------+--------------------+
    | image/jpeg                | .jpg      | ^.*\.(jpe?g)$      |
    | image/jp2                 | .jp2      | ^.*\.(jp[2x]|j2k)$ |
    | image/png                 | .png      | ^.*\.(png)$        |
    | image/bmp                 | .bmp      | ^.*\.(bmp)$        |
    | image/x-bmp               | .bmp      | ^.*\.(bmp)$        |
    | image/x-ms-bmp            | .bmp      | ^.*\.(bmp)$        |
    | image/gif                 | .gif      | ^.*\.(gif)$        |
    | image/tiff                | .tif      | ^.*\.(tiff?)$      |
    | image/tiff-fx             | .tif      | ^.*\.(tiff?)$      |
    | image/svg+xml             | .svg      | ^.*\.(svg)$        |
    | image/x-xcf               | .xcf      | ^.*\.(xcf)$        |
    | image/x-icon              | .ico      | ^.*\.(ico)$        |
    | image/vnd.microsoft.icon  | .ico      | ^.*\.(ico)$        |
    | image/x-icns              | .icns     | ^.*\.(icns)$       |
    | image/webp                | .webp     | ^.*\.(webp)$       |
    | image/vnd.adobe.photoshop | .psd      | ^.*\.(psd)$        |
    | image/x-photoshop         | .psd      | ^.*\.(psd)$        |
    | image/vnd.djvu            | .djvu     | ^.*\.(djvu)$       |
    | image/vnd.djvu+multipage  | .djvu     | ^.*\.(djvu)$       |
    | image/x-djvu              | .djvu     | ^.*\.(djvu)$       |
    | image/x-canon-cr2         | .cr2      | ^.*\.(cr2)$        |
    | image/x-canon-crw         | .crw      | ^.*\.(crw)$        |
    | image/x-fuji-raf          | .raf      | ^.*\.(raf)$        |
    | image/x-kodak-dcr         | .dcr      | ^.*\.(dcr)$        |
    | image/x-kodak-k25         | .k25      | ^.*\.(k25)$        |
    | image/x-kodak-kdc         | .kdc      | ^.*\.(kdc)$        |
    | image/x-minolta-mrw       | .mrw      | ^.*\.(mrw)$        |
    | image/x-nikon-nef         | .nef      | ^.*\.(nef)$        |
    | image/x-nikon-nrw         | .nrw      | ^.*\.(nrw)$        |
    | image/x-olympus-orf       | .orf      | ^.*\.(orf)$        |
    | image/x-panasonic-raw     | .raw      | ^.*\.(raw)$        |
    | image/x-pentax-pef        | .pef      | ^.*\.(pef)$        |
    | image/x-samsung-srw       | .srw      | ^.*\.(srw)$        |
    | image/x-sony-arw          | .arw      | ^.*\.(arw)$        |
    | image/x-sony-sr2          | .sr2      | ^.*\.(sr2)$        |
    | image/x-sony-srf          | .srf      | ^.*\.(srf)$        |
    | image/heic                | .heic     | ^.*\.(hei[cf])$    |
    | image/heic-sequence       | .heic     | ^.*\.(hei[cf])$    |
    | image/heif                | .heic     | ^.*\.(hei[cf])$    |
    +---------------------------+-----------+--------------------+

with v for video:

ernolf@mybox:~# nc-mimecheck -l v

 -  supported video mimetypes:
    +----------------------------------+-----------+---------------------+
    | Mimetype                         | Extension | Regex               |
    +----------------------------------+-----------+---------------------+
    | video/3gpp                       | .3gp      | ^.*\.(3gp)$         |
    | video/3gpp2                      | .3g2      | ^.*\.(3g2)$         |
    | video/h261                       | .h261     | ^.*\.(h261)$        |
    | video/h263                       | .h263     | ^.*\.(h263)$        |
    | video/h264                       | .h264     | ^.*\.(h264)$        |
    | video/jpeg                       | .jpgv     | ^.*\.(jpgv)$        |
    | video/jpm                        | .jpm      | ^.*\.(jpg?m)$       |
    | video/mj2                        | .mj2      | ^.*\.(mjp?2)$       |
    | video/mp2t                       | .ts       | ^.*\.(m?ts)$        |
    | video/mp4                        | .mp4      | ^.*\.(mp4v?|m4v)$   |
    | video/mpeg                       | .mpg      | ^.*\.(mpe?g?|vob)$  |
    | video/ogg                        | .ogv      | ^.*\.(ogv)$         |
    | video/quicktime                  | .mov      | ^.*\.(mov)$         |
    | video/vnd.dvb.file               | .dvb      | ^.*\.(dvb)$         |
    | video/vnd.fvt                    | .fvt      | ^.*\.(fvt)$         |
    | video/vnd.mpegurl                | .mxu      | ^.*\.(m[x4]u)$      |
    | video/vnd.ms-playready.media.pyv | .pyv      | ^.*\.(pyv)$         |
    | video/webm                       | .webm     | ^.*\.(webm)$        |
    | video/x-f4v                      | .f4v      | ^.*\.(f4v)$         |
    | video/x-fli                      | .fli      | ^.*\.(fli)$         |
    | video/x-flv                      | .flv      | ^.*\.(flv|f4f)$     |
    | video/x-m4v                      | .m4v      | ^.*\.(m4v|mp4)$     |
    | video/x-matroska                 | .mkv      | ^.*\.(mk[vs]|mk3d)$ |
    | video/x-mng                      | .mng      | ^.*\.(mng)$         |
    | video/x-ms-asf                   | .asf      | ^.*\.(as[fx])$      |
    | video/x-ms-vob                   | .vob      | ^.*\.(vob)$         |
    | video/x-ms-wm                    | .wm       | ^.*\.(wm)$          |
    | video/x-ms-wmv                   | .wmv      | ^.*\.(wmv)$         |
    | video/x-ms-wmx                   | .wmx      | ^.*\.(wmx)$         |
    | video/x-ms-wvx                   | .wvx      | ^.*\.(wvx)$         |
    | video/x-msvideo                  | .avi      | ^.*\.(avi)$         |
    | video/x-sgi-movie                | .movie    | ^.*\.(movie)$       |
    +----------------------------------+-----------+---------------------+

with a for audio:

ernolf@mybox:~# nc-mimecheck -l a

 -  supported audio mimetypes:
    +--------------------+-----------+---------------------+
    | Mimetype           | Extension | Regex               |
    +--------------------+-----------+---------------------+
    | audio/aac          | .aac      | ^.*\.(aac)$         |
    | audio/aiff         | .aif      | ^.*\.(aiff?)$       |
    | audio/alac         | .m4a      | ^.*\.(m4a)$         |
    | audio/amr          | .amr      | ^.*\.(amr)$         |
    | audio/basic        | .au       | ^.*\.(au|snd)$      |
    | audio/flac         | .flac     | ^.*\.(flac)$        |
    | audio/mid          | .mid      | ^.*\.(mid|i[0-9]l)$ |
    | audio/midi         | .mid      | ^.*\.(mid|i[0-9]l)$ |
    | audio/mp3          | .mp3      | ^.*\.(mp3)$         |
    | audio/mp4          | .m4a      | ^.*\.(m4a)$         |
    | audio/mpeg         | .mp3      | ^.*\.(mp3)$         |
    | audio/ogg          | .ogg      | ^.*\.(og[ga])$      |
    | audio/s3m          | .s3m      | ^.*\.(s3m)$         |
    | audio/silk         | .sil      | ^.*\.(sil)$         |
    | audio/vnd.wave     | .wav      | ^.*\.(wave?)$       |
    | audio/webm         | .weba     | ^.*\.(weba)$        |
    | audio/wav          | .wav      | ^.*\.(wave?)$       |
    | audio/x-aac        | .aac      | ^.*\.(aac)$         |
    | audio/x-aiff       | .aif      | ^.*\.(aiff?)$       |
    | audio/x-flac       | .flac     | ^.*\.(flac)$        |
    | audio/x-m4a        | .m4a      | ^.*\.(m4a)$         |
    | audio/x-mid        | .mid      | ^.*\.(mid|i[0-9]l)$ |
    | audio/x-midi       | .mid      | ^.*\.(mid|i[0-9]l)$ |
    | audio/x-mod        | .mod      | ^.*\.(mod)$         |
    | audio/x-mp3        | .mp3      | ^.*\.(mp3)$         |
    | audio/x-mp4        | .m4a      | ^.*\.(m4a)$         |
    | audio/x-mpeg       | .mp3      | ^.*\.(mp3)$         |
    | audio/x-ms-wma     | .wma      | ^.*\.(wma)$         |
    | audio/x-musepack   | .mpc      | ^.*\.(mpc)$         |
    | audio/x-opus+ogg   | .opus     | ^.*\.(opus|og[ga])$ |
    | audio/x-s3m        | .s3m      | ^.*\.(s3m)$         |
    | audio/x-scpls      | .pls      | ^.*\.(pls)$         |
    | audio/x-vorbis     | .ogg      | ^.*\.(og[ga])$      |
    | audio/x-vorbis+ogg | .ogg      | ^.*\.(og[ga])$      |
    | audio/x-wav        | .wav      | ^.*\.(wave?)$       |
    | audio/x-xm         | .xm       | ^.*\.(xm)$          |
    +--------------------+-----------+---------------------+

and with t for text:

ernolf@mybox:~# nc-mimecheck -l t

 -  supported text mimetypes:
    +-----------------------+-----------+---------------------+
    | Mimetype              | Extension | Regex               |
    +-----------------------+-----------+---------------------+
    | text/html             | .html     | ^.*\.(html?)$       |
    | text/css              | .css      | ^.*\.(css)$         |
    | text/javascript       | .js       | ^.*\.(js)$          |
    | text/json             | .json     | ^.*\.(json)$        |
    | application/json      | .json     | ^.*\.(json)$        |
    | text/xml              | .xml      | ^.*\.(xml|aup)$     |
    | application/xml       | .xml      | ^.*\.(xml|aup|mpd)$ |
    | text/xhtml+xml        | .xhtml    | ^.*\.(xhtml?)$      |
    | application/xhtml+xml | .xhtml    | ^.*\.(xhtml?)$      |
    | application/pdf       | .pdf      | ^.*\.(pdf)$         |
    +-----------------------+-----------+---------------------+

The script allows you to add or remove as many MIME types as desired, which are stored in arrays within the script.
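As a sketch of how such a table can be kept in bash arrays: the array names `mime_ext` and `mime_regex` and the helpers `add_mime` and `list_mimes` are assumptions for illustration, so the real script's internals may look different. The first four entries are taken from the `-l` output above; `image/avif` is a made-up addition that is not in the script's list.

```bash
#!/usr/bin/env bash
# Hypothetical layout; array and function names are not from nc-mimecheck.
# Requires bash 4 or newer for associative arrays.
declare -A mime_ext mime_regex

# add_mime <mimetype> <extension> <regex>
add_mime() {
    mime_ext[$1]=$2
    mime_regex[$1]=$3
}

add_mime image/jpeg  .jpg  '^.*\.(jpe?g)$'
add_mime image/png   .png  '^.*\.(png)$'
add_mime video/mp4   .mp4  '^.*\.(mp4v?|m4v)$'
add_mime audio/mpeg  .mp3  '^.*\.(mp3)$'

# Adding a type is one more line (example entry, not in the script's list):
add_mime image/avif  .avif '^.*\.(avif)$'

# Removing a type is an unset on both arrays:
unset 'mime_ext[audio/mpeg]' 'mime_regex[audio/mpeg]'

# list_mimes <class>: print all known types of one class, e.g. "image".
list_mimes() {
    local class=$1 mime
    for mime in "${!mime_ext[@]}"; do
        [[ $mime == "$class"/* ]] &&
            printf '%s %s %s\n' "$mime" "${mime_ext[$mime]}" "${mime_regex[$mime]}"
    done | sort
}
```

Keeping the extension and the regex in two arrays with the same key means that one MIME type can be added or dropped without touching the rest of the logic.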

The script can be executed as either root or the nextcloud user, as it automatically detects the correct user name and gathers all required information.

The script has been designed to reject paths outside of the Nextcloud data directory. The appdata_<instanceid> and updater-<instanceid> directories are blacklisted. Only files and folders inside of the <user_id>/files/* folder of enabled users are allowed to be processed, and all others are rejected.
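These rules can be sketched as a small check on the path relative to the data directory. This is not the script's actual implementation: the function name `is_allowed_relpath` is hypothetical, and the check whether the user is enabled (which needs `occ`) is left out.

```bash
#!/usr/bin/env bash
# Sketch of the path rules described above. Hypothetical helper, not from
# nc-mimecheck; the "enabled user" check is deliberately omitted.

# is_allowed_relpath <path relative to the Nextcloud data directory>
# Returns 0 if the path is <user_id>/files or lies below it, 1 otherwise.
is_allowed_relpath() {
    local rel=$1 rest
    case "/$rel/" in
        */../*|*/./*) return 1 ;;          # no "." or ".." components
    esac
    case $rel in
        appdata_*|updater-*) return 1 ;;   # blacklisted directories
        /*)                  return 1 ;;   # must be relative at this point
        */files|*/files/*)   ;;            # candidate, checked below
        *)                   return 1 ;;
    esac
    # "files" must be the second path component: <user_id>/files[/...]
    rest=${rel#*/}
    [[ $rest == files || $rest == files/* ]]
}
```

With this sketch, `tom/files/Photos/IMG-0001` is accepted, while `tom/files_trashbin/files/x`, `appdata_<instanceid>/preview/1` and anything containing `..` are rejected.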

It is not intended for interactive use and includes a dry run option to show what changes the script would make without actually modifying the files.

It is highly recommended to analyze the changes the script would make before running a recursive -r treatment, using the --verbose and --dryrun options. This will help ensure that the script does not rename files that you do not want to rename. Once you have confirmed that everything looks fine with --dryrun, you can run the script without -d to apply the changes.
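The dry-run idea itself is a simple pattern. A minimal sketch, assuming a hypothetical `rename_file` helper and a `dryrun` variable (neither is taken from the script): with dry run enabled, the planned action is only printed and nothing on disk changes.

```bash
#!/usr/bin/env bash
# Sketch of a dry-run guard; names are illustrative, not from nc-mimecheck.
dryrun=true

# rename_file <old> <new>: rename the file, or only report it if dryrun=true
rename_file() {
    local old=$1 new=$2
    if [[ $dryrun == true ]]; then
        printf 'DRYRUN: would rename "%s" -> "%s"\n' "$old" "$new"
    else
        mv -n -- "$old" "$new"    # -n: never overwrite an existing file
    fi
}
```

Setting `dryrun=false` makes the same call perform the rename. In a real Nextcloud installation, a rename on the filesystem also has to be made known to Nextcloud, which is presumably where the `occ` integration comes in.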

This is how to install the script:

sudo wget -qO /usr/local/bin/nc-mimecheck https://global-social.net/script/nc-mimecheck
sudo chmod +x /usr/local/bin/nc-mimecheck

This would be a good first run, to get a basic understanding of the script, without making any changes at all to your files. Assuming you want to scan the files of user “tom”:

nc-mimecheck -vd tom

and to scan it recursively:

nc-mimecheck -vrd tom

Good luck!

continues to be maintained
(last updated: 2024.02.28 13:44)


Would this work to add a MIME type for .ics files in the Mail app? Right now, when a calendar invite is sent from Office 365 to the Nextcloud Mail app, one cannot automatically accept the calendar invite and add it to their Nextcloud calendar. In fact, the Nextcloud Mail app cannot even “see” the .ics file.


I am sorry, but that looks to me like a use case that is not covered by the aims of this script.

The files must be located on the filesystem, inside a user’s file structure. As far as I know, the Mail app makes no use of it. I am actually not using the Mail app, but I did use it once (only for test purposes), and it basically is an IMAP client, so you will have to use other means to check the attachments on your mail server.



Thanks for the work!

Is it possible to modify it for Nextcloud instances running in a container? Since NC_Dir is set, I am running into this error:

I can not detect a Nextcloud installation

Thank you for your feedback.
To your question:

Personally, I have nothing to do with containers and think very little of them. I don’t have any test environments (so far) in which I can develop this.
But I will put it on my to-do list, and as soon as I find the time, I will develop a solution, since it will then work for all of my scripts.
But I can say right away that it will take at least a few months.

ernolf



Fine job, ernolf, thank you! But in my case, the script doesn’t work at all because Nextcloud is installed with Simple Storage Service (S3) as primary storage:

svalx@srv:~> sudo /usr/local/bin/nc-mimecheck -vdr svalx
VERBOSE: all available options after parsing arguments:
VERBOSE:  - detector="mimetype -b"
VERBOSE:  - listmimes=false
VERBOSE:  - recursive=true
VERBOSE:  - dryrun=true
VERBOSE:  - verbose=true
VERBOSE:  - debug=false
VERBOSE:  - quiet=false
VERBOSE:  - bak=false
VERBOSE:  - enabled_user=false
VERBOSE:  - basepath="svalx"
VERBOSE:  - the nextcloud data directory is "/srv/nextcloud_data"
VERBOSE:  - relative basepath now is: "svalx/svalx"

OPTION: 1: svalx does not exist
           path must look like: (/srv/nextcloud_data/)%USER%/files/path..
try /usr/local/bin/nc-mimecheck -h for help

Can you adapt the script for this scenario?

Hi svalx,

thank you for your feedback.

The script requires the files it scans to be on a local hard drive. Mount tools like s3fs cannot do anything about this.
When an object store is used as primary storage by Nextcloud, it requires exclusive access over the bucket being used. All metadata (filenames, directory structures, etc) is stored in Nextcloud and not in the object store. The metadata is only stored in the database and the object store only holds the file content by unique identifier.
For all these reasons, I assume that it is not possible to access the files at the file system level.
Since I would not even consider hanging a burden like S3 on my neck, which only brings disadvantages, you have to accept that you are totally limited with S3.


Good luck,
ernolf