Many integrity failures - how to remove all EXTRA_FILES automatically?

Nextcloud version (eg, 20.0.5): 22.2.3
Operating system and version (eg, Ubuntu 20.04): Debian 9
Apache or nginx version (eg, Apache 2.4.25): Apache 2.4.25
PHP version (eg, 7.4): PHP-FPM 7.4

The issue you are facing:
I see an ‘integrity failure’ notice in my Nextcloud settings: there are about 100 EXTRA_FILES notices and a single INVALID_HASH notice for my .htaccess file. This is probably caused by not running the occ command from within the nextcloud directory.

Is this the first time you’ve seen this error? (Y/N): N
I have seen the same error with previous versions as well, but I did not check if the same files are singled out.

Question:

  1. Is there an occ command (or other) I can use to automatically delete these files? Doing this manually seems quite tedious, and I’m willing to take the chance that this operation will not nuke my installation.
  2. What is the purpose of adding the “raw output” of a PHP print_r() call to the list of files that did not pass the integrity check? Using that output as input for a (PHP) routine to automate the deletion is non-trivial… If the raw output had been shown as a JSON array, I would have understood the hidden message (“we do not want to take responsibility for automating the destruction of your installation, but if you are willing to take that responsibility, here you go”)…
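
To illustrate the point in question 2, here is a minimal sketch of why JSON would be easier to work with. The `$report` array is a made-up sample that only imitates the shape of the integrity report (app, then failure type, then path and hashes); it is not real output from my server.

```php
<?php
// Made-up sample in the shape of the integrity report
// (app => failure type => path => hashes).
$report = [
    'core' => [
        'EXTRA_FILE' => [
            'example.php' => ['expected' => '', 'current' => 'abc123'],
        ],
    ],
];

// print_r() is meant for human eyes; PHP has no built-in function
// that parses this text back into an array.
echo print_r($report, true);

// JSON, by contrast, round-trips with two built-in calls.
$json = json_encode($report, JSON_PRETTY_PRINT);
$decoded = json_decode($json, true);

var_dump($decoded === $report);
```

With JSON, the parsing step would be a single `json_decode()` call instead of a custom parser.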

Never mind: I found a function on php.net that “reverses” the output of print_r() into an array, and added a few loops and checks to automate the removal of ± 500 EXTRA_FILES on my server.

Here it is, in case it is useful to someone else:

<?php
$array = ""; // paste the raw output of the integrity check between the quotes

function print_r_reverse($input) {
        $lines = preg_split('#\r?\n#', trim($input));
        if (trim($lines[ 0 ]) != 'Array' && trim($lines[ 0 ]) != 'stdClass Object') {
            // bottomed out to something that isn't an array or object
            if ($input === '') {
                return null;
            }

            return $input;
        } else {
            // this is an array or object, let's parse it
            $match = array();
            if (preg_match("/(\s{5,})\(/", $lines[ 1 ], $match)) {
                // this is a nested array/recursive call to this function
                // take a set of spaces off the beginning
                $spaces = $match[ 1 ];
                $spaces_length = strlen($spaces);
                $lines_total = count($lines);
                for ($i = 0; $i < $lines_total; $i++) {
                    if (substr($lines[ $i ], 0, $spaces_length) == $spaces) {
                        $lines[ $i ] = substr($lines[ $i ], $spaces_length);
                    }
                }
            }
            $is_object = trim($lines[ 0 ]) == 'stdClass Object';
            array_shift($lines); // Array
            array_shift($lines); // (
            array_pop($lines); // )
            $input = implode("\n", $lines);
            $matches = array();
            // make sure we only match stuff with 4 preceding spaces (stuff for this array and not a nested one)
            preg_match_all("/^\s{4}\[(.+?)\] \=\> /m", $input, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
            $pos = array();
            $previous_key = '';
            $in_length = strlen($input);
            // store the following in $pos:
            // array with key = key of the parsed array's item
            // value = array(start position in $input, end position in $input)
            foreach ($matches as $match) {
                $key = $match[ 1 ][ 0 ];
                $start = $match[ 0 ][ 1 ] + strlen($match[ 0 ][ 0 ]);
                $pos[ $key ] = array($start, $in_length);
                if ($previous_key != '') {
                    $pos[ $previous_key ][ 1 ] = $match[ 0 ][ 1 ] - 1;
                }
                $previous_key = $key;
            }
            $ret = array();
            foreach ($pos as $key => $where) {
                // recursively see if the parsed out value is an array too
                $ret[ $key ] = print_r_reverse(substr($input, $where[ 0 ], $where[ 1 ] - $where[ 0 ]));
            }

            return $is_object ? (object)$ret : $ret;
        }
    }

$arr = print_r_reverse($array);
$list=[];

foreach( $arr as $k => $v)
{
  foreach( $v as $key => $val )
  {
    if( $key !== 'EXTRA_FILE' ) continue;
    foreach( $val as $path => $hash )
    {
      // paths are hard-coded for my server; adjust /var/www/nextcloud to your installation
      $prefix = ($k != 'core') ? '/var/www/nextcloud/apps/'.$k : '/var/www/nextcloud';
      $file = $prefix . '/' . $path;
      if( file_exists($file) ) $list[] = $file;
    }
  }
}

if( count($list) == 0 ) { echo "Nothing to do.\r\n"; die(); }

foreach( $list as $file )
{
  if( unlink($file) )
  {
    echo "File deleted: $file.\r\n";
  }
  else
  {
    echo "Failed to delete: $file.\r\n";
  }
}

?>

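Put this script in a file, make it executable, paste the raw output into the $array variable, and then run the script with PHP as a user that has permission to delete files in your Nextcloud directory. The paths in the script are hard-coded to /var/www/nextcloud, so adjust them if your installation lives elsewhere.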
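
One way to paste the raw output without worrying about quotes or `$` characters inside it is a nowdoc, which PHP treats as a literal string. This is only a sketch: the report below is a made-up sample in the shape of the raw output, and the `RAW` marker is an arbitrary name.

```php
<?php
// Nowdoc: everything up to the closing RAW marker is taken literally,
// so no escaping of quotes or dollar signs is needed.
$array = <<<'RAW'
Array
(
    [core] => Array
        (
            [EXTRA_FILE] => Array
                (
                    [example.php] => Array
                        (
                            [expected] => none
                            [current] => abc123
                        )

                )

        )

)
RAW;

// Quick sanity check before handing the text to the parser.
echo strlen($array) > 0 ? "Raw output loaded\n" : "Nothing pasted yet\n";
```

Replace the sample between the two `RAW` markers with the raw output copied from the settings page.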

Use at your own risk. It worked for me, although I had to re-scan and run it several times before I finally saw the green notice “all checks passed”.
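
If you would rather look before you leap, a dry run is one option. The sketch below is not part of the script I ran: `extra_file_paths()` and the sample `$report` are hypothetical, and it assumes the same layout as above (core files in the root, app files under apps/<app id>). It builds the same paths but only prints them, and it skips any path containing `..`.

```php
<?php
// Hypothetical helper (not part of the original script): collect the
// EXTRA_FILE paths from an already parsed report without deleting anything.
function extra_file_paths(array $report, string $root): array
{
    $paths = [];
    foreach ($report as $app => $failures) {
        if (!is_array($failures) || !isset($failures['EXTRA_FILE'])
            || !is_array($failures['EXTRA_FILE'])) {
            continue; // this app has no extra files listed
        }
        // Same rule as the script: 'core' lives in the root, apps under apps/<id>.
        $prefix = ($app === 'core') ? $root : $root . '/apps/' . $app;
        foreach (array_keys($failures['EXTRA_FILE']) as $relative) {
            $relative = (string)$relative;
            // Skip anything that tries to climb out of the install directory.
            if (strpos($relative, '..') !== false) {
                continue;
            }
            $paths[] = $prefix . '/' . $relative;
        }
    }
    return $paths;
}

// Made-up sample that only imitates the shape of the parsed report.
$report = [
    'core'    => ['EXTRA_FILE' => ['old.php' => ['expected' => '', 'current' => 'abc']]],
    'someapp' => ['EXTRA_FILE' => ['lib/Stale.php' => ['expected' => '', 'current' => 'def']]],
];

foreach (extra_file_paths($report, '/var/www/nextcloud') as $path) {
    echo "Would delete: $path\n";
}
```

Once the printed list looks right, the same list can be fed to the unlink() loop.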