Post-Mortem Analysis: Composer Authoritative Classloader

We faced some critical update issues with Calendar over the last few days. This is an analysis of what went wrong and how we could fix the situation.

How it all started

I noticed Blackfire.io warnings about too many autoloader filesystem checks a few months ago. Nextcloud, all shipped apps and the majority of optional apps come with a Composer autoloader for PHP classes. Put simply, the autoloader is a function that returns the class file name for a given class name. The many autoloaders active in Nextcloud are used like a chain. When a class is to be loaded, class loaders are called in order until the first one can resolve the class. This works but has a performance penalty if all autoloaders have to check if the file exists in the filesystem.
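The chain behaviour can be illustrated with a small model. This is only a sketch in Python, not Composer's or Nextcloud's actual PHP code. The directory layout, the function names and the `fs_checks` counter are assumptions made for illustration, and the model simplifies matters by letting every loader probe its own directory.

```python
from typing import Callable, Dict, List, Optional, Set

# A loader takes a class name and returns a file path, or None if it cannot resolve it.
Loader = Callable[[str], Optional[str]]


def make_probing_loader(base_dir: str, files_on_disk: Set[str], stats: Dict[str, int]) -> Loader:
    """Build a loader that always probes the (simulated) filesystem."""
    def find_file(class_name: str) -> Optional[str]:
        candidate = base_dir + "/" + class_name.replace("\\", "/") + ".php"
        stats["fs_checks"] += 1  # stands in for a real file-exists check
        return candidate if candidate in files_on_disk else None
    return find_file


def resolve(chain: List[Loader], class_name: str) -> Optional[str]:
    """Ask each loader in order and stop at the first one that resolves the class."""
    for loader in chain:
        path = loader(class_name)
        if path is not None:
            return path
    return None


# Hypothetical layout: only the Calendar loader can find the class.
files_on_disk = {"apps/calendar/lib/OCA/Calendar/AppInfo/Application.php"}
stats = {"fs_checks": 0}
chain = [
    make_probing_loader("lib", files_on_disk, stats),
    make_probing_loader("apps/contacts/lib", files_on_disk, stats),
    make_probing_loader("apps/calendar/lib", files_on_disk, stats),
]
resolved = resolve(chain, "OCA\\Calendar\\AppInfo\\Application")
print(resolved, stats["fs_checks"])
```

In this model the class is found by the third loader, after the two loaders before it have each touched the filesystem in vain. With dozens of apps in the chain, those wasted checks add up.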

Autoloader optimization

Composer autoloaders can be optimized. In its simplest form, the optimized autoloader contains a map of class names to file names. When the autoloader is asked about a known class, it can return a positive result without checking the filesystem. Only unknown classes need a filesystem check, because a class that is missing from the map may or may not be loadable.

A more advanced and restrictive optimization level is the authoritative autoloader. It has a class map, too, but does not load unknown classes. This means it can return both positive and negative results without a filesystem check.
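The difference between the two optimization levels can be sketched as follows. This is a simplified Python model, not Composer's implementation. The class `ModelLoader`, the `fs_checks` counter and the example class names are hypothetical. In Composer itself, the two levels correspond to `composer dump-autoload --optimize` and `composer dump-autoload --classmap-authoritative`.

```python
from typing import Dict, Optional, Set


class ModelLoader:
    """Toy autoloader with a class map and an optional authoritative mode."""

    def __init__(self, class_map: Dict[str, str], files_on_disk: Set[str],
                 authoritative: bool = False) -> None:
        self.class_map = class_map
        self.files_on_disk = files_on_disk
        self.authoritative = authoritative
        self.fs_checks = 0

    def find_file(self, class_name: str) -> Optional[str]:
        if class_name in self.class_map:
            # Known class: positive answer without touching the filesystem.
            return self.class_map[class_name]
        if self.authoritative:
            # Unknown class: negative answer without touching the filesystem.
            return None
        # Unknown class in non-authoritative mode: fall back to a filesystem check.
        self.fs_checks += 1
        candidate = "lib/" + class_name.replace("\\", "/") + ".php"
        return candidate if candidate in self.files_on_disk else None


class_map = {"App\\Known": "lib/App/Known.php"}
# NewClass exists on disk but was added after the class map was generated.
files_on_disk = {"lib/App/Known.php", "lib/App/NewClass.php"}

optimized = ModelLoader(class_map, files_on_disk)
authoritative = ModelLoader(class_map, files_on_disk, authoritative=True)

found_by_optimized = optimized.find_file("App\\NewClass")
found_by_authoritative = authoritative.find_file("App\\NewClass")
print(found_by_optimized, optimized.fs_checks)
print(found_by_authoritative, authoritative.fs_checks)
```

The authoritative loader never touches the filesystem, which is what makes it fast, but it also refuses any class that is not in its map, even when the file is present on disk.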

Shipped apps

For that reason Nextcloud and the shipped apps have used authoritative classloaders for a while. When a class is loaded, many autoloaders are still asked to load a file but they don’t go to the filesystem. When Nextcloud is updated, the code and the autoloaders are replaced with updated class maps. So far, so good.

Optional apps from the app store

I’m fairly familiar with the Composer optimizations, so I gave them a try for the local groupware apps Calendar, Contacts and Mail. The results were promising when analyzed with Blackfire.io: the number of filesystem finds went from 795 to 101 on my development environment. Eureka.

Digging deeper

I analyzed the autoloader setup further and found out that Nextcloud also provides a classloader for each app. It is there as a default, but if an app has its own autoloader, there is no real need for it. Forcing Nextcloud to disable this dynamic autoloader improved the benchmarks again, and Blackfire.io no longer showed its classloader filesystem warning.

Nextcloud 27

Working towards Nextcloud 27, we did a series of pre-releases for early testing in pre-production and QA. One of the early releases, alpha 2, caused an upgrade error on a single instance, but the reporter was able to resolve it by running occ upgrade after the failed upgrade.

We put out more pre-releases and tested the app ourselves. All seemed fine.

When the final release hit the app store, we quickly received more reports of the error first seen in alpha 2. The upgrade fails because, when Calendar is updated, the latest database migration class can’t be loaded. The error could be linked to the autoloader changes, so we reverted the authoritative optimization and published a patch update. Yet the error continued to show up, because the affected instances were upgrading from a version that had an authoritative classloader.

To explain what went wrong, we can have a look at the optimized classloaders of each app version:

  1. v4.2 did not have an optimized classloader
  2. v4.3 is the first version with an authoritative classloader
  3. v4.4.0 still had the authoritative class loader and a new migration class that is needed during the app update
  4. v4.4.1 did not have an authoritative class loader anymore but still contained the new migration class

Apps are updated by replacing their code. Every Nextcloud process first loads all apps, then does the main task. For the update, this means that an active app is loaded before it gets replaced. The same applies to the autoloader – the old version is loaded before its code is replaced.

v4.3 had an authoritative classloader, v4.4 ships a new file, and the classloader does not get updated within the update process. As a result, the classloader that is already loaded is not able to load the migration class. v4.4.0 and v4.4.1 behave the same. In other words, for the update it doesn’t matter what the new autoloader does. It is the autoloader of the previous version that loads or doesn’t load classes.
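The sequence of events can be replayed in a small simulation. Again, this is a hedged Python sketch and not Nextcloud's updater code. The function names, the file paths and the migration class name `VersionNew` are hypothetical placeholders.

```python
from typing import Callable, Dict, Optional, Set

MIGRATION_CLASS = "OCA\\Calendar\\Migration\\VersionNew"  # hypothetical name


def build_loader(class_map: Dict[str, str],
                 authoritative: bool) -> Callable[[str, Set[str]], Optional[str]]:
    """Return a lookup function that keeps the class map known at load time."""
    frozen_map = dict(class_map)

    def find_file(class_name: str, files_on_disk: Set[str]) -> Optional[str]:
        if class_name in frozen_map:
            return frozen_map[class_name]
        if authoritative:
            return None  # unknown classes are rejected outright
        candidate = "lib/" + class_name.replace("\\", "/") + ".php"
        return candidate if candidate in files_on_disk else None

    return find_file


def simulate_update(old_class_map: Dict[str, str], old_authoritative: bool) -> Optional[str]:
    # Step 1: the process starts and loads the installed (old) app, autoloader included.
    loader = build_loader(old_class_map, old_authoritative)
    # Step 2: the app code on disk is replaced; the new release ships a migration file.
    files_on_disk = {
        "lib/OCA/Calendar/AppInfo/Application.php",
        "lib/OCA/Calendar/Migration/VersionNew.php",
    }
    # Step 3: the migration class is requested, still through the old loader.
    return loader(MIGRATION_CLASS, files_on_disk)


old_map = {"OCA\\Calendar\\AppInfo\\Application": "lib/OCA/Calendar/AppInfo/Application.php"}
from_v42 = simulate_update({}, old_authoritative=False)      # no optimized classloader
from_v43 = simulate_update(old_map, old_authoritative=True)  # authoritative classloader
print(from_v42)
print(from_v43)
```

In this model the update from a non-authoritative version finds the new migration file, while the update from the authoritative version gets no result, regardless of what the new release's own autoloader would do.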

Accepting our fate

There is no reasonable way to make the old, authoritative autoloader load the new file. It is simply too restrictive. Therefore we reverted the feature that brings the migration class, so that there is a release that instances can update to without errors. Moreover, the removal of the authoritative autoloader was backported to v4.3. Within the next few days, all instances using the Calendar app will see an update. This update has a more flexible autoloader that can be used for new files too. Soon we will add back the temporarily reverted feature with its migration class, which can then be loaded on installations with v4.3.5 and v4.4.2.

I have updated the developer documentation with my findings in the hope of preventing other developers from adding the tempting authoritative classloader optimization to their apps.

PR to fix Calendar: https://github.com/nextcloud/calendar/pull/5295
