PDF Viewer seems to be breaking text/glyphs

Hi Nextcloud Support,

My PDF viewer on Nextcloud seems to be breaking text, and I am not sure why the issue keeps coming back. On some days and at some times, when I open a PDF that was exported from our accounting software, all of the text appears broken and shows up as strange glyphs.

This only seems to occur on certain days or at certain times, and then the text returns to its original form on other days or at other times. I suspect this may coincide with maintenance times, but I am not certain of this.

Please help!

Hi @David_Tauro,

This looks like a font rendering issue in pdf.js (the JavaScript-based PDF engine that Nextcloud’s viewer is built on), rather than a Nextcloud problem as such.

PDFs exported from accounting software often embed custom fonts with a non-standard internal encoding, described by a ToUnicode table that maps the font's character codes to Unicode. When pdf.js can’t correctly parse that mapping, or fails to load the embedded font data, it falls back to a default encoding, which maps the bytes to completely wrong characters. That’s exactly what your screenshot shows.

Interestingly, some lines in your screenshot render perfectly fine (Custom-Fabricated Exterior Front & Side, Faces: 3/16" white lexan, etc.). Those lines likely use a different, standard-compliant font within the same document. The garbled lines come from a different font that pdf.js struggles with.

The intermittent nature is likely related to browser-side font caching: pdf.js caches font data in the browser, and depending on whether that cache is warm or has been cleared (e.g. after a browser restart, or after Nextcloud pushes an update that invalidates cached assets), the rendering either works or fails.

To narrow down whether the problem is in the PDF itself or in pdf.js specifically, you can run two quick diagnostics on the affected file. This requires poppler-utils, which is available on Linux and macOS:

# Install if needed:
# Debian/Ubuntu:  sudo apt install poppler-utils
# macOS:          brew install poppler

# 1. Inspect the fonts embedded in the PDF:
pdffonts yourfile.pdf

# 2. Try to extract the text:
pdftotext yourfile.pdf -

The pdffonts output shows, for each embedded font, whether a ToUnicode mapping is present (column uni). If some fonts show no there, that’s a strong indicator that those fonts are the ones behind the garbled lines.
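
If the font list is long, a small filter can pick out the suspicious entries for you. This is only a rough sketch: it assumes the usual pdffonts column layout (name, type, encoding, emb, sub, uni, object ID) with two header lines, and the function name fonts_without_tounicode is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the names of fonts that pdffonts reports
# with uni = no. It reads pdffonts output on stdin.
# Fields are counted from the right, because the "type" column can
# contain spaces (e.g. "CID TrueType"); the last two fields are the
# object ID, so the uni column is the third field from the end.
fonts_without_tounicode() {
    awk 'NR > 2 && NF >= 8 && $(NF-2) == "no" { print $1 }'
}

# Usage (yourfile.pdf is a placeholder):
#   pdffonts yourfile.pdf | fonts_without_tounicode
```

An empty result means every listed font has a ToUnicode mapping, so the cause would have to be looked for elsewhere.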

The pdftotext result is even more direct: if the extracted text is also garbled, the problem is in the PDF file itself — independent of pdf.js or Nextcloud. If the text comes out correctly, the PDF is fine and the issue is specific to pdf.js’s rendering.

A quick sanity check in the meantime: please download the PDF and open it in a native PDF viewer (Adobe Acrobat, your OS’s built-in viewer, etc.). If it renders correctly there, that already points toward a pdf.js issue.

If that turns out to be the case, the best place to report it is the pdf.js project directly, via the issue tracker of mozilla/pdf.js on GitHub, ideally with the PDF attached (or a minimal reproduction). Nextcloud’s PDF viewer has limited ability to fix font rendering bugs that originate in pdf.js itself.

h.t.h.


ernolf