How to remove bookmark from invalid links list?

My environment: Nextcloud 19.0.4, Bookmarks 3.4.5

Hi @marcelklehr
as I opened the bookmarks app today I saw that two bookmarks have been moved to the “invalid links” category. I checked both and found out that indeed one of them doesn’t work anymore, the other one opened the linked content instantly as I clicked on it.

First I checked the Nextcloud log file and saw that an initial client GET request had failed on the now-working link because of a “403 Forbidden” error. So the bookmarks app was totally right to move the entry to the invalid links category.

Now I wonder what the correct process is to get the working link off the invalid links list again. Will this happen automatically, or do I need to delete and recreate the entry manually? Does any other function exist to revalidate a link?

Thanks
Juergen

You do not need to take any action. The app periodically goes through all bookmarks to check their availability. So, the bookmark in question should soon be removed from the list of broken links. :slight_smile:

Thank you!

@marcelklehr Sorry for coming back to you on this, but after waiting for more than one week the two bookmarks mentioned are still in the “invalid links” category. At least one of them should have been moved out of that category, because it is definitely reachable again. If I click on it, for example, the destination page opens instantly. Can you please tell me how I can find out why the link isn’t being moved out of the category?

Hm, that would be a bug. The app only checks whether the HTTP response is a 404, but for other error status codes the Nextcloud HTTP client throws an error without my asking it to. Thanks for the heads-up!
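To illustrate the kind of problem described here, the following is a minimal Python sketch, not the Bookmarks app's actual code (the app itself is written in PHP). The `HttpError` class, the `classify` function and the `fetch` callable are all hypothetical names; the sketch simply assumes an HTTP client that returns a status code on success and raises an error for failing status codes, and shows one possible way to handle both paths explicitly.

```python
class HttpError(Exception):
    """Hypothetical error raised by an HTTP client for 4xx/5xx responses."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def classify(fetch, url):
    """Return 'available', 'not_found' or 'error' for a URL.

    `fetch` is any callable that returns a status code on success and
    raises HttpError for error statuses (an assumption of this sketch).
    """
    try:
        status = fetch(url)
    except HttpError as exc:
        # The client raised instead of returning the response, so the
        # status has to be read from the exception.
        return "not_found" if exc.status == 404 else "error"
    return "not_found" if status == 404 else "available"
```

With a check like this, a 403 is reported separately from a 404, so the caller can decide whether such a link really belongs in the list of broken links.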

Working on a fix https://github.com/nextcloud/bookmarks/pull/1309

On my NC 20.0.9 instance with Bookmarks 4.1.0 there are more than 20 bookmarks in the “invalid links” category and they won’t disappear, @marcelklehr?

The problem is likely that the website returns a different result for the nextcloud crawler than for your browser.

For example, I’ve found that reddit requires the User-Agent header to be set to a common value. This pull request fixes that issue: “CrawlService: Set UA header”, Pull Request #1518 in nextcloud/bookmarks on GitHub.

If you are adventurous, you can try to apply the patch in that pr. If it doesn’t help you can send me (privately or publicly) a URL that still shows up as unavailable and I’ll investigate.
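For readers who want to see what setting a User-Agent header looks like in practice, here is a small Python sketch using only the standard library. It is not the patch from the pull request; the `build_request` helper and the `BROWSER_LIKE_UA` value are assumptions chosen for illustration, and the thread does not say which User-Agent string the patch actually sends.

```python
import urllib.request

# Assumed example value; the thread does not state which User-Agent
# string the actual patch uses.
BROWSER_LIKE_UA = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:88.0) Gecko/20100101 Firefox/88.0"
)


def build_request(url, user_agent=BROWSER_LIKE_UA):
    """Build a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# The request could then be sent with, for example:
#   urllib.request.urlopen(build_request(url), timeout=10)
```

Comparing the response for such a request with the response for a request without the header is a quick way to see whether a site treats crawlers differently from browsers.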

Sorry, I am not a developer :wink:

Here are a few bookmarks that are in the invalid links category. All of them are working.

https://www.google.at/search?q=hurbenowe&gws_rd=cr&ei=kn31V7n2FpO6UqSErbgJ
https://www.alternate.at/Intel/Core-i9-10900-Prozessor/html/product/1612188
https://www.math.tu-dresden.de/modellsammlung/karte.php?ID=307
https://de.gomboc-shop.com/content/bronze-g%C3%B6mb%C3%B6c
https://ddl-music.to/
https://www.austrian.com/at/de/aktuelle-informationen-flugunregelmaessigkeiten?utm_medium=email&utm_source=raysono&utm_campaign=os_AT_de_preis-2021-kw04-at_desktop&utm_term=raysono
http://www.bildarchivaustria.at/Pages/Search/Result.aspx?p_ItemID=3
https://www.wpbeginner.com/beginners-guide/how-to-easily-add-anchor-links-in-wordpress-step-by-step/
http://www.bildarchivaustria.at/Pages/Search/Result.aspx?p_iPage=1&p_ItemID=2
https://kinoz.to/
https://hd-source.to/genres/music/

No problem :slight_smile:

Thanks. I can confirm some of these. I’ll investigate.

So, math.tu-dresden.de apparently has something funky going on with their SSL cert. Not sure why this works in the browser, but on my local machine I get the following:

$ openssl s_client -connect www.math.tu-dresden.de:443 -CApath /etc/ssl/certs
CONNECTED(00000003)
depth=0 C = DE, ST = Sachsen, L = Dresden, O = Technische Universitaet Dresden, OU = Fakultaet Mathematik, CN = www.math.tu-dresden.de
verify error:num=20:unable to get local issuer certificate
verify return:1
depth=0 C = DE, ST = Sachsen, L = Dresden, O = Technische Universitaet Dresden, OU = Fakultaet Mathematik, CN = www.math.tu-dresden.de
verify error:num=21:unable to verify the first certificate
verify return:1
---
Certificate chain
 0 s:C = DE, ST = Sachsen, L = Dresden, O = Technische Universitaet Dresden, OU = Fakultaet Mathematik, CN = www.math.tu-dresden.de
   i:C = DE, ST = Sachsen, L = Dresden, O = Technische Universitaet Dresden, CN = TU Dresden CA
 1 s:C = DE, O = Technische Universitaet Dresden, OU = ZIH, CN = TU Dresden CA - G02, emailAddress = pki@tu-dresden.de
   i:C = DE, O = DFN-Verein, OU = DFN-PKI, CN = DFN-Verein PCA Global - G01
 2 s:C = DE, O = DFN-Verein, OU = DFN-PKI, CN = DFN-Verein PCA Global - G01
   i:C = DE, O = Deutsche Telekom AG, OU = T-TeleSec Trust Center, CN = Deutsche Telekom Root CA 2
 3 s:C = DE, O = Deutsche Telekom AG, OU = T-TeleSec Trust Center, CN = Deutsche Telekom Root CA 2
   i:C = DE, O = Deutsche Telekom AG, OU = T-TeleSec Trust Center, CN = Deutsche Telekom Root CA 2

Perhaps someone else can shed some light on this :slight_smile:

Edit: Something similar is going on with https://de.gomboc-shop.com

If you want to verify the certificate chain, all certificates of the chain need to be available in some way, and at least the final root CA certificate should be on your server too. Let’s check what the chain looks like (taken from the transferred certificates):

Certificate chain
 0 s:C = DE, ST = Sachsen, L = Dresden, O = Technische Universitaet Dresden, OU = Fakultaet Mathematik, CN = www.math.tu-dresden.de
   i:C = DE, ST = Sachsen, L = Dresden, O = Technische Universitaet Dresden, CN = TU Dresden CA
 1 s:C = DE, O = Technische Universitaet Dresden, OU = ZIH, CN = TU Dresden CA - G02, emailAddress = pki@tu-dresden.de
   i:C = DE, O = DFN-Verein, OU = DFN-PKI, CN = DFN-Verein PCA Global - G01
 2 s:C = DE, O = DFN-Verein, OU = DFN-PKI, CN = DFN-Verein PCA Global - G01
   i:C = DE, O = Deutsche Telekom AG, OU = T-TeleSec Trust Center, CN = Deutsche Telekom Root CA 2
 3 s:C = DE, O = Deutsche Telekom AG, OU = T-TeleSec Trust Center, CN = Deutsche Telekom Root CA 2
   i:C = DE, O = Deutsche Telekom AG, OU = T-TeleSec Trust Center, CN = Deutsche Telekom Root CA 2

So let’s take a closer look at the certificates provided by the server.

  • The first certificate (CN = www.math.tu-dresden.de) has been issued by “CN = TU Dresden CA”.
  • The subject of the second transferred certificate is “CN = TU Dresden CA - G02”!!

Because a certificate chain is verified by matching subject/issuer name pairs (the lookup hashes are built from these names), you can see that there is a discrepancy between the issuer of the server certificate (CN = TU Dresden CA) and the subject of the next intermediate certificate provided (CN = TU Dresden CA - G02).

A browser is very often able to use its own certificate cache to verify the chain, but on a Linux server you need to provide those certificates in the CA path.

So my conclusion is that TU Dresden provides an invalid certificate chain, which causes the problem when you try to access that page from your server.
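The name comparison described above can be sketched in a few lines of Python. This is only an illustration of the subject/issuer matching step: the function name `find_chain_breaks` is hypothetical, only the CN values from the chain shown above are compared, and no signatures are verified.

```python
def find_chain_breaks(chain):
    """Return each index i where cert i's issuer differs from cert i+1's subject.

    `chain` is a list of (subject_cn, issuer_cn) tuples in the order the
    server sent them. Only names are compared; signatures are not checked.
    """
    breaks = []
    for i in range(len(chain) - 1):
        issuer = chain[i][1]
        next_subject = chain[i + 1][0]
        if issuer != next_subject:
            breaks.append(i)
    return breaks


# CN values copied from the certificate chain shown above.
tu_dresden_chain = [
    ("www.math.tu-dresden.de", "TU Dresden CA"),
    ("TU Dresden CA - G02", "DFN-Verein PCA Global - G01"),
    ("DFN-Verein PCA Global - G01", "Deutsche Telekom Root CA 2"),
    ("Deutsche Telekom Root CA 2", "Deutsche Telekom Root CA 2"),
]

print(find_chain_breaks(tu_dresden_chain))  # [0]
```

The only break is at index 0, between the server certificate and the first intermediate certificate, which matches the discrepancy pointed out above.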

Ah, thanks j-ed, I had a hunch about that :slight_smile:

Short summary about the others that still fail after the above fix:

https://hd-source.to/genres/music/

…fails (also on this forum) because it uses Cloudflare DDoS protection, which “notices” that the crawler is not a real browser.

fails because it runs into the 10s timeout for some reason.

https://www.alternate.at

And the alternate.at URL returns an empty response for some reason.