For some software of mine I need to identify which software is used/installed on servers. Many people left `nextcloud` alone and didn’t touch anything there. But I have 360 other cases where they have changed it.
My question here is: How can I safely detect Nextcloud as `nextcloud` and not as some other fancy name? Please take a look at this example: https://cloud.teamtammo.com
To your eye it is Nextcloud, for sure YOU can see it. But how can software detect this?
Edit: There is `/.well-known/x-nodeinfo2` for “auto-detection” of software. Maybe always fill it out properly by default? The older way is to fetch `/.well-known/nodeinfo` first and then follow the `href` element whose protocol version your software supports.
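To illustrate the older two-step way, here is a minimal Python sketch of picking the `href` from a `/.well-known/nodeinfo` discovery document. The function name `pick_nodeinfo_href` and the list of supported versions are my own assumptions for illustration; only the `rel`/`href` structure comes from the nodeinfo discovery format.

```python
import json

# Assumption: versions this hypothetical client understands, newest first.
SUPPORTED_VERSIONS = ("2.1", "2.0", "1.1", "1.0")
SCHEMA_PREFIX = "http://nodeinfo.diaspora.software/ns/schema/"


def pick_nodeinfo_href(discovery_json):
    """Return the href for the newest supported nodeinfo version, or None."""
    try:
        document = json.loads(discovery_json)
    except ValueError:
        return None
    if not isinstance(document, dict):
        return None
    links = document.get("links")
    if not isinstance(links, list):
        return None
    # Map each "rel" (schema URL) to its "href" (the actual nodeinfo URL).
    by_rel = {}
    for link in links:
        if not isinstance(link, dict):
            continue
        rel, href = link.get("rel"), link.get("href")
        if isinstance(rel, str) and isinstance(href, str):
            by_rel[rel] = href
    # Prefer the newest version the client supports.
    for version in SUPPORTED_VERSIONS:
        href = by_rel.get(SCHEMA_PREFIX + version)
        if href:
            return href
    return None
```

The second request then goes to the returned URL; a `None` result means the client has to fall back to other checks.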
Edit2: Possible solution candidate: Add `<meta property="og:platform" content="nextcloud" />` to the HTML code.
Yes, that response does contain `nextcloud`. The thing here is, it isn’t auto-detectable and it doesn’t have the same structure as `nodeinfo` JSON responses have. So I have to add `/status.php` as a possible path and also `productname` as a possible key to look for. At first glance it looks good, but at second glance it means the software has to send yet another request to the server (many have already been sent before, like `/nodeinfo/2.1` and the same for older versions). A lot of these requests could actually be saved by providing at least `/.well-known/x-nodeinfo2` with a properly formatted JSON response.
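To show what “another path plus another key” means in code, here is a rough Python sketch that accepts both JSON shapes. The function name `software_name_from_json` is hypothetical; the keys `software`/`name` (nodeinfo) and `productname` (`/status.php`) are the ones mentioned in this thread, everything else is an assumption.

```python
import json


def software_name_from_json(body):
    """Return a lowercased software name from either JSON shape, or None."""
    try:
        data = json.loads(body)
    except ValueError:
        return None
    if not isinstance(data, dict):
        return None
    # nodeinfo shape: {"software": {"name": ..., "version": ...}}
    software = data.get("software")
    if isinstance(software, dict) and isinstance(software.get("name"), str):
        return software["name"].strip().lower() or None
    # /status.php shape: a flat object with a "productname" key
    product = data.get("productname")
    if isinstance(product, str):
        return product.strip().lower() or None
    return None
```

The parsing itself is small; the cost I am worried about is the extra request and the special case for one product.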
Please take a look here: https://f.haeder.net/.well-known/nodeinfo
This is a very typical reply. When you look at both `rel` (only a description of the JSON) and `href` (the actual reply), you can see very common responses. These cost only two requests, or even just one when `/.well-known/x-nodeinfo2` is used. That is my point here: keeping the number of requests low.
I’m not an expert on the topic. However, as far as I understand it (I could be wrong though), nodeinfo is about providing metadata about the protocols a server is running so that software knows how to interact with it, and not necessarily about providing information on specific software products running on that server.
Sounds almost like someone is trying to fly under the radar…
I just try to be nice to administrators.
Please take a look here:
You can see there the JSON element `software` and then `name` (my software ignores `version`). That gives away enough info to know what software is running there.
As I said, I’m not an expert, but afaik nodeinfo is only provided if an app is running on the server that uses something like ActivityPub, e.g. when the Nextcloud Social app is installed. This would look as follows:
name "Nextcloud Social"
Yes, I know. Hmm. So I need to add `/status.php` as a possible URL for checking the software name?
Okay, other proposal:
- People want to change the name ‘Nextcloud’ for whatever reason to their liking.
- Let there be two fields: one for showing on the website and one for internal use only that remains unchanged.
So the internal one is set as `<meta property="og:platform" content="nextcloud" />` while the other can be changed and shown to users. Problem solved. My point here is, I need something that can be identified by software.
Another issue I have with `/status.php` is that it isn’t a standard URL, unlike those under `/.well-known/`, which try to standardize such meta information (e.g. used software, version number, et cetera). And as I wrote earlier, I need to send another request to the other server. I also need to add code just for handling the JSON from `/status.php`, and other software might send other data there.
If there is a standardized way, then let’s do this, e.g. `og:platform` is absolutely fine with me.
Even though `.well-known/nodeinfo` only applies to fediverse servers and, for example, cloud.teamtammo.com does not provide it, you can use the `.well-known` mechanism to find out whether it is a Nextcloud server.
-request, even though the response code is 404, nextcloud servers will send a
-header. So you do not even have to download one single bit, you only have to perform a header-request:
:~$ curl -sI https://cloud.teamtammo.com/.well-known/you_name_it | grep nextcloud
You can now simply integrate that into your automation.
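If the automation is written in Python rather than shell, the same idea could look roughly like the sketch below. The names `headers_mention` and `head_mentions` are hypothetical, and the check deliberately only searches all response headers for the string, like the `grep` above, because I am not assuming any particular header name. Unlike plain `grep`, it matches case-insensitively.

```python
import urllib.error
import urllib.request


def headers_mention(headers, needle="nextcloud"):
    """Return True if any (name, value) header pair contains the needle."""
    needle = needle.lower()
    return any(needle in f"{name}: {value}".lower() for name, value in headers)


def head_mentions(url, needle="nextcloud", timeout=10):
    """Send a HEAD request and search the response headers for the needle.

    A 404 (or any other HTTP error status) still carries headers, so those
    are inspected as well. Connection errors are not caught here.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            pairs = list(response.headers.items())
    except urllib.error.HTTPError as error:
        pairs = list(error.headers.items())
    return headers_mention(pairs, needle)
```

A call would then be `head_mentions("https://cloud.teamtammo.com/.well-known/you_name_it")`, using the same URL as in the curl example.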
Hope that helps,
There isn’t one standardized way in which web applications, or applications in general, would or should identify themselves. This depends on the type of application, the web standards or protocols they are using, and in many cases even on the specific app itself. The nodeinfo example is specific to ActivityPub, but there are tons of application-specific APIs out there, and they don’t necessarily identify themselves with a big “Hello, I’m Nextcloud, please connect to me!”
On top of that, there are all the other internet protocols like IMAP, SMTP, XMPP etc. So, if you want to be 100% sure what is running on a server, you would have to do a full port scan and then, in the worst case, send out hundreds of queries in order to find out which application is running behind the open ports.
My software is freely available: git.mxchange.org Git - fba.git/summary
It collects blocklists from fediverse instances and makes them searchable. Nextcloud isn’t providing these features, e.g. `/api/v1/instance/<peers|domain_blocks>`, which is what I mainly want to fetch. Nextcloud is only half a part of it, as its software name appears in some views. What I am trying to do here is to not have so many different names there,
generator and also
og:site_name are already checked if auto-discovery through
Excerpt from the software:
Detection is done in the following order:
- `/.well-known/nodeinfo` was reachable and software type was found in nodeinfo response
- STATIC_CHECK: Node information was found by probing for well-known URLs
- PLATFORM: Meta data `og:platform` was found in HTML code
- GENERATOR: Meta data `generator` was found in HTML code
- SITE_NAME: Meta data `og:site_name` was found in HTML code
- None: the instance was not reachable or the used software was not stated
`/.well-known/nodeinfo` and then the provided `href` URLs are fetched. If that fails, “static checks” on “well-known” paths, e.g. `/nodeinfo/2.1.json` and so on, are tried, then the `<meta />` tags, and the last resort is `og:site_name`. If everything fails, `None` is set as the software name.
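As a simplified picture of that fallback order (not the actual code from my repository), a Python sketch could look like this. `detect_software` and the probe callables are hypothetical names; the labels in the usage example are the ones from the excerpt, and the first step is left out there because its label is not shown in the excerpt.

```python
def detect_software(probes):
    """Run probes in order and return (label, name) for the first hit.

    `probes` is an ordered list of (label, callable) pairs. Each callable
    returns a software name or None. Later probes are not executed once an
    earlier one succeeds, which keeps the number of requests low.
    """
    for label, probe in probes:
        name = probe()
        if name:
            return label, name
    # Nothing matched: not reachable or software not stated.
    return None, None
```

Usage, with stand-in lambdas instead of real HTTP requests:

```python
result = detect_software([
    ("STATIC_CHECK", lambda: None),
    ("PLATFORM", lambda: "nextcloud"),
    ("GENERATOR", lambda: None),
    ("SITE_NAME", lambda: None),
])
```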
They are all based on `GET` requests. Adding a `HEAD` request would add more code, for now only for Nextcloud. Just adding an `og:platform` tag (and maybe others, but I won’t read and analyze them) is very little effort.
I only check Fediverse instances and their peers provided in both the `peers` and `domain_blocks` JSON APIs. If they fail to be fetched (many Fediverse instances don’t provide them), then I go with other ways, e.g. Lemmy provides `/instances`, which is an HTML response that I can extract peer names from.
For websites, there are standardized ways: `<meta name="generator" content="xxx" />` or the said `og:platform`.
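Reading these tags does not need much code. Here is a minimal Python sketch using only the standard library; `MetaCollector` and `software_from_meta` are hypothetical names, and the key order (`og:platform`, then `generator`, then `og:site_name`) follows the detection order from the excerpt above.

```python
from html.parser import HTMLParser

# Keys to look for, in order of preference.
META_KEYS = ("og:platform", "generator", "og:site_name")


class MetaCollector(HTMLParser):
    """Collect the content of <meta> tags keyed by property or name."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags also end up here.
        if tag != "meta":
            return
        attributes = dict(attrs)
        key = attributes.get("property") or attributes.get("name")
        content = attributes.get("content")
        if key and content:
            # Keep the first occurrence of each key.
            self.found.setdefault(key.lower(), content.strip())


def software_from_meta(html):
    """Return (key, content) of the first preferred meta tag, or (None, None)."""
    collector = MetaCollector()
    collector.feed(html)
    for key in META_KEYS:
        if collector.found.get(key):
            return key, collector.found[key]
    return None, None
```

All of this works on the HTML of a page that has to be fetched anyway, so no extra request is needed.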
Nextcloud isn’t primarily meant to be a Fediverse instance, and unless someone decides to install “Nextcloud Social”, in which case that instance will correctly identify itself as “Nextcloud Social”, it’s none of your software’s business whether that instance identifies itself properly, or whether it identifies itself at all. I would even argue that most Nextcloud users have zero interest in connecting their Nextcloud instances to the Fediverse.
However, if you think that the Nextcloud Social app doesn’t properly follow the ActivityPub standards, I’d suggest you report it here: GitHub - nextcloud/social: 🎉 Social can be used for work, or to connect to the fediverse!
Thank you for the long reply. I am only looking to reduce these fancy names they enter to just one generalized name, e.g. `nextcloud`. My software is not federating, nor does it follow the ActivityPub protocol.
Well, and I think you need to find another way to identify what’s going on on those servers. But maybe there is another unique pattern you could query in order to find any Nextcloud instances, I’m not sure…
Anyway, to me this doesn’t sound like a feature (some might even call it an anti-feature) the average Nextcloud user or even businesses would need, but rather like a very specific requirement on your part.
It is as easy as I described in my starting post: the internal name is `nextcloud`, and the external part is shown e.g. as the link shows in the footer: https://cloud.teamtammo.com/ There is no “special need” here, just that the used software is properly stated. For my software, I can just easily add a
<meta /> tag to my
I think the question is rather if it is necessary and/or wanted. But feel free to open a feature request on Github…
Yes, true. On one side, Nextcloud itself isn’t federating without `Nextcloud Social` being installed. That makes sense. On the other side, some “low-level” software propagation, e.g. `<meta name="generator" />` or the whole `og:*` set, isn’t much HTML code to add, and with it I don’t have to add extra code only for `/status.php`, risking that other websites may also provide it but with a different intent. That is why `/.well-known/` was “invented”: to have generic WWW-wide paths that are indeed well-known. It is similar with ports below 1024: they are well-known ports, and services on them shouldn’t be other than the assumed default, e.g. 25 is SMTP, 110 is POP3 and so on.
A Nextcloud instance with Social App should provide you
_oc_config should be in the HTML response for almost every Nextcloud instance.