Generic domain for app development

Hello,

I wanted to share an idea with you and hear your opinion on it.

There might be the need for an official domain name for apps. This could be a page with the documentation, some links for anonymous telemetry or other web endpoints that allow for additional functionality.

Example: In the cookbook app we were discussing implementing an AI to detect the grammatical parts of ingredients. We would need a way to train and improve the AI model for foreign languages as well. The most efficient way would be to let the users correct the grammatical errors and pass the corrected results to a central location where the data is stored and an AI can be trained for the next release. You get the idea?

Another example would be a monitor showing which versions of the app are deployed. Can a certain change be made that is incompatible with PHP x.y without losing too many users?

Therefore, I registered the domains nc-apps.org and nextcloudapps.org. There is no content there yet. The idea is to allow the maintainers of the individual apps to register a subdomain (like cookbook.nc-apps.org) where they can host or link additional content. The main site (and the www part) can contain generic information on what the idea is, how to register an app, etc.

I would just offer the domain. No web space or other services would be provided. At most, an HTTP redirect to another place (like documentation) would be considered. The maintainers are responsible for making sure that the service is not abused.

What would you think of this? Especially, I would like to ask @Daphne if this could be used to enhance the app development experience.

Cheers
Christian

1 Like

Hello Christian! :wave: thank you for your elaborate post and your initiative :pray:

I think it’s a great idea. We could simply move forward if you have some documentation somewhere (could be just a forum page even) on how app developers can use the domains and how to contact you, then I can link it on Developer - Nextcloud so people can find it.

One consideration is that I’m not sure the idea of gathering usage statistics fits well with notions of privacy. I am a little concerned that gathering user data might come at a cost. I’m not a developer myself and have no development background, so feel free to push back on me here, but if we ponder for a moment the ‘why’ of gathering user statistics, we could perhaps also put this in a different format. For example, a year ago we did a community survey where we asked a lot of technical questions too (e.g. which version of Nextcloud they use, which apps, etc.), and we could use this dataset or approach to get answers to your questions too. I will link to the results of this survey soon. Thus I would also support an initiative to do a new community survey. Developers can then collect all the questions they have, and we send it out. I will only need some help with the analysis, as last time we got over 1000 replies.

Do you have feedback?

2 Likes

So I just checked the published results of last year’s community survey. They do include interesting information about architecture, use case, and the Nextcloud version in use, but not about PHP…

I will make sure this gets linked on Develop for Nextcloud: App development tutorials too.

Of course this will not solve the issue of the AI you mentioned. If we allow for such central data collection, let’s at least try to encourage real consent and good opt-in standards that are not a take-it-or-leave-it choice.

Bottom line I still think your initiative makes sense.

Just suggesting a path forward here to be constructive, but you can ignore all if you think it makes no sense:

  • Write a documentation page introducing the domain, what it’s for, and how to use it. This can be a forum thread.
  • Make a decision about using it for usage statistics: should we allow data collection about infrastructure, or should this be put in a more central, more explicit opt-in questionnaire?
  • Make a decision about other fair use standards, e.g. regarding data collection needed to train an AI model.
  • Update the documentation with those decisions.
  • Ask Daphne to link it on Develop for Nextcloud: App development tutorials.

2 Likes

Hello,

I was thinking in a different direction. I am thinking of creating a very simple app (telemetry) that will centrally collect this data. The admin can decide, by installing or uninstalling it, whether telemetry data should be shared at all. Additionally, one could register an entry in the admin settings that lets the admins configure this in finer detail.

Currently, I know of one app that is already doing something like this. See this overview. That could be extended and registered in one central app that can be promoted and checked for security and privacy.

I am thinking that it might be feasible to allow all installed apps to register their own data source. For example, the password app could ask what Password Check Service (just an example from the site above) is used, which is a piece of app-specific information. This could be done via custom hooks/OCP events or by defining an interface and allowing the individual apps to register these services (similar to the notification system).
In hindsight, I assume that OCP events might be better suited, but this can be discussed.

That way, one can offer all app devs an option to collect custom information on top of the basic information from the core telemetry app. The admin might then for example enable or disable the app-custom collection globally in the settings.
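The registration idea above could look roughly like the following. This is a minimal sketch in Python as a stand-in, not Nextcloud code: `TelemetryRegistry`, `register`, `collect`, and the two admin switches are hypothetical names chosen for illustration, and the `passwords` data source only mirrors the password check service example from the post.

```python
from typing import Callable, Dict

# Hypothetical names throughout; this is not an existing Nextcloud API.
DataSource = Callable[[], dict]


class TelemetryRegistry:
    """Central place where apps register what they would like to report."""

    def __init__(self) -> None:
        self._sources: Dict[str, DataSource] = {}
        # Admin settings: both default to off, so nothing is collected unasked.
        self.core_enabled = False
        self.app_sources_enabled = False

    def register(self, app_id: str, source: DataSource) -> None:
        """Called by an app to offer its own, app-specific data source."""
        self._sources[app_id] = source

    def collect(self) -> dict:
        """Build only the part of the report the admin has switched on."""
        if not self.core_enabled:
            return {}
        report: dict = {"core": {"schema": 1}}
        if self.app_sources_enabled:
            report["apps"] = {
                app_id: source() for app_id, source in sorted(self._sources.items())
            }
        return report


registry = TelemetryRegistry()
# Illustrative app-specific source, following the password app example.
registry.register("passwords", lambda: {"check_service": "hibp"})
```

With both switches off, `collect()` returns an empty report; the app-specific sources are only queried once the admin has enabled the core collection and the app-custom collection.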

Every app developer can register a background job and send some data “home”. No one can stop them. Good faith and the open-source community might step up in such a case and react (whatever is appropriate) but this would avoid all devs having to do this kind of stuff. Plus, the backend can be made open source and a community project as well.

Of course, this needs documentation, that is clear. I want to make sure the app would be useful before investing too much time and thought into such a thing.

What would you think of such an approach?

1 Like

I suspect this is covered by the app guidelines in general. If data is extracted, this should be made clear and the user should be asked explicitly. This domain thing will at most add a more official-looking name tag to it.

Apart from that, I agree with your steps. I would like to give you a preliminary version to read, if you do not mind, @Daphne. I hope this will allow us to get this running with everyone in agreement.

1 Like

Have you considered asking Nextcloud about using a wildcard subdomain such as *.apps.nextcloud.com or *.app.nextcloud.com?

If possible, this would be the proper way to set this up. All respect to your initiative, but using the existing url would be ideal for multiple reasons.

I want to politely discourage these. Why?

  • Single point of failure, because an individual not from the company owns them.
    • If you disappear or stop renewing your domains (which could totally happen) these disappear.
    • Let’s say either of these sites does take off… excellent, but then some disagreements arise. Since you privately own these domains, there is nothing anyone can do to take ownership away from you, creating confusion at the URL and search result level.
    • When Nextcloud asks, will you turn over ownership? (not as simple a choice as it might seem)

I only say this because I’ve been through this process and my takeaway was…

If you can accomplish the same thing using the existing URL of *subdomain.nextcloud.com, definitely do this! Maybe it takes some negotiation, but you know that domain isn’t going anywhere and you can still give yourself the access needed to code up some solutions.

If *.app.nextcloud.com is not possible, only then proceed with using nc-apps.org and nextcloudapps.org. My suggestions are:

  • Please immediately turn over ownership of the domains to Nextcloud.com
  • Only use one domain, because two is confusing.
  • If you are not willing to turn over the ownership from the beginning (with the understanding you will have DNS access for your subdomains) I do not recommend proceeding with this at all.

:heart:

2 Likes

Good points @just , I had not thought of those. Let me check internally.

I just chatted about this with @Andy (director of engineering) to see what his considerations are. In summary:

  • I agree with the points made by just, but I also agree with Christian’s needs and wishes

  • As for the data collection I think this would need to come from us as a platform not via an app, would need to be very limited and with privacy built-in. So some aspects here we might just need to be more transparent since we do have an opt-in (or out, don’t know) data collection and maybe some aspects can be handled also via the existing app store which we might just need to extend.

  • The AI part, also an issue since I wouldn’t want a central solution here because that would be the opposite of what Nextcloud as a project tries to do - decentralization, so the solution would rather be federation here I guess.

  • As for the telemetry app that can then be used by other apps, I have to say I strongly advise against this: inter-app dependency is nothing the platform supports today and will cause issues.

Given that this thread is gaining attention, let’s wait for one or two more days to see where this discussion leads us and then see how we can take it from there.

I have not. This is because the code of conduct states that no app developer may imply that they are directly related to the GmbH. Thus, I considered it a good idea to keep the community projects strictly apart. No offence taken.

1 Like

It sounds like you recognize that this is a frightening prospect. I for one would never allow such an app to be installed or enabled on my system.

Most users arrive at Nextcloud to be free of this sort of data collection. It’s all well and good to make it opt-in, but just seeing the checkbox and knowing there is a bunch of phone-home code just waiting to be turned on (or abused by an attacker) can erode trust in the platform. Even more so were it under the control of an outside entity as @just mentioned.

And there is also the aspect of, if you start using this for compatibility checking such as with PHP, maybe users will feel pressured to opt-in to be represented so their system won’t be accidentally broken by the developers.

Useful things are quite often also dangerous.

3 Likes

+1
This should be a key consideration with this project. I still think ‘just’ for gathering infrastructure info we can do with a questionnaire.

1 Like

Regarding handing over the DNS: I would hand it over if the GmbH asks for it. Which domain to use is mostly up to you. I have no favorite here. The other domain I would let expire at the end of its registration period.

The handling of the content, the legal responsibility, and the financial organization need discussion. If I hand it over, I definitely want to hand over the legal responsibility as well.

I know that it might be a frightening prospect. I just brought it to the table, along with the fact that every app can already do this on its own. My suggestion just evens out the chances for smaller apps to use the functionality while not opening any additional security holes.

What would you prefer, having an opt-in or just the code hidden and fixed in the app code? I’d suspect it’s better to have them explicit (personal opinion). It must be clearly stated what is transmitted and ideally, it can be reviewed.

Yes, that is true and legitimate. If you change the view a bit, the same argument can be used for the telemetry data: I try to support older versions of PHP than absolutely required. In a few days, PHP 7.4 will be EOL. Still, I suspect many installations are using it. If I knew that 10% of the users were still on 7.4, I would try to stay compatible with it.
Of course, this will put some pressure on the admins, but that is good if it helps to avoid EOL software. In the cookbook, we had an admin who wanted to use 7.0 (?) a year ago. It is incompatible but has also been EOL for years.
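To make the 10% rule of thumb concrete, here is a small sketch in Python. The function names, the sample data, and the default threshold are assumptions made for illustration; the input stands for whatever list of PHP versions instances would report, which does not exist today.

```python
from collections import Counter


def version_share(reported_versions):
    """Map each reported version to its share of all reports (0.0 to 1.0)."""
    counts = Counter(reported_versions)
    total = sum(counts.values())
    return {version: n / total for version, n in counts.items()}


def can_drop(version, shares, threshold=0.10):
    """True if fewer than `threshold` of the reporting instances use `version`."""
    return shares.get(version, 0.0) < threshold


# Made-up sample: 3 instances on 7.4, 7 on 8.0, 10 on 8.1.
sample = ["7.4"] * 3 + ["8.0"] * 7 + ["8.1"] * 10
shares = version_share(sample)
keep_74 = not can_drop("7.4", shares)
```

In this invented sample, 7.4 accounts for 15% of the reports, so `keep_74` is true. Note that the shares only describe instances that chose to report, not all installations.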

I have read suggestions here in the forum to just go for the most up-to-date features (newest software) and only backport security fixes. This would put even more pressure on the admin to get onto the latest version.

That is true. See above.

There are tendencies to allow this (see the notification app and also the OCS API).

I have multiple issues with the cookbook app that aim for inter-app integration. Is this discouraged in general? Then it should be stated clearly (I would close the issues, referring to the statement).

A questionnaire is a single point in time. This is useful for bigger decisions but falls short if you need current information in a quick way.


I see your points. I recoil from implementing such a feature in the cookbook app. I think it might help with the general development process, but I fear the risks involved. I thought a central solution might be better in several ways, so I opened this topic.

1 Like

One thing I forgot: I have not defined a security model yet. This needs to be done first, for sure. It should be done in close coordination with the core and with input from the community.

1 Like

This triggers my curiosity. Would you mind explaining to me in a bit more detail what you mean and where a questionnaire would fall short for you, with examples that can be understood by a non-coder? :slight_smile:

1 Like

To get a tendency over time, you have to evaluate it over time. Example:

A questionnaire takes a few days/weeks and not every user is willing to fill it out every month. You see, there is still NC 20/21 out there running.

oh I can guarantee you there is still Nextcloud 15 out there and running :confused:

But, this is not really an answer for me. What kind of concrete development question do you have to answer with this timelapse data?

A pretty bold statement. New code is always additional security holes, potentially if not in reality.

Would I rather opt-in or have app developers secretly gather the data without my consent? I would rather not have it at all.

I will gladly take an occasional brief questionnaire for a product I use a lot or just have something to say about.

On the other hand, I almost always disable usage data collection. The concept of some person or AI watching and analyzing me makes my skin crawl. A lot of unnecessary data is also typically gathered, and often with personal data leaks. Read a dxdiag sometime and you’ll see what I mean.

With that in mind, any data you take from this will be far from complete.

3 Likes

Sorry, I had to let the discussion cool down a bit.

I cannot give you hard statements on what all app developers might want to know. I think this topic is too specific. I can give you examples and try to explain them.

  • The password app wants to know what ciphers are available on the servers. Most admins will have trouble answering this straight without support on how to acquire the information.
  • The password app collects statistics on the user configuration, e.g. which password check service is used (like Have I Been Pwned).
  • Recently I dropped support for NC 19 and 20 on the cookbook app. This was a requirement to drop some old code. I wanted to remove support for 21/22 as well. How many users would I lose?
  • How many users installed the beta version? Read: Does the low number of issues indicate a stable beta version or just the fact that no one installed the beta release?
  • What is the DB distribution? If I accidentally added a regression for PostgreSQL what are the chances these are detected by beta testers?
  • How much time does reindexing all recipes take on average/at most (this is currently done at regular intervals)? Is this a problem that needs addressing? Are we slowing down Nextcloud significantly?
    How many recipes does an average/median user have? 10? 100? 1000?
  • What apps are installed? Especially: Is the cookbook put in a group folder currently? → Number of users
  • What is the highest index of a recipe in the DB? Do we need to change the data type because some instances are approaching an overflow (i.e. the size was significantly misjudged)?
  • How many of the requests are using the web frontend and how many are using the REST-API (aka are 3rd party services)?
  • Especially with multiple versions of REST-APIs present: When can an old API be deprecated/removed without breaking too many instances?

These are just some questions that I can think of. Some of them might come from a simple lack of imagination (how many recipes could a user have at most? → smallint, int, or bigint?). Others are more related to optimizing the app for the most common users.
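The smallint/int/bigint question can be put in numbers. The sketch below, in Python, assumes the usual signed 16/32/64-bit ranges for these SQL types (as in PostgreSQL and MySQL); the function names and the safety factor of 10 are arbitrary choices for illustration.

```python
# Largest value of the usual signed SQL integer types (16, 32 and 64 bit).
SIGNED_MAX = {
    "smallint": 2**15 - 1,  # 32767
    "int": 2**31 - 1,       # 2147483647
    "bigint": 2**63 - 1,    # 9223372036854775807
}


def headroom(highest_id, column_type):
    """Fraction of the ID range that is still unused (1.0 means untouched)."""
    return 1 - highest_id / SIGNED_MAX[column_type]


def smallest_fitting_type(highest_id, safety_factor=10):
    """Smallest type that still fits `safety_factor` times the highest ID seen."""
    for name in ("smallint", "int", "bigint"):
        if highest_id * safety_factor <= SIGNED_MAX[name]:
            return name
    raise ValueError("no listed integer type is large enough")


small_instance = smallest_fitting_type(1_000)
busy_instance = smallest_fitting_type(5_000)
```

Without knowing the highest recipe index on real instances, the `highest_id` input is exactly the guess the post describes.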

These questions are of course no show stoppers. We can develop without the answers to them. The answers would just make development, and especially development decisions, simpler and better understood.

Quite a few of these questions could be answered in a questionnaire. If we are talking about things the user can answer, this might be feasible. Evaluating the computing speed of the reindexing is something that might not be in the user’s hands.

One could also consider letting the admin download a JSON file with all relevant information (collected from the server but not sent home). Then, the admin can upload this information manually in a browser to a website that will parse/process the data. That way, the admin can decide whether to participate, and there is no phone-home code on the server.
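A sketch of what such a download could contain, written in Python for illustration. The field names and values are invented and no real export format exists yet; the point is that the code only builds a readable file and never opens a network connection.

```python
import json


def build_report(app_id, app_version, php_version, db_type, recipe_count):
    """Collect the values into a plain dict. Nothing is sent anywhere."""
    return {
        "app": app_id,
        "app_version": app_version,
        "php_version": php_version,
        "db_type": db_type,
        "recipe_count": recipe_count,
    }


def export_report(report):
    """Serialize to human-readable JSON so the admin can review it first."""
    return json.dumps(report, indent=2, sort_keys=True)


# Invented example values.
report = build_report("cookbook", "0.9.0", "8.1", "pgsql", 120)
payload = export_report(report)
```

The admin would save `payload` to a file, read it, and upload it by hand if they agree with its contents.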

3 Likes

I had quite the same questions during the development of an app - so I fully support your idea.

Currently, users can volunteer to send some data through the survey_client, which collects monthly data. Maybe the app could open-source its results and provide an interface to add app-specific data.
Via this app, the users can decide in one place and with one click whether they would like to participate and which data they would like to share.

3 Likes