Nextcloud Ethical AI rating: A transparent approach to privacy-first AI

Originally published at: Nextcloud Ethical AI rating: A transparent approach to privacy-first AI - Nextcloud

The adoption of AI is growing rapidly, but so are concerns around data privacy, bias, and control. That’s because many AI systems today are developed and operated in centralized environments, where transparency into training data, model behavior, and data usage is limited.

With new regulations like the EU AI Act, organizations are under increasing pressure to adopt AI safely. So, how can you ensure that your organization can integrate AI into its everyday workflows while also maintaining full data control and transparency?

At Nextcloud, we have asked ourselves the same question, leading us to the development of the Ethical AI Rating: a simple way to evaluate how transparent, open, and trustworthy an AI solution really is.

The problems with Big Tech AI platforms and the challenges of open source AI technologies

The development of AI is moving fast, and many of the new capabilities face ethical and even legal challenges. As we explored in our article on Big Tech AI privacy concerns, many of these tools rely on large-scale data collection, often without clear transparency or user control.

This can lead to issues with:

  • Use of data without permission
  • Discrimination and biases
  • Data theft and leakage

What’s more, merely using open source code is no longer enough to claim that you are in control of your data or that the software is safe or ethical.

This is particularly true for neural network-based AI technologies.

The data set and the software used in training, as well as the availability of the final model, all determine how much freedom and control a user has.

Your guide to privacy-first AI: Nextcloud’s Ethical AI Rating

Not all AI models are equal: some prioritize openness and user control, while others rely on opaque models and centralized data processing.

To help users and administrators make informed decisions, we developed the Nextcloud Ethical AI Rating: a rating system designed to give a quick insight into the ethical implications of a particular integration of AI in Nextcloud.

This rating aims to provide a quick, transparent review of how much control and insight you have over these tools.

This is especially important as organizations evaluate AI tools not just for performance, but for compliance, privacy, and long-term data sovereignty.

Users can still evaluate solutions in more detail, but the Nextcloud Ethical AI rating can simplify the choice for the majority of users and customers.

The Nextcloud Ethical AI rating in practice: What you need to know

The rating is based on points from these three factors:

  • Open source licensing: Is the software open source for both inference and training?
  • Self-hosting options: Is the trained model freely available for self-hosting?
  • Availability of training data: Is the training data available and free to use?

The rating has four levels: 🔴 Red, 🟠 Orange, 🟡 Yellow and 🟢 Green.

This leads us to the following ranking system:

  • If all three conditions are met, we give the AI solution a green label 🟢
  • If two conditions are met, the label is yellow 🟡
  • If one condition is met, it receives an orange label 🟠
  • If none of the conditions are met, it gets a red label 🔴

In other words, if you have full control over the AI tool you’re using, you’ll see a green label. If you have no control and a lot of dependency, the label will be red.

These colors give an immediate overview of the AI solution for factors such as sovereignty, transparency, and data control.
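To make the mapping concrete, here is a minimal sketch, assuming nothing about Nextcloud’s actual implementation: a hypothetical Python function that turns the three yes/no conditions into one of the four colour labels.

```python
# Illustrative sketch only (not Nextcloud's actual code) of how the
# three yes/no rating conditions map onto the four colour labels.

def ethical_ai_label(open_source: bool,
                     model_self_hostable: bool,
                     training_data_open: bool) -> str:
    """Return the colour label for a given set of met conditions."""
    conditions_met = sum([open_source, model_self_hostable, training_data_open])
    return {3: "green", 2: "yellow", 1: "orange", 0: "red"}[conditions_met]

# Example: open-source code and a self-hostable model, but closed training data
print(ethical_ai_label(True, True, False))  # -> "yellow"
```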

Caveat: Why critical thinking is still important to ensure ethical AI

We add one additional note to the rating: bias.

Bias remains a known challenge in AI systems. While it is difficult to guarantee complete neutrality, the rating highlights known issues where they exist, helping users make informed decisions.

So when we discover major biases in the data set or in the expression of the model at the time of our last check, you will see this mentioned in the rating. This includes, for example, discrimination on the basis of race or gender in a face recognition technology.
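To illustrate how such a note could sit alongside the colour label, here is a hypothetical data shape, not how Nextcloud actually stores ratings:

```python
# Hypothetical continuation of the sketch above: the rating carries an
# optional list of known bias issues alongside the colour label.
from dataclasses import dataclass, field

@dataclass
class EthicalAIRating:
    label: str  # "green", "yellow", "orange" or "red"
    bias_notes: list[str] = field(default_factory=list)  # known issues at last check

# Example: a yellow-rated tool with a documented face recognition bias
rating = EthicalAIRating(
    label="yellow",
    bias_notes=["face recognition: lower accuracy for some skin tones"],
)
```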

There are other ethical considerations for AI, of course.

Think of legal challenges related to the use of datasets, in particular copyright issues, as well as the significant energy consumption of deep neural networks.

Unfortunately, those concerns are extremely hard to quantify objectively, and while we intend to warn users of any open issues, we cannot (yet) include them in our rating.

For that reason, we recommend that users investigate for themselves what the consequences of using AI are in their individual case, with the Nextcloud Ethical AI Rating as a starting point.

Ethical AI in Nextcloud Assistant

Of course, we try to practice what we preach: this approach is not just theoretical, but also shapes how AI is implemented in Nextcloud.

At the core of Nextcloud’s AI approach is the principle that you should never be locked into any single provider. In other words, administrators can choose between different providers, including self-hosted options.

What’s more, organizations can decide where their models run, which models are used, and what happens to their data.

By doing so, your organization can benefit from AI-supported collaboration while staying in control of, and responsible for, its data.

Our latest release of Nextcloud Hub 26 Winter continues to build on this foundation.

AI is still optional and configurable, instead of a mandatory layer imposed on all users or workflows. This allows organizations to adopt AI at their own pace, align it with internal policies, and decide which use cases make sense in their environment.

When AI is enabled, it becomes part of the collaboration environment instead of an external dependency. It integrates into existing workflows without breaking governance or compliance frameworks.

What can the Nextcloud Assistant do for you in practice?

  • Improve and generate texts, media, and documents
  • Answer questions based on organizational data
  • Summarize meetings and conversations in Nextcloud Talk
  • Provide live transcription and translation for multilingual collaboration
  • Integrate AI capabilities directly into email, chat, meetings, and file workflows

Nextcloud Hub 26 Winter also makes compliance easier: you can generate images and documents in various apps and automatically label content with watermarks. This helps your organization stay in line with the latest regulations, such as the EU AI Act.
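As a rough illustration of the underlying idea, and not Nextcloud’s actual implementation, labelling generated content can be as simple as embedding a machine-readable marker in the file’s metadata; the key name below is a hypothetical convention:

```python
# Generic sketch of tagging an AI-generated PNG with a metadata marker
# so downstream tools can identify it. Requires Pillow.
from PIL import Image
from PIL.PngImagePlugin import PngInfo

def tag_as_ai_generated(src: str, dst: str) -> None:
    """Copy a PNG, adding an 'ai-generated' flag to its text metadata."""
    image = Image.open(src)
    meta = PngInfo()
    meta.add_text("ai-generated", "true")  # hypothetical metadata key
    image.save(dst, pnginfo=meta)

tag_as_ai_generated("generated.png", "generated-labelled.png")
```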

In short: privacy-first AI solutions such as the Nextcloud Assistant give organizations the efficiency and convenience of AI, while keeping governance, compliance, and data ownership exactly where they belong: under their control.

Regain your digital autonomy with Nextcloud Hub 26 Winter

Our latest release of Nextcloud Hub 26 Winter is here! Discover the latest Nextcloud features.

5 Likes

Now I’m surely not the only one wanting to know which AI services you have rated so far, and what their rating is. :thinking:

OK, I see the OpenAI integration rated as “Neutral”/yellow. But still: is there a rating list somewhere?

Btw., which point is not met by OpenAI?

That’s the wrong rating, @jondo. The rating you refer to is only a user rating and does not have anything to do with the Ethical AI Rating. I’m sorry for the confusion.

You can soon watch our release presentation back, and there you can see the ratings.

In the future we will also publish this in the app store.

Please further clarify what you mean by training data as “free to use” so it is not misinterpreted by definition. Thanks.

That green looks very yellow to me

1 Like

What would you suggest to add for your green?

I guess he refers to the colour of the icon. “Green :blush:” also looks yellow to me.

I wonder if this gimmick has any negative impact on the speed of Nextcloud’s more important functions of file handling: sync, upload, download? Can it be turned off? Or removed completely?

All of it is in extra apps, so you don’t have to enable it. None are on by default.

1 Like

While I agree in principle, I just want to point out how difficult such a rating would be, e.g. for so-called “discrimination” in face recognition technology.

There’s the physics of photography, i.e., LIGHT: the less light and the less contrast, the more difficult it is to recognize things. That’s why night vision is more difficult than daytime vision.

So, if a face recognition algorithm is less accurate with dark-skinned people, it’s not “discrimination”, or at least not a priori, but simply the limitations of physics: lower-contrast pictures lead to lower accuracy.

As a matter of fact, it would be surprising if there were NO difference in accuracy.

So, if activists yell about supposed discrimination, it warrants extreme caution before such a claim is taken up and incorporated into a rating, because one really needs to break down the actual factors of how a particular discrepancy comes to be.

Discrimination requires INTENT, or at least willful neglect; it’s not a side effect of physics.
If there’s actual intentional discrimination, then that, of course, should be flagged, and not just flagged, but condemned. But people are too quick to jump to conclusions when something deviates from their utopia of equal outcomes despite unequal inputs.

They already anticipated this in the sentence you quoted.

As it is impractical to prove there is no bias, we merely point out if, at time of our last check, major biases (like discrimination on race or gender for a face recognition technology for example) were discovered in the data set or in the expression of the model.

A few more general thoughts on the subject:

The biggest problem with all these discussions about social justice, political correctness, privacy etc. in combination with A.I., from conservative latent racist right-wing people all the way over to leftist social justice warriors, is that everyone always brings up a subset of a more general issue and then proposes measures that only address that specific subset. The bigger picture is usually ignored.

Technology, just like we human beings, is pretty much what it is, and it will never be perfect. And while I’m not saying it should be left completely unregulated, neither ratings nor bans will ever change that fact.

What is important is how we as a society use this technology. The most important thing for me is, that institutions, such as politics, law enforcement, and especially the judiciary, must never make definitive decisions based solely on what the technology says. Technology can be a tool, but it must always be manually cross-checked before drawing final conclusions or making final decisions. A.I. should never be the sole basis of evidence.

But I don’t see how any of this would be an issue on your personal Nextcloud instance, where things are hosted locally, and then maybe a few dark-skinned people are going to be detected wrongly in your personal holiday pictures.

I mean don’t get me wrong, this should definitely be improved and it will over time, but I think in this case it is enough if Nextcloud provides general ratings and guidelines so that you can roughly classify the tools. The rest is your responsibility. You can blindly trust the ratings and the A.I., or you can continue to use your eyes and simply see the A.I. as an additional tool. :slight_smile:

2 Likes

As an open source enthusiast, I find that rating to be naive.

First: Stable Diffusion is open source, and brings many concerns with it.
It steals artists’ work, has bias, and can lead to other moral complications.

Second: OpenAI may be mostly proprietary, but their leaders show more awareness of and consideration for their actions than most people I know of in the industry, and that counts for the entirety of computing, not just AI.

¹ stable diffusion ethical criticism at DuckDuckGo
² Sam Altman: OpenAI CEO on GPT-4, ChatGPT, and the Future of AI | Lex Fridman Podcast #367 - YouTube

2 Likes

I too am not keen on normalising the inclusion of AI into NC.

Currently:

  • :green_circle: all AI integrations are in separate apps that can be disabled, or turn-off-able features within apps.
  • :green_circle: off by default.
  • :green_circle: NC is not scraping your private data and sending it to OpenAI or any other godforsaken data leach.
  • :orange_circle: NC is pushing the AI marketing messages that AI is super fun and helpful, normalising its inclusion in everyday products. Is this really what people want or need from NC?
  • :red_circle: NC’s ethical scoring is based on the idea that it’s too hard to include things like: was this tool based on stolen data? private medical records? prejudice?
  • :red_circle: NC’s ethical criteria omit their own use cases. How could the AI features included in NC apps result in unethical behaviour? I’m quite happy to believe that everyone at NC is a lovely person and has no conscious evil intent. But what of the unintended consequences?

Naivete abounds in this field. The idea that as humans we can be faced with what looks like useful information and then diligently check its every assertion or implication? It would just be too much work. We’re lazy; that’s why we think AI is helpful or convenient. We don’t want to do the work ourselves; that’s why we reach for such tools.

e.g. LinkedIn favouring men in searches: did the LinkedIn devs intend for that to happen? I’d hope not. Did it silently continue the patriarchy? Yes, of course.

e.g. race discrimination can occur when a system - for whatever reason (yes, including light and physics :weary: as well as prejudiced training data) - does not produce the same quality of results for different people, and those results are then used for [assisting in] decision making: which they will, of course, be.

Some NC use examples:

  • Smart Inbox / related. So staffer 1 gets into a confidential disciplinary situation at work. Uh oh. But wait, the AI realises that so has staffer 2, going on the type of emails and messages; perhaps staffer 1 would like to share with staffer 2? Oh wait, HR was supposed to be confidential?

  • Face recognition in photos. Sure, useful if it doesn’t matter to you that it works better on white folks. It’s like a microaggression: oh it never works for X, so I have to manually identify them. Oh no, now I have a lot of photos of an event with X in them and I have to do all this work for X. I much prefer the events where X isn’t present as I have less work to do.

  • image generation by prompt [from stolen art], yep that was #1 on my list of things I needed to achieve today. ChatGPT text generation: I don’t need convincing BS that takes me ages to figure out the problems with. These things just don’t need to be features: if people want to use those they can go use them; they don’t need it inside NC.

I really appreciate that the NC team are doing their best to keep pace with the AI bluster and be competitive with other systems whose marketing depts are all over AI, and it sounds like they genuinely are looking for ways to do this privately and ethically. Respect that intent. But I’m also saddened that it’s just increasing the FOMO and the unhelpful/unethical “AI is just fun and helpful” message, further consolidating power and wealth in the hands of the very few, with a complete lack of accountability.

3 Likes

I get your point, but did you actually read how they use different criteria? The code being open is only one of the three domains that get weighed. The other two are:
:white_check_mark: Is the trained model freely available for self-hosting?
:white_check_mark: Is the training data available and free to use?

On these two domains, you’ll notice that Stable Diffusion is lacking. The training dataset is not available, nor free to use.

1 Like