Configuring context_chat_backend to use OpenAI text-embedding-3-small for building the index

I’ve successfully set up the Nextcloud AI Assistant with the OpenAI API, and everything is working correctly. However, I’m encountering challenges with the context chat feature.

After reviewing the documentation and source code for both context_chat and context_chat_backend components, I’m still unclear about options for the backend.

I have limited experience with LangChain, vector databases, and text embeddings, which makes it difficult for me to parse the code.

However, from my research, it appears this requires a server with an NVIDIA GPU and a minimum of 6 GB of VRAM for local embedding processing, with no option to use an external service instead.

I understand why beefy NVIDIA hardware is needed to run it locally. However, if this is the only option, it seems inconsistent with the other Nextcloud AI applications, which all support a range of external services.

Question: Is it possible to configure the context chat indexing to use OpenAI’s text-embedding-3-small model rather than local GPU hardware? If so, could someone point me to documentation or configuration examples?
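For context, here is roughly what delegating embedding to OpenAI would involve at the wire level. This is only a sketch of the public OpenAI embeddings API (POST to /v1/embeddings with a model name and input texts); the `build_embedding_request` helper and the idea of a custom backend delegating to it are my own illustration, not anything that context_chat_backend currently exposes.

```python
import json

# Endpoint and payload shape follow OpenAI's public embeddings API.
# A hypothetical custom backend could send this request instead of
# running a local GPU embedding model.
OPENAI_EMBEDDINGS_URL = "https://api.openai.com/v1/embeddings"

def build_embedding_request(texts, model="text-embedding-3-small"):
    """Return (url, headers, body) for an embeddings call; no network I/O here."""
    headers = {
        "Authorization": "Bearer $OPENAI_API_KEY",  # substitute a real API key
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "input": texts})
    return OPENAI_EMBEDDINGS_URL, headers, body

url, headers, body = build_embedding_request(["Nextcloud context chat test"])
print(json.loads(body)["model"])  # text-embedding-3-small
```

The response would contain one embedding vector per input string; whether the indexing pipeline can be pointed at such an endpoint is exactly the open question here.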


I was just about to write pretty much the same as you. Instead of OpenAI, I want to use locally running models (on a Mac Studio in my homelab), while my Nextcloud runs on a VPS.

From my understanding, the whole point of the context_chat_backend container is to make the embedding vectors follow the access rights of the original resources (e.g. files), something most embedding databases (including local LLM stacks) don’t support out of the box. TBH, most local LLMs don’t even have persistent embeddings storage out of the box…

Thankfully esobold (a koboldcpp fork), which I am using instead of LocalAI, has persistent embeddings storage and a character API I can use to emulate Nextcloud user rights.

At this point I have the following options:

  1. migrate Nextcloud to an x86-64 server with an NVIDIA GPU
  2. repackage context_chat_backend for aarch64 and Metal, and fork context_chat to add a custom backend URL
  3. fork context_chat to add a custom backend URL and work against esobold’s API

This would be a good solution for me as well: I have an NVIDIA GPU in my own computer, while Nextcloud is running on a Hetzner VPS.

I’ve been using Nextcloud for a couple of months now, and I don’t think this forum is very active; I get responses to basic hosting questions but not much else. Maybe we should post questions like this on GitHub instead?


This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.