I see that there is already a lot of activity in the “mediadata” app GitHub repository, but I will write down my thoughts anyway.
So, we could have an underlying metadata handling system with the following features:
system default metadata fields (creator, title, date created/modified …)
user defined metadata fields (color of the car, something stupid, something smart … )
system default metadata profiles (eg. text documents, photos …)
user defined metadata profiles (eg. scientific article)
metadata extraction and mapping
Maybe mapping is not necessary, but if we want consistent search, then a search by “Creator” should find the photos, documents and articles of that person. In this example, it means that we need a common “Author/Creator” field which could be populated either from user input or by extraction from photo IPTC data or from MS Word document properties.
It’s the same with “Date created”, “Keywords”, “Summary/Description”, etc.
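To make the mapping idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `FIELD_MAP` table, the source names (`iptc`, `msword`, `user`) and the source key names are hypothetical placeholders, not the real IPTC or MS Word property names and not part of any existing app.

```python
# Hypothetical mapping from source-specific keys to common field names.
# The source keys are illustrative placeholders, not verified tag names.
FIELD_MAP = {
    "iptc": {"by-line": "creator", "keywords": "keywords", "caption": "description"},
    "msword": {"author": "creator", "keywords": "keywords", "comments": "description"},
    "user": {"creator": "creator", "keywords": "keywords", "description": "description"},
}


def normalize(source, raw):
    """Translate raw metadata from one source into the common field set."""
    mapping = FIELD_MAP.get(source, {})
    common = {}
    for key, value in raw.items():
        target = mapping.get(key.lower())
        if target is not None:  # keys without a mapping are ignored
            common[target] = value
    return common


def search_by_creator(files, name):
    """Return the paths of all files whose common 'creator' field matches name."""
    wanted = name.casefold()
    return [
        path
        for path, meta in files.items()
        if str(meta.get("creator", "")).casefold() == wanted
    ]


files = {
    "holiday.jpg": normalize("iptc", {"By-line": "Jane Doe", "Caption": "Beach"}),
    "report.docx": normalize("msword", {"Author": "Jane Doe"}),
    "notes.txt": normalize("user", {"creator": "John Roe"}),
}
print(search_by_creator(files, "jane doe"))  # ['holiday.jpg', 'report.docx']
```

The point of the sketch is only that search runs against the common fields, so it does not need to know where a value originally came from.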
and this can be achieved today by registering plugins with the commenting system. Read the comments; it’s not straightforward to build a universal model.
The GSoC project app is headless and is focused on extracting the information from headers, but the way we send the information back to clients is sort of universal. Plugins for comments could be written to enhance the files app by letting users add new fields, but don’t forget that NC 10 has a workflow engine with automatic tagging which could be seen as some sort of metadata already.
I think what would help are some use cases and some functional requirements if you can.
Hi, does anybody know whether there was any progress regarding editing and searching of file meta data?
We’d like to be able to tag files (with another tag system than the collaborative tags) and search for these tags - and the search should allow the user to “OR” combine multiple search expressions, like article numbers, which are not contained in the file name but in the meta tag, and get a complete list of results with all articles that match one of the article numbers / search expressions.
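The “OR” search described above could look roughly like the following sketch. It assumes a hypothetical in-memory index that maps file paths to metadata fields; the field name `article_no` and the function `search_any` are made up for the example and do not refer to an existing Nextcloud API.

```python
def search_any(index, field, terms):
    """Return files whose `field` contains at least one of `terms` (OR semantics)."""
    wanted = {t.strip().casefold() for t in terms if t.strip()}
    hits = []
    for path, meta in index.items():
        values = meta.get(field, [])
        if isinstance(values, str):  # allow single-valued fields
            values = [values]
        if wanted & {str(v).casefold() for v in values}:
            hits.append(path)
    return sorted(hits)


# Hypothetical index: file path -> metadata fields
index = {
    "specs/a.pdf": {"article_no": ["4711", "4712"]},
    "specs/b.pdf": {"article_no": "0815"},
    "specs/c.pdf": {"article_no": ["9999"]},
}
print(search_any(index, "article_no", ["4712", "0815"]))
# ['specs/a.pdf', 'specs/b.pdf']
```

A file is listed once even if it matches several of the search expressions, which gives the complete, de-duplicated result list asked for.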
So any solution for this would be appreciated. Thanks in advance,
Has anyone ever done this? It’s still extremely useful (especially for pictures) to be able to search EXIF metadata. It’s somehow a bit of a shame that a simple EXIF search is not available in Nextcloud (Gallery).
The user should be able to define “document types” or “aspects”, such as invoice, purchase order, CV, etc.
For such document types, a set of custom meta data fields can be defined.
Each meta data field can be of a specific type, i.e. string, date, timestamp, integer, decimal, etc.
Advanced: meta data fields can be singular or plural
The custom meta data fields should appear in the UI (document detail view) and should be editable with appropriate UI widgets (where the user has permissions)
document types can be added and dropped for a document
multiple document types can be applied to a document
if a document type is removed from a document, then the meta data fields should be removed from the document as well
technically: each document type (internally) uses a specific prefix, similar to a namespace, for its meta data fields, so that there are no conflicts. A meta data field has a unique name (for example: invoice:customer_no, cv:candidate_no) and a (localizable) label.
meta data fields should be accessible via REST API as well
meta data fields / document types can be assigned to files as well as folders
Accounting: for invoice documents, you would be able to store invoice date, amount, customer ID, etc., maybe even automatically extracted via OCR. (For a similar use case, see the fileee.com application.)
Headhunting firms: for CV documents, you would be able to store candidate ID, notes, availability date, etc.
any custom workflow or even external workflow engines (Camunda, Flowable, IFTTT, Zapier, etc.) would be able to store information in these fields, if not in their own.
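As a rough illustration of the document type requirements above (typed fields, a prefix per document type, singular or plural values, and removal of the fields when a type is dropped), here is a minimal Python sketch. All class and method names (`FieldDef`, `DocumentType`, `Document`, `set_field`, ...) are hypothetical and only model the idea; they are not an existing Nextcloud data model.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class FieldDef:
    name: str             # unique within its document type
    label: str            # human-readable (localizable) label
    kind: type            # expected Python type of each value
    plural: bool = False  # True if the field holds a list of values


@dataclass(frozen=True)
class DocumentType:
    prefix: str    # namespace, e.g. "invoice" -> "invoice:customer_no"
    fields: tuple  # tuple of FieldDef


@dataclass
class Document:
    path: str
    types: dict = field(default_factory=dict)     # prefix -> DocumentType
    metadata: dict = field(default_factory=dict)  # "prefix:name" -> value

    def add_type(self, doc_type):
        self.types[doc_type.prefix] = doc_type

    def set_field(self, key, value):
        prefix, _, name = key.partition(":")
        doc_type = self.types.get(prefix)
        if doc_type is None:
            raise KeyError(f"document type {prefix!r} is not applied")
        definition = next((f for f in doc_type.fields if f.name == name), None)
        if definition is None:
            raise KeyError(f"unknown field {key!r}")
        values = value if definition.plural else [value]
        if not all(isinstance(v, definition.kind) for v in values):
            raise TypeError(f"{key} expects {definition.kind.__name__}")
        self.metadata[key] = value

    def remove_type(self, prefix):
        # Dropping a document type also drops its namespaced fields.
        self.types.pop(prefix, None)
        for key in [k for k in self.metadata if k.startswith(prefix + ":")]:
            del self.metadata[key]


invoice = DocumentType("invoice", (
    FieldDef("customer_no", "Customer number", str),
    FieldDef("amount", "Amount", Decimal),
    FieldDef("invoice_date", "Invoice date", date),
))

doc = Document("scans/invoice.pdf")
doc.add_type(invoice)
doc.set_field("invoice:customer_no", "C-1001")
doc.set_field("invoice:amount", Decimal("119.00"))
print(sorted(doc.metadata))  # ['invoice:amount', 'invoice:customer_no']
doc.remove_type("invoice")
print(doc.metadata)          # {}
```

Because every key carries its prefix, a second type such as `cv` could define its own `candidate_no` on the same document without clashing with the invoice fields.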
And as mentioned in my previous comment:
It would be great if some standards, e.g. Dublin Core, were taken into consideration in such an implementation as well.
I think some good conceptual ideas could be derived from the approaches that ECM tools such as Alfresco or Nuxeo are taking when handling metadata by applying document types / content models and aspects.
Anybody working on this?
Why would you work out something from scratch? There are plenty of free/shareware metadata editors around. Why not integrate one which is as generic as possible, running in a web browser, producing XML files and consuming XML schemas?
For research data management, there is Zotero; perhaps there is an option to collaborate with them and provide an app for this in Nextcloud (which also connects to their desktop/mobile clients). The problem is that they want to earn money with server subscriptions…
mathiasconradt’s functional spec (Oct ’17) is exactly what I’m looking for in order to be able to use Nextcloud as a dynamic information model for all my files. I want to get away from static / inflexible folder structures, which I view as the only thing holding me back from using Nextcloud.
There is a proprietary solution called M-Files which has a really nice implementation of this, and which goes on to leverage external services for content analysis, e.g. auto-suggestion of metadata based on file content.
It would be great, say, if the Nextcloud files module were to be renamed “folders”, and a new module could be added called “objects”. The former would essentially present the existing static folder-structure view, the latter a dynamic view which would be user-malleable to fit their use case / queries.
I was thinking all that might be required for this would be to create a metadata directory based on an appropriate existing adaptive object model pattern / framework, which would be leveraged as an index to the files stored in Nextcloud (I assume each has some form of referenceable primary key?).
In terms of GUI, M-Files tries to look like a folder structure, which I don’t think is necessary. I think having a single data table displaying everything (limits / pagination notwithstanding), where you can add / remove the metadata columns you’re interested in, and then sort / filter them with Excel/Access-style quick filters, would result in a very simple yet powerful querying mechanism that MS Office users should already be pretty familiar with.
Further bells & whistles could be incorporated to give some high-level homing in on relevant object sets in the initial view. E.g. You might have a small number of high-level / common metadata types such as bucket/project and documentType. So you could quickly round down to, say, “all the invoices under project 1”. Then once you’re looking at that dataset, you might do further rounding down by more specific metadata.
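The single-table idea with selectable columns and quick filters can be sketched in a few lines of Python. This is only an illustration under assumptions: the object list, the `table_view` function and the metadata keys (`project`, `documentType`, `amount`) are hypothetical, and a real implementation would query an index rather than a list in memory.

```python
def table_view(objects, columns, filters=None, sort_by=None):
    """Return rows for the chosen columns, after exact-match filters and sorting."""
    filters = filters or {}
    rows = [
        obj for obj in objects
        if all(obj.get(key) == value for key, value in filters.items())
    ]
    if sort_by is not None:
        rows.sort(key=lambda obj: str(obj.get(sort_by, "")))
    # Objects that lack a column get an empty cell.
    return [[obj.get(col, "") for col in columns] for obj in rows]


# Hypothetical objects: one dict of metadata per stored file
objects = [
    {"name": "inv-002.pdf", "project": "project 1", "documentType": "invoice", "amount": 80},
    {"name": "cv-jane.pdf", "project": "project 1", "documentType": "cv"},
    {"name": "inv-001.pdf", "project": "project 1", "documentType": "invoice", "amount": 120},
    {"name": "inv-003.pdf", "project": "project 2", "documentType": "invoice", "amount": 45},
]

# "All the invoices under project 1", showing only two columns
view = table_view(
    objects,
    columns=["name", "amount"],
    filters={"project": "project 1", "documentType": "invoice"},
    sort_by="name",
)
for row in view:
    print(row)
# ['inv-001.pdf', 120]
# ['inv-002.pdf', 80]
```

Narrowing further by more specific metadata is then just a matter of adding another entry to `filters`.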
The key thing would be being able to create pretty extensive sets of metadata profiles (type & property sets).