I see that there is already a lot of activity in the "mediadata" app GitHub repository, but I will write down my thoughts anyway.
So, we could have an underlying metadata handling system with following features:
system default metadata fields (creator, title, date created/modified, …)
user defined metadata fields (color of the car, something stupid, something smart, …)
system default metadata profiles (e.g. text documents, photos, …)
user defined metadata profiles (e.g. scientific article)
metadata extraction and mapping
Maybe mapping is not necessary, but if we want to have consistent search then a search by "Creator" should find photos, documents and articles of that person. In this example it means that we need a common "Author/Creator" field which could be populated either from user input or extracted from photo IPTC data or from MS Word document properties.
It's the same with "Date created", "Keywords", "Summary/Description", etc.
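To make the mapping idea a bit more concrete, here is a minimal sketch of how source-specific fields could be folded into common ones. Everything in it is an assumption for illustration: the `COMMON_FIELD_SOURCES` table, the source names (`user`, `iptc`, `exif`, `docx`), the source keys, and the rule that user input wins over extracted values. None of it comes from an existing app.

```python
# Hypothetical mapping table: common field -> candidate (source, key) pairs,
# listed in priority order. The source keys are illustrative, not a complete list.
COMMON_FIELD_SOURCES = {
    "creator": [("user", "creator"), ("iptc", "By-line"), ("docx", "creator")],
    "date_created": [("user", "date_created"), ("exif", "DateTimeOriginal"), ("docx", "created")],
    "keywords": [("user", "keywords"), ("iptc", "Keywords"), ("docx", "keywords")],
}


def resolve_common_fields(raw):
    """Build the common fields from raw metadata grouped by source.

    `raw` maps a source name (e.g. "iptc") to that source's key/value pairs.
    The first non-empty value in priority order wins, so user input
    overrides anything extracted from the file (an assumption of this sketch).
    """
    resolved = {}
    for common_name, candidates in COMMON_FIELD_SOURCES.items():
        for source, key in candidates:
            value = raw.get(source, {}).get(key)
            if value:
                resolved[common_name] = value
                break
    return resolved


photo = {"iptc": {"By-line": "Jane Doe"}, "exif": {"DateTimeOriginal": "2017:06:01 10:00:00"}}
report = {"docx": {"creator": "Jane Doe", "keywords": "budget"}}

# Both files end up with the same common "creator" field,
# so one search by creator can find the photo and the document.
print(resolve_common_fields(photo)["creator"])
print(resolve_common_fields(report)["creator"])
```

With a table like this, search only ever has to look at the common fields, while the extractors stay free to use whatever names the file formats give them.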
And this can be achieved today by registering plugins with the commenting system. Read the comments; it's not straightforward to build a universal model.
The GSoC project app is headless and is focused on extracting the information from headers, but the way we send the information back to clients is sort of universal. Plugins for comments could be written to enhance the files app by letting users add new fields, but don't forget that NC 10 has a workflow engine with automatic tagging, which could be seen as some sort of metadata already.
I think what would help are some use cases and some functional requirements if you can.
Hi, does anybody know whether there was any progress regarding editing and searching of file meta data?
We'd like to be able to tag files (with another tag system than the collaborative tags) and search for these tags - and the search should allow the user to "OR" combine multiple search expressions, like article numbers, which are not contained in the file name but in the meta tag, and get a complete list of results with all articles that match one of the article numbers / search expressions.
So any solution for this would be appreciated. Thanks in advance,
~YH
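For reference, the "OR" search described above could behave roughly like the sketch below. It is only an illustration of the requested behaviour under assumed names: `search_any`, the `article_no` field and the in-memory `files` dictionary are all hypothetical, and a real implementation would query an index or database instead.

```python
def search_any(files, field_name, terms):
    """Return the names of files whose `field_name` holds at least one of `terms`.

    `files` maps a file name to its metadata, where each metadata value is a
    list of strings. Matching is case-insensitive and exact per value.
    """
    wanted = {term.lower() for term in terms}
    hits = []
    for name, metadata in files.items():
        values = {value.lower() for value in metadata.get(field_name, [])}
        if values & wanted:  # OR semantics: any overlap is a match
            hits.append(name)
    return sorted(hits)


files = {
    "drawing_a.pdf": {"article_no": ["A-1001", "A-1002"]},
    "datasheet_b.pdf": {"article_no": ["A-2001"]},
    "photo_c.jpg": {"article_no": ["A-3001"]},
    "notes.txt": {},
}

# The article numbers are not part of the file names, only of the metadata.
print(search_any(files, "article_no", ["A-1002", "A-3001"]))
```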
Generally I am asking for both of the above, although extended metadata would be more like a workaround for the lacking possibility to search system tags with an advanced query system.
Generally the system tag system would be preferred, but I couldn't find any existing solution. Is there maybe something I haven't found yet that you could point out to me?
although extended metadata would be more like a workaround for the lacking possibility to search system tags with an advanced query system
I do not agree: extended metadata would be really useful e.g. for EXIF data or for bibliographic data.
I think that a tag system (even an advanced one) would be really unfit to manage that kind of information.
In any case I proposed an improved search for tags here:
It was something I had been thinking about for some time, and your note gave me the momentum to write it down. In case you are interested, just comment or upvote.
For the extended metadata, I imagine that it is really more complex.
@alenkovich Did you ever progress this extended meta data app? I am also looking to do something similar and curious to see how you got on if you progressed this.
Thanks mate
Has anyone ever done this? It's still extremely useful (especially for pictures) to be able to search for EXIF metadata. It's somehow a bit of a shame that a simple EXIF search is not available in Nextcloud (gallery).
I think it's a great idea and definitely a useful feature for many use cases. I would actually consider this a killer feature.
It would be great if some standards were taken into consideration as well in such an implementation, e.g. Dublin Core.
I think some good conceptual ideas could be derived from the approaches that ECM tools such as Alfresco are taking when handling metadata by applying document types / content models and aspects.
Custom metadata is also a good way to store workflow related information.
The user should be able to define "document types" or "aspects", such as invoice, purchase order, CV, etc.
For such document types, a set of custom meta data fields can be defined.
Each meta data field can be of a specific type, e.g. string, date, timestamp, integer, decimal, etc.
Advanced: meta data fields can be singular or plural
The custom meta data fields should appear in the UI (document detail view) and should be editable with appropriate UI widgets (where the user has permissions)
document types can be added and dropped for a document
multiple document types can be applied to a document
if a document type is removed from a document, then the meta data fields should be removed from the document as well
technically: each document type (internally) uses a specific prefix, similar to a namespace, for its meta data fields, so that there are no conflicts. A meta data field has a unique name (for example: invoice:customer_no, cv:candidate_no) and a (localizable) label.
meta data fields should be accessible via REST API as well
meta data fields / document types can be assigned to files as well as folders
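The requirements above can be sketched as a small data model. This is only one possible reading of the list, not a design for Nextcloud: the class names (`FieldDef`, `DocumentType`, `Document`), the method names and the in-memory storage are all assumptions made for illustration. The field keys reuse the `invoice:customer_no` / `cv:candidate_no` naming from the list.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class FieldDef:
    name: str               # local name, e.g. "customer_no"
    label: str              # display label (would be localizable in a real system)
    value_type: type        # expected Python type of a single value
    multiple: bool = False  # True for plural fields (value is a list)


@dataclass
class DocumentType:
    prefix: str   # acts like a namespace, e.g. "invoice"
    fields: list  # list of FieldDef


@dataclass
class Document:
    path: str
    types: dict = field(default_factory=dict)     # prefix -> DocumentType
    metadata: dict = field(default_factory=dict)  # "prefix:name" -> value

    def add_type(self, doc_type):
        self.types[doc_type.prefix] = doc_type

    def remove_type(self, prefix):
        # Dropping a document type also drops every field in its namespace.
        self.types.pop(prefix, None)
        self.metadata = {key: value for key, value in self.metadata.items()
                         if not key.startswith(prefix + ":")}

    def set_value(self, key, value):
        prefix, _, name = key.partition(":")
        doc_type = self.types.get(prefix)
        if doc_type is None:
            raise KeyError(f"document type {prefix!r} is not applied to {self.path}")
        field_def = next((f for f in doc_type.fields if f.name == name), None)
        if field_def is None:
            raise KeyError(f"unknown field {key!r}")
        if field_def.multiple and not isinstance(value, list):
            raise TypeError(f"{key} is a plural field and expects a list")
        values = value if field_def.multiple else [value]
        if not all(isinstance(v, field_def.value_type) for v in values):
            raise TypeError(f"{key} expects {field_def.value_type.__name__}")
        self.metadata[key] = value


invoice = DocumentType("invoice", [
    FieldDef("customer_no", "Customer number", str),
    FieldDef("amount", "Amount", Decimal),
    FieldDef("invoice_date", "Invoice date", date),
])
cv = DocumentType("cv", [FieldDef("candidate_no", "Candidate number", str)])

doc = Document("scans/2017-10-invoice.pdf")
doc.add_type(invoice)  # multiple document types can be applied to one document
doc.add_type(cv)
doc.set_value("invoice:customer_no", "C-42")
doc.set_value("invoice:amount", Decimal("199.90"))
doc.set_value("cv:candidate_no", "K-7")

doc.remove_type("invoice")  # the invoice:* fields disappear with the type
print(doc.metadata)
```

The prefix does double duty here: it keeps field names from colliding and it makes "remove all fields of this type" a simple key filter.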
Use cases:
Accounting: for invoice documents, you would be able to store invoice date, amount, customer ID, etc. Maybe even automatically extracted via OCR. (For a similar use case, see the fileee.com application.)
Headhunting firms: for CV documents, you would be able to store candidate ID, notes, availability date, etc.
any custom workflow or even external workflow engines (Camunda, Flowable, IFTTT, Zapier, etc.) would be able to store information in these fields, if not in their own.
And as mentioned in my previous comment:
It would be great if some standards were taken into consideration as well in such an implementation, e.g. Dublin Core.
I think some good conceptual ideas could be derived from the approaches that ECM tools such as Alfresco or Nuxeo are taking when handling metadata by applying document types / content models and aspects.
Anybody working on this?
Why would you work out something from scratch? There are plenty of free/shareware metadata editors around. Why not integrate one which is as generic as possible? Running in a web browser, producing XML files and consuming XML schemas.
Yes absolutely! That would be fantastic!!! Creating custom metadata fields for files would be very useful. I would love to replace fileee with Nextcloud + this functionality!
For research data management there is Zotero; perhaps there is an option to collaborate with them and provide an app for this in Nextcloud (which also connects to their desktop/mobile clients). The problem is that they want to earn money with server subscriptions…
ref:
mathiasconradt's functional spec (Oct '17) is exactly what I'm looking for in order to be able to use Nextcloud as a dynamic information model for all my files. I want to get away from static / inflexible folder structures, which I view as the only thing holding me back from using Nextcloud.
There is a proprietary solution called M-Files which has a really nice implementation of this, and which goes on to leverage external services for content analysis, e.g. auto-suggestion of metadata based on file content.
It would be great, say, if the Nextcloud files module were to be renamed "folders", and a new module could be added called "objects". The former would essentially present the existing static folder-structure view, the latter a dynamic view which would be user-malleable to fit their use case / queries.
I was thinking all that might be required for this would be to create a metadata directory based on an appropriate existing adaptive object model pattern / framework, which would be leveraged as an index to the files stored in Nextcloud (I assume each has some form of referenceable primary key?).
In terms of GUI, M-Files tries to look like a folder structure, which I don't think is necessary. I think having a single data table displaying everything (limits / pagination notwithstanding), where you can add / remove the metadata columns you're interested in, and then sort / filter them with Excel/Access style quick filters, would result in a very simple yet powerful querying mechanism that MS Office users should already be pretty familiar with.
Further bells & whistles could be incorporated to help home in on relevant object sets in the initial view. E.g. you might have a small number of high-level / common metadata types such as bucket/project and documentType, so you could quickly narrow down to, say, "all the invoices under project 1". Then, once you're looking at that dataset, you might narrow down further by more specific metadata.
The key thing would be being able to create pretty extensive sets of metadata profiles (type & property sets).
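As a rough illustration of the table idea (pick columns, quick-filter, sort), here is a minimal sketch. The `narrow` and `table` helpers, the sample rows and the `amount` column are assumptions made up for the example. Only `project` and `documentType` come from the description above.

```python
def narrow(rows, **criteria):
    """Keep only the rows whose metadata equals every given criterion."""
    return [row for row in rows
            if all(row.get(key) == value for key, value in criteria.items())]


def table(rows, columns, sort_by=None):
    """Project rows onto the chosen metadata columns, optionally sorted.

    Rows that lack the sort column are placed last.
    """
    if sort_by is not None:
        rows = sorted(rows, key=lambda row: (row.get(sort_by) is None, row.get(sort_by)))
    return [[row.get(column) for column in columns] for row in rows]


rows = [
    {"name": "inv-001.pdf", "project": "project 1", "documentType": "invoice", "amount": 120},
    {"name": "inv-002.pdf", "project": "project 1", "documentType": "invoice", "amount": 80},
    {"name": "cv-smith.pdf", "project": "project 1", "documentType": "cv"},
    {"name": "inv-003.pdf", "project": "project 2", "documentType": "invoice", "amount": 300},
]

# First narrow by the high-level metadata: "all the invoices under project 1".
invoices = narrow(rows, project="project 1", documentType="invoice")

# Then show only the columns of interest, sorted like a spreadsheet quick filter.
for line in table(invoices, ["name", "amount"], sort_by="amount"):
    print(line)
```

Because each step just returns a smaller list of rows, further narrowing by more specific metadata is simply another `narrow` call on the previous result.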