Files extended metadata

Ok, I got the point :slight_smile:

I see that there is already a lot of activity in the “mediadata” app GitHub repository, but I will write down my thoughts anyway :slight_smile:

So, we could have an underlying metadata handling system with the following features:

  • system default metadata fields (creator, title, date created/modified)
  • user defined metadata fields (color of the car, something stupid, something smart)
  • system default metadata profiles (eg. text documents, photos)
  • user defined metadata profiles (eg. scientific article)
  • metadata extraction and mapping

Maybe mapping is not necessary, but if we want consistent search, then a search by “Creator” should find the photos, documents and articles of that person. In this example, it means that we need a common “Author/Creator” field which could be populated either from user input or by extraction from photo IPTC data or from MS Word document properties.

It’s the same with “Date created”, “Keywords”, “Summary/Description” etc.
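To make the mapping idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the source key names in `FIELD_MAP` are placeholders rather than a verified list of IPTC or Office property names, and `to_common`, `merge` and `search_by_creator` are hypothetical helpers, not part of Nextcloud or of any existing app.

```python
# Hypothetical mapping from source-specific keys to common field names.
# The source keys are illustrative placeholders, not a checked list of
# real IPTC or MS Word property names.
FIELD_MAP = {
    "iptc": {"by-line": "creator", "keywords": "keywords", "caption": "description"},
    "office": {"author": "creator", "keywords": "keywords", "subject": "description"},
}


def to_common(source: str, raw: dict) -> dict:
    """Translate extracted metadata into the common field names."""
    mapping = FIELD_MAP.get(source, {})
    common = {}
    for key, value in raw.items():
        target = mapping.get(key.lower())
        if target is not None:
            common[target] = value
    return common


def merge(extracted: dict, user_input: dict) -> dict:
    """Combine both sources; values typed in by the user win over extracted ones."""
    return {**extracted, **user_input}


def search_by_creator(files: dict, name: str) -> list:
    """Return the paths of all files whose common 'creator' field matches."""
    return sorted(
        path for path, meta in files.items()
        if meta.get("creator", "").lower() == name.lower()
    )


files = {
    "holiday.jpg": to_common("iptc", {"By-line": "Jane Doe", "Keywords": "beach"}),
    "report.docx": to_common("office", {"Author": "Jane Doe", "Subject": "Q3"}),
    "notes.txt": merge({}, {"creator": "John Roe"}),
}

print(search_by_creator(files, "jane doe"))
```

With a common field like this, one search by creator covers the photo and the document alike, regardless of where the value originally came from.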

I had a similar idea earlier this year:

and this can be achieved today by registering plugins with the commenting system. Read the comments; it’s not straightforward to build a universal model.

The GSoC project app is headless and is focused on extracting the information from headers, but the way we send the information back to clients is sort of universal. Plugins for comments could be written to enhance the files app by letting users add new fields, but don’t forget that NC 10 has a workflow engine with automatic tagging which could be seen as some sort of metadata already.

I think what would help are some use cases and some functional requirements if you can.

Hi, does anybody know whether there was any progress regarding editing and searching of file meta data?

We’d like to be able to tag files (with another tag system than the collaborative tags) and search for these tags - and the search should allow the user to “OR” combine multiple search expressions, like article numbers, which are not contained in the file name but in the meta tag, and get a complete list of results with all articles that match one of the article numbers / search expressions.
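A rough sketch of the “OR” search described above, written in Python purely for illustration. The `article_no` field name, the sample files and the `search_any` function are all assumptions made up for this example; nothing here reflects an existing Nextcloud search API.

```python
def search_any(files: dict, field: str, terms: list) -> list:
    """Return every file whose metadata field matches at least one term (OR)."""
    wanted = {t.strip().lower() for t in terms if t.strip()}
    hits = []
    for path, meta in files.items():
        values = meta.get(field, [])
        if isinstance(values, str):
            values = [values]  # allow single-valued fields as well
        if wanted & {v.lower() for v in values}:
            hits.append(path)
    return sorted(hits)


# Hypothetical files; the article numbers live in the metadata, not the file name.
files = {
    "specs/a.pdf": {"article_no": ["100-23", "100-24"]},
    "specs/b.pdf": {"article_no": ["200-01"]},
    "specs/c.pdf": {"article_no": "300-77"},
    "specs/d.pdf": {},
}

print(search_any(files, "article_no", ["100-24", "300-77"]))
```

The result is one complete list containing every file that matches any of the given article numbers.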

So any solution for this would be appreciated. Thanks in advance,
~YH

It is not really clear to me what you are proposing.

I see the following possibilities:

  • you are asking for Files extended metadata and the possibility to search it with an advanced query system
  • you are asking for the possibility to search system tags with an advanced query system
  • you are asking for both of the above

Maybe a clarification would help.


About Files extended metadata, consider the information here:

Generally I am asking for both of the above, although extended metadata would be more like a workaround for the lacking possibility to search system tags with an advanced query system.

Generally the system tag system would be preferred, but I couldn’t find any existing solution. Is there maybe something I haven’t found yet that you could point out to me?

Thank you for your reply!
Best regards,
~YH

although extended metadata would be more like a workaround for the lacking possibility to search system tags with an advanced query system

I do not agree: extended metadata would be really useful e.g. for EXIF data or for bibliographic data.

I think that a tag system (even an advanced one) would be really unfit to manage that kind of information.

In any case I proposed an improved search for tags here:

It was something I had been thinking about for some time, and your note gave me the momentum to write it down. In case you are interested, just comment or upvote.

For the extended metadata, I imagine that it is really more complex.

Hope that helps!

@alenkovich Did you ever progress this extended meta data app? I am also looking to do something similar and curious to see how you got on if you progressed this.
Thanks mate

Hi,

Has anyone ever done this? It’s still extremely useful (especially for pictures) to be able to search for EXIF metadata. It’s somehow a bit of a shame that a simple EXIF search is not available in Nextcloud (gallery).

I believe well-done metadata is tough


You’d love exif metadata, I’d need bibliographic metadata (which is a world of its own), she requires metadata for music, 


I do understand the utility of metadata, yet I think it is not easy at all to tackle


I think it’s a great idea and definitely a useful feature for many use cases. I would actually consider this a killer feature.

Would be great if some standards would be taken into consideration as well in such an implementation, e.g. Dublin Core.

I think some good conceptual ideas could be derived from the approaches that ECM tools such as Alfresco are taking when handling metadata by applying document types / content models and aspects.

Custom metadata is also a good way to store workflow related information.

I would propose the following conceptual ideas:

  • The user should be able to define “document types” or “aspects”, such as invoice, purchase order, CV, etc.
  • For such document types, a set of custom meta data fields can be defined.
  • Each meta data field can be of a specific type, e.g. string, date, timestamp, integer, decimal, etc.
  • Advanced: meta data fields can be singular or plural (single-valued or multi-valued)
  • The custom meta data fields should appear in the UI (document detail view) and should be editable with appropriate UI widgets (where the user has permissions)
  • document types can be added and dropped for a document
  • multiple document types can be applied to a document
  • if a document type is removed from a document, then the meta data fields should be removed from the document as well
  • technically: each document type (internally) uses a specific prefix, similar to a namespace, for its meta data fields, so that there are no conflicts. A meta data field has a unique name (for example: invoice:customer_no, cv:candidate_no) and a (localizable) label.
  • meta data fields should be accessible via REST API as well
  • meta data fields / document types can be assigned to files as well as folders
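The list above could be modelled roughly as follows. This is a minimal Python sketch under assumptions, not a proposal for the actual implementation: the class names (`FieldDef`, `DocumentType`, `Document`), the method names and the sample fields are all hypothetical. It only covers typed fields, the prefix as a namespace, multiple document types per document, and dropping the fields when a type is removed.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class FieldDef:
    name: str               # unqualified name, e.g. "customer_no"
    py_type: type           # expected value type
    label: str              # (localizable) label shown in the UI
    multiple: bool = False  # True for plural (multi-valued) fields


@dataclass(frozen=True)
class DocumentType:
    prefix: str    # acts as the namespace, e.g. "invoice"
    fields: tuple  # tuple of FieldDef


@dataclass
class Document:
    path: str
    types: dict = field(default_factory=dict)     # prefix -> DocumentType
    metadata: dict = field(default_factory=dict)  # "prefix:name" -> value

    def add_type(self, doc_type: DocumentType) -> None:
        self.types[doc_type.prefix] = doc_type

    def remove_type(self, prefix: str) -> None:
        """Drop the document type and every field in its namespace."""
        self.types.pop(prefix, None)
        self.metadata = {
            k: v for k, v in self.metadata.items()
            if not k.startswith(prefix + ":")
        }

    def set_field(self, key: str, value) -> None:
        prefix, _, name = key.partition(":")
        doc_type = self.types.get(prefix)
        if doc_type is None:
            raise KeyError(f"document type '{prefix}' is not applied")
        field_def = next((f for f in doc_type.fields if f.name == name), None)
        if field_def is None:
            raise KeyError(f"unknown field '{key}'")
        if field_def.multiple and not isinstance(value, list):
            raise TypeError(f"'{key}' expects a list of values")
        values = value if field_def.multiple else [value]
        if not all(isinstance(v, field_def.py_type) for v in values):
            raise TypeError(f"'{key}' expects {field_def.py_type.__name__}")
        self.metadata[key] = value


# Hypothetical document types taken from the examples in the list above.
INVOICE = DocumentType("invoice", (
    FieldDef("customer_no", str, "Customer no."),
    FieldDef("amount", Decimal, "Amount"),
))
CV = DocumentType("cv", (FieldDef("candidate_no", int, "Candidate no."),))

doc = Document("scan-0001.pdf")
doc.add_type(INVOICE)
doc.add_type(CV)
doc.set_field("invoice:customer_no", "C-42")
doc.set_field("invoice:amount", Decimal("19.99"))
doc.set_field("cv:candidate_no", 7)
doc.remove_type("invoice")  # also removes both invoice:* fields
print(doc.metadata)
```

Because every key carries its prefix, two document types can both define a field called, say, `date` without clashing, and removing a type only has to delete the keys in its own namespace.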

Use cases:

  1. Accounting: for an invoice document, you would be able to store invoice date, amount, customer id, etc. Maybe even automatically extracted via OCR. (For a similar use case, see the fileee.com application)
  2. Headhunting firms: for CV documents, you would be able to store candidate ID, notes, availability date, etc.
  3. any custom workflow or even external workflow engines (Camunda, Flowable, IFTTT, Zapier, etc.) would be able to store information in these fields, if not in their own.

And as mentioned in my previous comment:

Would be great if some standards would be taken into consideration as well in such an implementation, e.g. Dublin Core.

I think some good conceptual ideas could be derived from the approaches that ECM tools such as Alfresco or Nuxeo are taking when handling metadata by applying document types / content models and aspects.

Anybody working on this?
Why would you work out something from scratch? There are plenty of free/shareware metadata editors around. Why not integrate one which is as generic as possible? Running in a web browser, producing XML files and consuming XML schemas.

Anybody?
For research data management, proper metadata management is not an option or a nice-to-have. It is a MUST-have.

Yes absolutely! That would be fantastic!!! Creating custom metadata fields for files would be very useful. I would love to replace fileee with Nextcloud + this functionality!

To set up a DMS, this is absolutely a must-have.

Give it +1
Any ideas on the way for that?

I would start with this: https://www.dataone.org/software-tools/tags/metadata_editor

For research data management, there is Zotero; perhaps there is an option to collaborate with them and provide an app for this in Nextcloud (which also connects to their desktop/mobile clients). The problem is that they want to earn money with server subscriptions.

ref:

Maybe this kind of approach would be nice and light (though the example is specific)
https://www.dataone.org/software-tools/matt-metadata-authoring-tool

Has there been any progress made on this?

mathiasconradt’s functional spec (Oct ’17) is exactly what I’m looking for in order to be able to use Nextcloud as a dynamic information model for all my files. I want to get away from static / inflexible folder structures, which I view as the only thing holding me back from using Nextcloud.

There is a proprietary solution called M-Files which has a really nice implementation of this, which goes on to leverage external services for content analysis, e.g. auto-suggestion of metadata based on file content.

It would be great, say, if the Nextcloud files module were to be renamed “folders”, and a new module could be added called “objects”. The former would essentially present the existing static folder-structure view, the latter a dynamic view which would be user-malleable to fit their use case / queries.

I was thinking all that might be required for this would be to create a metadata directory based on an appropriate existing adaptive object model pattern / framework, which would be leveraged as an index to the files stored in Nextcloud (I assume each has some form of referenceable primary key?).

In terms of GUI, M-Files tries to look like a folder structure, which I don’t think is necessary. I think having a single data table displaying everything (limits / pagination notwithstanding), where you can add / remove the metadata columns you’re interested in, and then sort / filter them with Excel/Access-style quick filters, would result in a very simple yet powerful querying mechanism that MS Office users should already be pretty familiar with.

Further bells & whistles could be incorporated to give some high-level homing in on relevant object sets in the initial view. E.g. you might have a small number of high-level / common metadata types such as bucket/project and documentType, so you could quickly narrow down to, say, “all the invoices under project 1”. Then once you’re looking at that dataset, you might narrow down further by more specific metadata.
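As a rough illustration of the table idea, here is a small Python sketch. Everything in it is assumed for the example: the `filter_rows` and `table` helpers, the sample objects and the metadata keys (`project`, `documentType`, `amount`) are made up and do not correspond to any existing Nextcloud module.

```python
def filter_rows(objects: list, **criteria) -> list:
    """Quick filter: keep objects whose metadata equals every given criterion."""
    return [o for o in objects if all(o.get(k) == v for k, v in criteria.items())]


def table(objects: list, columns: list, sort_by: str = None) -> list:
    """Project objects onto the chosen columns, optionally sorted by one of them."""
    rows = list(objects)
    if sort_by is not None:
        rows.sort(key=lambda o: str(o.get(sort_by, "")))
    return [[o.get(c, "") for c in columns] for o in rows]


# Hypothetical objects; each dict stands for one file plus its metadata.
objects = [
    {"name": "inv-002.pdf", "project": "project 1", "documentType": "invoice", "amount": "250.00"},
    {"name": "cv-smith.pdf", "project": "project 1", "documentType": "cv"},
    {"name": "inv-001.pdf", "project": "project 1", "documentType": "invoice", "amount": "99.50"},
    {"name": "inv-003.pdf", "project": "project 2", "documentType": "invoice", "amount": "10.00"},
]

# "All the invoices under project 1", shown with just the columns of interest.
invoices = filter_rows(objects, project="project 1", documentType="invoice")
for row in table(invoices, ["name", "amount"], sort_by="name"):
    print(row)
```

The folder a file lives in plays no role here; the view is defined entirely by which metadata columns are shown and which filters are applied.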

The key thing would be being able to create pretty extensive sets of metadata profiles (type & property sets).