This new release supports even more storage: Nextant now indexes your federated files, so the app will index all the files of your cloud:
external storage
server-side encrypted storage
federated cloud
your bookmarks
The sharing rights are also indexed, meaning that you can search within locally shared documents (or across any type of storage). Even your public links can now offer search to anonymous visitors.
Changelog
search within federated documents
search within shared-link documents, both public and private (w/ key)
admin can choose to add files and directory structure to improve search
Wow. I’m already impressed (and happy) that Nextant gives me the Evernote-like ability to search images for specific phrases. Does the new audio file support actually do speech recognition in the same way (which would be super impressive), or is it more of a scan-the-metadata kind of operation? (Still useful.)
Also, is there any chance of limiting OCR/speech recognition to files with specific tags? I have many gigabytes of family photos with no text in them, and it seems a waste of resources to OCR all of them to get the few images I do have with text…
We’re using Solr, and Solr is using Tika to extract data:
Tika can detect several common audio formats and extract metadata from them. Even text extraction is supported for some audio files that contain lyrics or other textual content. The AudioParser and MidiParser classes use standard javax.sound features to process simple audio formats, and the Mp3Parser class adds support for the widely used MP3 format.
That feature is interesting, but would it still work for you if you could instead add a list of tags for files you don’t want extracted? If so, please open an issue.
It’s really only useful to me to be able to add tags to the files I do want extracted. Doing it the other way around means I’d need to tag around 50,000 files, just to get it to only scan around 20.
You should still file an issue, I’ll find a way. But with your kind of setup, it will require some complex configuration: I guess you want to filter on image AND tag, but still index the rest of your text documents with no tag?
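To make that rule concrete, here is a minimal Python sketch of the filter as described above: images are extracted only when they carry an opt-in tag, while every other file is indexed regardless of tags. This is only an illustration of the logic, not how Nextant is implemented; the `FileEntry` record, the `should_extract` function, and the `ocr` tag name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    # Hypothetical record for illustration; not Nextant's actual data model.
    path: str
    mimetype: str
    tags: set = field(default_factory=set)


def should_extract(entry, opt_in_tags):
    """Return True if the file's content should be sent for extraction."""
    if entry.mimetype.startswith("image/"):
        # Images are extracted only when they carry at least one opt-in tag.
        return bool(entry.tags & opt_in_tags)
    # Everything else (text documents, PDFs, ...) is indexed with or without tags.
    return True


# Example: only the tagged image and the text file would be extracted.
opt_in = {"ocr"}  # assumed tag name
files = [
    FileEntry("photos/holiday.jpg", "image/jpeg"),
    FileEntry("scans/receipt.png", "image/png", {"ocr"}),
    FileEntry("notes/todo.txt", "text/plain"),
]
selected = [f.path for f in files if should_extract(f, opt_in)]
```

With this opt-in approach, only the handful of images that contain text need a tag, instead of tagging the tens of thousands that do not.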