This does help... here is what I tried to do:
1) Created a new index in Aracne called "wiki_index", using the default schema and default analyzer
2) Created a new job in Scheduler called "wiki_index_job" and told it to index a web site with a depth of 1
Ran the job; it completed successfully but didn't save any of the data
3) I modified the wiki_index_job and added an ARN-Index exporter using:
filter_sequence: arn_default
data source: arnIndex
index name: wiki_index
clear index: TRUE
4) I re-ran wiki_index_job; it completed successfully and stored 38 tuples in the index
5) I went back to the Aracne administration tool > AracneIndexer and searched my index using the "Search in index" option. The search works, and "INDEXSCORE" seems to be populated dynamically based on the search keywords and the relevance of the items found in the index (I'm having to guess what INDEXSCORE is because I cannot find it in any of your documentation).
6) I went to Denodo VDP and created a new data source, connecting it to my index server (NOT the crawler)... port 9000... and then I created a new base view.
7) I read the information about creating MAIN TERMS for a document in the Denodo Virtual Data Port administration guide... this was interesting, but I'm not sure I would use the main terms when trying to search for data. I decided to create main terms for the "content" field.
8) Now I have a base view pointing at my "wiki_index" and a content_MAIN_TERM field with an array of registers. I decided to execute a query against my "wiki_index" base view:
SELECT * FROM wiki_index WHERE content contains 'middleware'
I get EXTREMELY similar results to #5 above; however, the index score is different. Why is the index score different? How can I reproduce the "Search in index" functionality that exists via /webadmin/denodo-aracne-admin, but from VDP? I'd love a simple way to MIMIC the simplicity of the keyword search (I can't seem to make the "contains" operator work exactly the same, especially when searching for multiple keywords).
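To make the multi-keyword case concrete, here is a minimal Python sketch of the kind of query I would expect to need. It simply ANDs one "contains" condition per keyword, each shaped like the query above. The helper name build_contains_query and the second keyword 'server' are made up for illustration, and I am assuming (not confirmed by the documentation) that AND-ing conditions this way is valid and that single quotes are escaped by doubling them. It only builds the query text; it does not reproduce how "Search in index" scores results.

```python
def build_contains_query(view, field, keywords):
    """Build a query string that ANDs one `contains` condition per keyword.

    Hypothetical helper for illustration only; it is not part of any Denodo API.
    """
    conditions = []
    for keyword in keywords:
        # Assumption: single quotes inside a literal are escaped by doubling them.
        escaped = keyword.replace("'", "''")
        conditions.append(f"{field} contains '{escaped}'")
    if not conditions:
        # No keywords: return the unfiltered query.
        return f"SELECT * FROM {view}"
    return f"SELECT * FROM {view} WHERE " + " AND ".join(conditions)


# 'server' is a made-up second keyword, used only to show the shape of the query.
query = build_contains_query("wiki_index", "content", ["middleware", "server"])
print(query)
```

With a single keyword this yields exactly the query from step 8, so any difference in results would come from how the conditions are combined, not from the helper.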
Finally, I can see how to index a web page, but how do I index Word, Excel, and PowerPoint documents? I see that I can upload my documents to a web server and then point the crawler (via a Scheduler job) at that web server. Is this the only way?
THANK YOU~!