[MarkLogic Dev General] Require suggestions to load and search word
docs
Sukhendra Rai
sukhendra.rai at globallogic.com
Wed Jun 13 05:22:02 PDT 2007
Hi,
I am familiarizing myself with MarkLogic Server and XQuery.
I have to store (load) Word documents in the server.
I want to search these documents for particular keywords.
I would appreciate suggestions on the best way to load and search
these documents in MarkLogic Server.
Going through chapter 11 of the developer guide, I found three formats: XML,
binary, and text. I used xdmp:document-load to load the .doc files. If I
use XML or text in the <format> parameter of xdmp:document-load, an
error is generated stating that my document is not in UTF-8 format,
while the binary format works fine. In my opinion, a Word document
stored in binary format cannot be searched efficiently.
xdmp:document-load does not seem to convert a document from another
type to XML automatically. Is there a function that does
this?
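For reference, a minimal sketch of the kind of load call described above. The file path and target URI are placeholders, not the actual values used; changing <format> to xml or text is what produces the UTF-8 error.

```xquery
(: Sketch only: "c:\docs\sample.doc" and "/docs/sample.doc" are
   hypothetical placeholders for the real file path and document URI. :)
xdmp:document-load("c:\docs\sample.doc",
  <options xmlns="xdmp:document-load">
    <uri>/docs/sample.doc</uri>
    <!-- binary loads without error; xml or text fails on a .doc file -->
    <format>binary</format>
  </options>)
```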
I found the xdmp:word-convert
<file:///C:\Documents%20and%20Settings\sukhendra.rai\Desktop\markLogic\MarkLogic_3.2_pubs\pubs\apidocs\Document-Conversion.html#word-convert>
function, which converts a Word document to XHTML. If I need to
store the .doc files as XHTML for better search performance, should I
first convert them and then store them in the server?
Thanks,
Sukhendra Rai