[MarkLogic Dev General] document format

Mike Sokolov sokolov at ifactory.com
Mon Mar 31 09:34:18 PST 2008


I have been trying to come up with a way to determine the "format" of a 
document in MarkLogic. The only API call that seems directly related is 
xdmp:document-uri-format, but it seems to operate on the URI alone, without 
any reference to the contents of the document.  Instead, I tried testing:

node-kind(doc($uri)/node()[1])


but we just found an XML document for which this returns "text". 
Apparently it has a BOM at the start, so the document node has two child 
nodes: a text node (containing the BOM) and an element (the root element). 
Presumably comments and processing instructions could appear there too, 
so testing only the first child is clearly flawed.
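
A variation that looks at every child of the document node, rather than 
only the first, could be sketched as below. This is only a sketch: it 
assumes $uri is already bound to a document URI, it relies on MarkLogic's 
binary() node test, and it has not been checked against the BOM document 
described above.

```xquery
(: Sketch: classify a document by the kinds of children its document
   node has. $uri is assumed to be bound by the caller. :)
let $doc := doc($uri)
return
  (: Any element child means XML, even if a stray text node (such as
     a BOM), a comment or a processing instruction comes first. :)
  if ($doc/element()) then "xml"
  (: binary() is a MarkLogic extension node test for binary nodes. :)
  else if ($doc/binary()) then "binary"
  (: Only text children remain: treat it as a text document. :)
  else if ($doc/text()) then "text"
  (: No document at this URI, or no recognizable children. :)
  else ()
```

Checking for an element first is a deliberate choice here, so that leading 
text, comments or processing instructions do not change the answer.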

Does anybody have a good way to determine whether a document in 
MarkLogic is an XML document, a text document or a binary document?

-Mike
 

