You want to know the size of a non-binary document.
Applies to MarkLogic versions 7+
If we want to know the size of a binary, it’s easy: xdmp:binary-size($doc).
However, there isn’t a special function for non-binary documents. Instead, we can convert the content to a binary node and measure the size of that:
var doc = fn.doc('/test-uri1.xml');
xdmp.binarySize(fn.head(xdmp.unquote(xdmp.quote(doc), null, "format-binary")).root);
let $doc := fn:doc("/test-uri1.xml")
return xdmp:binary-size(xdmp:unquote(xdmp:quote($doc), (), "format-binary")/binary())
You might be tempted to use fn:string-length(). In some cases, that will work. The critical thing to realize is that fn:string-length() gives you the number of characters, not the size of the document in bytes. These can differ, depending on the characters in the document. For instance, consider this simple example:
var doc = { zoo: { lions: "die Löwen" } };
var result = {
  "string": fn.stringLength(xdmp.quote(doc)),
  "binary": xdmp.binarySize(fn.head(xdmp.unquote(xdmp.quote(doc), null, "format-binary")).root)
};
// The last expression is the value the script returns.
result;
let $doc :=
  <zoo>
    <lions lang="de">die Löwen</lions>
  </zoo>
return (
  fn:string-length(xdmp:quote($doc)),
  xdmp:binary-size(xdmp:unquote(xdmp:quote($doc), (), "format-binary")/binary())
)
This returns (45, 46) for the XML document and {"string": 29, "binary": 30} for the JSON document. Why? The ö is a single character, but it takes two bytes when encoded as UTF-8.
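The same gap between characters and bytes can be reproduced outside MarkLogic. The following is a minimal sketch in plain JavaScript (Node.js or a modern browser), not MarkLogic server-side code. It assumes the standard TextEncoder API is available and uses JSON.stringify as a stand-in for xdmp.quote, so whitespace in the serialized form may differ from what MarkLogic produces for other documents.

```javascript
// Character count versus UTF-8 byte count for the JSON sample document.
// "\u00F6" is the precomposed letter ö, written as an escape to avoid
// depending on how this source file is encoded.
const doc = { zoo: { lions: "die L\u00F6wen" } };

// Assumption: JSON.stringify stands in for xdmp.quote here.
const serialized = JSON.stringify(doc);

// String length counts UTF-16 code units; for this text that equals
// the number of characters.
const charCount = serialized.length;

// TextEncoder always encodes to UTF-8, so the array length is the byte size.
const byteCount = new TextEncoder().encode(serialized).length;

console.log({ charCount, byteCount }); // { charCount: 29, byteCount: 30 }
```

The one-byte difference comes entirely from the ö. A document with more non-ASCII characters would show a correspondingly larger gap.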
Note that while the example code uses XML sample input documents, it works just the same for JSON.