Problem

You want to know the size of a non-binary document.

Solution

Applies to MarkLogic versions 7+

If we want to know the size of a binary, it’s easy: xdmp:binary-size($doc).

However, there isn’t a special function for non-binary documents. Instead, we can convert the content to a different format where we can measure the size:

Server-Side JavaScript:

var doc = fn.doc('/test-uri1.xml');
xdmp.binarySize(fn.head(xdmp.unquote(xdmp.quote(doc), null, "format-binary")).root)

XQuery:

let $doc := fn:doc("/test-uri1.xml")
return
  xdmp:binary-size(xdmp:unquote(xdmp:quote($doc),(),"format-binary")/binary())

Discussion

You might be tempted to use fn:string-length(). In some cases, that will work. The critical thing to realize is that fn:string-length() gives you the number of characters, not the size of the document. These can be different, depending on the characters in the document. For instance, consider this simple example:

Server-Side JavaScript:

var doc = {
  zoo: {
    lions: "die Löwen"
  }
};

// Compare the character count with the size in bytes.
var result = {
  "string": fn.stringLength(xdmp.quote(doc)),
  "binary": xdmp.binarySize(fn.head(xdmp.unquote(xdmp.quote(doc), null, "format-binary")).root)
};
result;

XQuery:

let $doc := 
  <zoo>
    <lions lang="de">die Löwen</lions>
  </zoo>
return (
  fn:string-length(xdmp:quote($doc)),
  xdmp:binary-size(xdmp:unquote(xdmp:quote($doc),(),"format-binary")/binary())
)

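This returns (45, 46) for the XML document and {"string": 29, "binary": 30} for the JSON document. Why? The ö is a single character, but it takes two bytes when encoded as UTF-8, so the size in bytes is one more than the character count.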
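The gap between character count and byte size is not specific to MarkLogic. As a rough illustration, the sketch below reproduces the JSON numbers in plain JavaScript (Node.js), outside the server. The helper names countCharacters and utf8ByteLength are hypothetical, and the sketch assumes the ö is the single precomposed code point U+00F6 and that the serialized JSON has no extra whitespace.

```javascript
// Plain JavaScript (Node.js), not MarkLogic Server-Side JavaScript.
// countCharacters and utf8ByteLength are hypothetical helper names.

// Number of Unicode code points in the string.
function countCharacters(text) {
  return Array.from(text).length;
}

// Number of bytes the string occupies when encoded as UTF-8.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// "\u00F6" is the precomposed letter ö, written as an escape to avoid ambiguity.
const json = JSON.stringify({ zoo: { lions: "die L\u00F6wen" } });

console.log(countCharacters(json)); // 29
console.log(utf8ByteLength(json));  // 30
```

Every ASCII character contributes one byte, while the ö contributes two, which accounts for the difference of one.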

Note that the same approach works for both XML and JSON documents.

