[MarkLogic Dev General] Treating elements as byte strings
Geert.Josten at daidalos.nl
Thu Oct 7 12:47:09 PDT 2010
David makes good points. You will need to define how you want to calculate the size of the XML. You could agree that the size is always that of the 'normalized' XML (tidied, unquoted/quoted, whitespace stripped, whatever). You could also simply accept that namespace declarations get inserted; you can more or less predict how much is added and perhaps even compensate for it. You could also *try* to string-replace the namespace declarations out of the serialized XML, but I recommend against that. Finally, you could follow David's idea of preserving a text copy of the document, or of the relevant part of it, and use that for the size count.
I'd say that simply living with a size that is a few bytes off would be acceptable in most cases.
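To get a feel for how far off the count is, something like the following could be run against Karl's example. It is only a rough sketch: $saved-text is a hypothetical variable standing in for a text copy captured before parsing, and fn:string-length counts characters rather than bytes, so the two differ as soon as the data contains non-ASCII characters.

```xquery
xquery version "1.0-ml";
declare namespace ns = "namespace";

(: Hypothetical: the text of the relevant part as it was originally
   written, saved by whichever module first sees the data. :)
let $saved-text := "<ns:xml>hi</ns:xml>"

(: The same data once it has become an element node. :)
let $elem := <xml><ns:xml>hi</ns:xml></xml>

(: Serializing the child makes it self-contained, so the namespace
   declaration is added to it. :)
let $serialized := xdmp:quote($elem/*)

return (
  fn:string-length($saved-text),   (: size of the text copy :)
  fn:string-length($serialized),   (: size after re-serialization :)
  (: overhead introduced by the inserted namespace declaration :)
  fn:string-length($serialized) - fn:string-length($saved-text)
)
```

The third value is the overhead you would either accept or compensate for; if the spec really needs the exact figure, only the saved text copy gives it.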
drs. G.P.H. (Geert) Josten
2665 JZ Bleiswijk
T +31 (0)10 850 1200
F +31 (0)10 850 1199
mailto:geert.josten at daidalos.nl
> From: general-bounces at developer.marklogic.com
> [mailto:general-bounces at developer.marklogic.com] On Behalf Of
> Karl Erisman
> Sent: Thursday 7 October 2010 20:27
> To: General Mark Logic Developer Discussion
> Subject: [MarkLogic Dev General] Treating elements as byte strings
> I would like to take an element node and treat part of it as
> a string in the same way it was originally declared (lexical
> equivalence, not just semantic equivalence). Here is an
> example that does NOT do what I want:
> declare namespace ns="namespace";
> let $elem := <xml><ns:xml>hi</ns:xml></xml>
> return xdmp:quote($elem/*)
> => <ns:xml xmlns:ns="namespace">hi</ns:xml>
> This returns a string representing semantically equivalent
> XML, but it differs lexically from the original.
> After $elem is stored as an element node, only its tree
> structure is stored, correct? So the only way for me to do
> what I'm describing would be for *me* to save the string form
> of the element at the time it is declared. Is this correct?
> BTW: As background, the reason I need to do this is to comply
> with a spec that requires computing the "size" of incoming
> data, which may or may not be XML (and the "size" is specific
> to the way the XML is declared -- it is lexically
> significant). The data is sent as part of a larger XML
> element, and by the time it arrives at the module responsible
> for checking the size, it is already in XML. This is fine
> for text nodes (fn:string-length gives the "size"), but not
> for element nodes. If my understanding is correct, I'll need
> to make modifications to lower-level modules so the original
> XML is available.