[MarkLogic Dev General] Can validation of XML docs in a zipfile extraction be disabled?
Geert Josten
Geert.Josten at daidalos.nl
Tue Oct 26 06:56:27 PDT 2010
Hi Tim,
You can pass a format option to xdmp:zip-get to extract that specific file as text instead of XML. Details are in the API description: http://developer.marklogic.com/pubs/4.2/apidocs/Document-Conversion.html#xdmp:zip-get
Fixing the file with string manipulation is not recommended, though. The repair option may be a better choice. Best of course would be to fix the problem at the source, but that may not be possible in your case.
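As a rough sketch of both approaches: the zip URI, the entry name, and the target URIs below are made up for illustration, and the option names are as I read them in the API description linked above, so please check them against your server version. Whether the repair option can actually cope with stray text after the root element is something I have not verified, hence the try/catch.

```xquery
xquery version "1.0-ml";

(: Hypothetical values: adjust the zip URI and entry name to your data. :)
let $zip-uri := "/zips/batch.zip"
let $entry   := "docs/broken.xml"
let $zip     := fn:doc($zip-uri)

(: Option 1: extract the entry as text, so no XML parsing takes place. :)
let $as-text :=
  xdmp:zip-get($zip, $entry,
    <options xmlns="xdmp:zip-get">
      <format>text</format>
    </options>)

(: Option 2: extract as XML and let the server attempt a repair. :)
let $repaired :=
  try {
    xdmp:zip-get($zip, $entry,
      <options xmlns="xdmp:zip-get">
        <format>xml</format>
        <repair>full</repair>
      </options>)
  } catch ($e) {
    (: Assumption: repair may still fail on some input; fall back to text. :)
    ()
  }
return
  if (fn:exists($repaired))
  then xdmp:document-insert("/repaired/broken.xml", $repaired)
  else xdmp:document-insert("/quarantine/broken.txt", $as-text)
```

If the repair fails, the entry is stored as a text document, so it can at least be inspected and corrected inside MarkLogic afterwards.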
Kind regards,
Geert
> -----Original Message-----
> From: general-bounces at developer.marklogic.com
> [mailto:general-bounces at developer.marklogic.com] On Behalf Of
> Tim Meagher
> Sent: Tuesday 26 October 2010 15:52
> To: 'General Mark Logic Developer Discussion'
> Cc: 'Asheesh Mangla'
> Subject: Re: [MarkLogic Dev General] Can validation of XML
> docs in a zipfile extraction be disabled?
>
> Hi Geert,
>
> Hmm ... you're right - there is some bad text at the end of
> this file that is contributing to the problem, and this
> particular document is not a well-formed XML document.
>
> Any suggestions for extracting it as a non-XML document (e.g.
> UTF-8 text) so that it can be corrected and subsequently
> saved as an XML document?
>
> Thanks!
>
> Tim
>
> -----Original Message-----
> From: general-bounces at developer.marklogic.com
> [mailto:general-bounces at developer.marklogic.com] On Behalf Of
> Geert Josten
> Sent: Tuesday, October 26, 2010 9:33 AM
> To: General Mark Logic Developer Discussion
> Cc: 'Asheesh Mangla'
> Subject: Re: [MarkLogic Dev General] Can validation of XML
> docs in a zipfile extraction be disabled?
>
> Hi Tim,
>
> Are you sure this is a validation error message? Could it be
> that the zip file contains a mixture of XML and non-XML, and
> that you are trying to extract a file from the zip as XML
> while it is actually non-XML?
>
> Kind regards,
>
> Geert
>
> drs. G.P.H. (Geert) Josten
> Consultant
>
> Daidalos BV
> Hoekeindsehof 1-4
> 2665 JZ Bleiswijk
>
> T +31 (0)10 850 1200
> F +31 (0)10 850 1199
>
> mailto:geert.josten at daidalos.nl
> http://www.daidalos.nl/
>
> KvK 27164984
>
> The information sent in or with this e-mail message
> originates from Daidalos BV and is intended solely for the
> addressee. If you have received this message unintentionally,
> we ask you to delete it. No rights can be derived from this
> message.
>
> > From: general-bounces at developer.marklogic.com
> > [mailto:general-bounces at developer.marklogic.com] On Behalf Of
> > Tim Meagher
> > Sent: Tuesday 26 October 2010 15:15
> > To: 'General Mark Logic Developer Discussion'
> > Cc: 'Asheesh Mangla'
> > Subject: [MarkLogic Dev General] Can validation of XML docs in
> > a zipfile extraction be disabled?
> >
> > I'm loading a zipfile that contains multiple XML documents
> > into MarkLogic, but it appears that MarkLogic is validating
> > the embedded content against its corresponding schema in the
> > Schemas database and coming up with an invalid root text
> > error message when extracting the XML document:
> >
> > <error:message>Invalid root text</error:message>
> > <error:format-string>XDMP-DOCROOTTEXT:
> > xdmp:zip-get(fn:doc($doc-uri)).
> >
> > This prevents me from storing a well-formed XML
> > document and correcting it in MarkLogic, which
> > means that the content must be extracted either manually or
> > via a non-MarkLogic application and then corrected before
> > reinserting into MarkLogic.
> >
> > Thanks for the help!
> >
> > Tim Meagher