[MarkLogic Dev General] loading XML documents with DTDs
Michael Blakeley
michael.blakeley at marklogic.com
Tue Jun 19 10:08:08 PDT 2007
Alan,
I'd recommend starting with xdmp:document-load() -
http://developer.marklogic.com/pubs/3.2/apidocs/UpdateBuiltins.html#document-load
You might also be interested in
http://developer.marklogic.com/howto/tutorials/2006-06-recordloader.xqy
-- Mike
Alan Darnell wrote:
> I have a number of XML documents (sample below) that are not UTF-8
> encoded and that reference an external DTD and a rendering
> stylesheet. What's the best way to get these documents into MarkLogic
> so that:
>
> - the encoding is changed to UTF-8
> - any entities in the DTD are resolved to UTF-8 encoded characters
> - any CDATA sections are removed with the content left intact, including
> markup embedded in the CDATA content
>
> Do I need to pre-process the files before loading, or can MarkLogic
> handle these kinds of conversions as part of the load functions?
>
> Also, does anyone know of any good strategies for converting math in TeX
> format to MathML?
>
> Thanks,
>
> Alan
>
> Alan Darnell
> University of Toronto
>
>
>
> <?xml version="1.0" encoding="iso-8859-1"?><?xml-stylesheet
> type="text/xsl" href="file://batchgate1\StyleS\bpg4
> 0.xsl"?>
> <!DOCTYPE content PUBLIC "-//BLACKWELL PUBLISHING GROUP//DTD 4.0//EN"
> "\\Batchgate1\bpgdtd\4-0\bpg4-0.dtd">
> <content dtdver="4.0" docfmt="xml">
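
If pre-processing before the load turns out to be necessary, the first and third requirements in Alan's list (re-encoding to UTF-8 and unwrapping CDATA sections) could be sketched roughly as below. This is only an illustration, not anything from MarkLogic: the function name `to_utf8_without_cdata` and the regular expressions are hypothetical, the source encoding is assumed to be the iso-8859-1 declared in the sample, and the markup inside each CDATA section is assumed to be well-formed already. It does not resolve DTD entities; that would need a parser with access to the DTD.

```python
import re

# Matches one CDATA section; DOTALL lets the content span several lines.
CDATA_RE = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)

# Matches the encoding pseudo-attribute of the XML declaration
# (the \s keeps it from matching <?xml-stylesheet ...?>).
ENCODING_RE = re.compile(r"""(<\?xml\s[^>]*?encoding=)(["'])[^"']*\2""")


def to_utf8_without_cdata(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Hypothetical helper: re-encode a document as UTF-8 and unwrap CDATA."""
    # Decode using the encoding the document declares (assumed, not detected).
    text = raw.decode(source_encoding)
    # Rewrite the declaration so it agrees with the new byte encoding.
    text = ENCODING_RE.sub(r"\1\2UTF-8\2", text, count=1)
    # Drop the CDATA wrappers but keep their content, embedded markup included.
    text = CDATA_RE.sub(lambda m: m.group(1), text)
    return text.encode("utf-8")


if __name__ == "__main__":
    sample = (
        '<?xml version="1.0" encoding="iso-8859-1"?>\n'
        "<content><p><![CDATA[caf\xe9 <i>x</i>]]></p></content>"
    ).encode("iso-8859-1")
    print(to_utf8_without_cdata(sample).decode("utf-8"))
```

The converted files could then be loaded with xdmp:document-load() as Mike suggests. Note that unwrapping CDATA this way turns any literal `<` or `&` in the content into live markup, so the result is worth checking for well-formedness before loading.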