[MarkLogic Dev General] arabic and other right to left scripts
Mary Holstege
mary.holstege at marklogic.com
Fri Aug 3 16:36:12 PDT 2007
On Fri, 03 Aug 2007 14:00:25 -0700, Alan Darnell
<alan.darnell at utoronto.ca> wrote:
> I've got a few records in my database in languages / scripts that read
> from right to left. If I cut and paste some of this text into a search
> box, I get no results. I'm just wondering how MarkLogic stores these
> kinds of alphabets and is there something I need to do in my XQuery to
> let the system know that something different is headed its way?
>
> Alan
> University of Toronto
Right-to-left scripts are stored in "logical" order, that is, the order in
which the characters are read and typed: the right-to-leftness is a matter
for the renderer (e.g. the browser). Unfortunately, browsers have a lot of
bugs in this area and are easily confused. Putting the attribute dir="rtl"
in your HTML can help some browsers. You may also run afoul of the system
clipboard changing the character encoding on you without warning. As far
as the XQuery is concerned, as long as you're sending the correct Unicode
codepoints (properly encoded), things should work just fine.
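To make "logical" order concrete, here is a minimal sketch in Python rather than XQuery, so it can be run outside MarkLogic. The sample word and the function name logical_order are assumptions chosen for illustration, not anything from the original question. It shows that the first letter read is stored at index 0, even though a renderer draws it at the right-hand edge, and that each letter carries a right-to-left bidirectional class for the renderer to act on.

```python
import unicodedata

# Hypothetical sample: the Arabic word "kitab" (book), written with escapes
# so that no editor or clipboard can reorder or re-encode it.
WORD = "\u0643\u062A\u0627\u0628"

def logical_order(text):
    """Return (index, code point, bidi class) for each character as stored."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.bidirectional(ch))
        for i, ch in enumerate(text)
    ]

# Index 0 is the first letter read (kaf), which a browser draws rightmost.
for row in logical_order(WORD):
    print(row)
```

The bidi class "AL" (Arabic letter) is a property of the character, not of the stored order, which is why the stored string needs no special handling and only the display layer reverses it.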
As far as troubleshooting goes, a good start is to call xdmp:describe() on
your query string to make sure it contains the codepoints you think it does.
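The same kind of check can be sketched outside the server. The Python below is only an illustration under assumptions: the function name describe_codepoints and the sample word are hypothetical, and the "bad" string simulates one common failure (UTF-8 bytes decoded as Latin-1), which may or may not be what a given clipboard does. Inside XQuery, the standard fn:string-to-codepoints() function gives a comparable numeric view of a string.

```python
import unicodedata

def describe_codepoints(text):
    """List each character as 'U+XXXX NAME' so mismatches are easy to spot."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
    ]

# Hypothetical sample: what the search string should contain.
good = "\u0643\u062A\u0627\u0628"

# Simulated damage: the same text as UTF-8 bytes, wrongly decoded as Latin-1.
bad = good.encode("utf-8").decode("latin-1")

print(describe_codepoints(good))
print(describe_codepoints(bad))
```

If the dump of a pasted query shows Latin letters or unnamed control characters where Arabic letters were expected, the text was re-encoded somewhere before it reached the query.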
//Mary