[MarkLogic Dev General] Sorting pinyin text?
Mary Holstege
mary.holstege at marklogic.com
Tue Apr 1 07:10:56 PST 2008
On Tue, 01 Apr 2008 07:05:45 -0700, Marc Moskowitz
<mmoskowitz at ifactory.com> wrote:
> I'm trying to sort transliterations of Chinese words by standard pinyin
> sorting (syllable alphabetically, then by tone, followed by the next
> syllable). Is there a collation in either English or Chinese that deals
> correctly with this? If not, is there some way of creating a
> user-defined sort order? I know that I can create a sortable form for
> each word that sorts correctly by codepoint, but I would rather do
> something more efficient if possible.
> Marc Moskowitz
> Interactive Factory
The collation named "http://marklogic.com/collation/zh" ought to
do what you want: pinyin is the default ordering for (mainland)
Chinese. There is no way to define your own collation. In theory
you could write your own ordering function that operates on the
strings, but I imagine it would be fairly painful and slow.
//Mary
Mary Holstege
Lead Engineer
Mark Logic Corporation
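
For reference, the ordering described in the question (syllable
alphabetically, then tone, then the next syllable) can be sketched
as a sort key. This is only an illustration in Python, not MarkLogic
code and not how the zh collation is implemented. It assumes that
syllables are separated by spaces, that tones are written with the
usual diacritics, and that unmarked (neutral-tone) syllables sort
after tone 4. The names syllable_key and pinyin_key are made up for
the example.

```python
import unicodedata

# Combining marks that carry the four pinyin tones after NFD decomposition.
TONE_MARKS = {
    "\u0304": 1,  # macron, first tone
    "\u0301": 2,  # acute, second tone
    "\u030c": 3,  # caron, third tone
    "\u0300": 4,  # grave, fourth tone
}


def syllable_key(syllable):
    """Return (toneless letters, tone number) for one pinyin syllable."""
    tone = 5  # assumption: no tone mark means neutral tone, sorted last
    letters = []
    for ch in unicodedata.normalize("NFD", syllable.lower()):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            letters.append(ch)  # keeps other marks, e.g. the dots of "ü"
    return ("".join(letters), tone)


def pinyin_key(word):
    """Sort key: one (letters, tone) pair per space-separated syllable."""
    return tuple(syllable_key(s) for s in word.split())


words = ["mǎ", "mà", "mā ma", "má", "ma", "bà", "mā"]
print(sorted(words, key=pinyin_key))
```

Because the key is a tuple of (letters, tone) pairs, a word that is
a prefix of another sorts first, and the tone of a syllable is only
consulted when its letters tie. The same pairs could be flattened
into a string to build the codepoint-sortable form mentioned in the
question.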