[MarkLogic Dev General] Word Boundaries in Chinese?
Michael Sokolov
sokolov at ifactory.com
Wed May 7 15:39:59 PDT 2008
Sorry for the list spam - I was trying to reply to Marc only.
> -----Original Message-----
> From: general-bounces at developer.marklogic.com
> [mailto:general-bounces at developer.marklogic.com] On Behalf Of
> Michael Sokolov
> Sent: Wednesday, May 07, 2008 6:39 PM
> To: 'General Mark Logic Developer Discussion'
> Subject: RE: [MarkLogic Dev General] Word Boundaries in Chinese?
>
> Marc - it looks as if all the useful information in your
> e-mail got stripped out by a mail demon.
>
> Also - I suggest you take this one up w/support instead, or
> in addition.
>
> -Mike
>
> > -----Original Message-----
> > From: general-bounces at developer.marklogic.com
> > [mailto:general-bounces at developer.marklogic.com] On Behalf Of Marc
> > Moskowitz
> > Sent: Wednesday, May 07, 2008 5:31 PM
> > To: General Mark Logic Developer Discussion
> > Subject: [MarkLogic Dev General] Word Boundaries in Chinese?
> >
> > I'm seeing some odd behavior when searching for text in Chinese. It
> > seems that the server is making decisions about word boundaries based
> > on some internal criteria.
> >
> > This XQuery:
> > let $q := '?',
> >     $doc := (
> >       <yo>???</yo>,
> >       <yo>??</yo>,
> >       <yo>??</yo>,
> >       <yo>?????</yo>)
> > for $d in $doc
> > let $h := cts:highlight($d, $q, <hey>{$cts:text}</hey>)
> > return (count($h//hey), $h)
> >
> > produces this result:
> >
> > 0
> > <yo>???</yo>
> > 1
> > <yo><hey>?</hey>?</yo>
> > 0
> > <yo>??</yo>
> > 1
> > <yo>????<hey>?</hey></yo>
> >
> >
> > Is there some way of affecting where these boundaries are placed? Or
> > of turning this functionality fully on or off?
> > -Marc
> >
> > _______________________________________________
> > General mailing list
> > General at developer.marklogic.com
> > http://xqzone.com/mailman/listinfo/general
> >