[MarkLogic Dev General] search chinese word can not work when used wildcard query

Dave Cassel Dave.Cassel at marklogic.com
Thu Jul 16 08:22:30 PDT 2015


I asked around and Mary gave me this information (apparently wildcards are tricky with Chinese tokenization):

I think what is going on here is that "哈*" asks for the value to be a single word starting with that character, but the value "哈哈" is tokenized as two words.
You have to do this instead: cts:element-attribute-value-query(xs:QName("product"),xs:QName("tmp")," 哈* *",("wildcarded","whitespace-insensitive"))
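
A minimal sketch of how one might check this explanation. It assumes the content language is Chinese ("zh") and that the documents can be reached through fn:collection(); neither detail is stated in this thread. cts:tokenize shows how the attribute value is split into tokens, and the cts:search call wraps the suggested query so it can be run on its own:

```xquery
xquery version "1.0-ml";

(
  (: Show the tokens produced for the attribute value.
     The "zh" language code is an assumption about the content. :)
  cts:tokenize("哈哈", "zh"),

  (: Run the suggested query. fn:collection()//product is an assumed
     search scope; the pattern and options are copied from the reply. :)
  cts:search(
    fn:collection()//product,
    cts:element-attribute-value-query(
      xs:QName("product"),
      xs:QName("tmp"),
      " 哈* *",
      ("wildcarded", "whitespace-insensitive")
    )
  )/name
)
```

If the tokenizer output lists more than one word token for "哈哈", that is consistent with the single-word pattern "哈*" not matching the whole value.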

Dave Cassel<http://davidcassel.net>, @dmcassel<https://twitter.com/dmcassel>
Technical Community Manager
MarkLogic Corporation<http://www.marklogic.com/>

From: <general-bounces at developer.marklogic.com> on behalf of 张晓博 <zisedeqing at 163.com>
Reply-To: MarkLogic Developer Discussion <general at developer.marklogic.com>
Date: Monday, July 13, 2015 at 3:56 AM
To: "general at developer.marklogic.com" <general at developer.marklogic.com>
Subject: [MarkLogic Dev General] search chinese word can not work when used wildcard query

Part of the XML document is:
    <product dept="ACC" tmp="哈哈">
        <name language="en">Floppy Sun Hat</name>
The value of the tmp attribute is Chinese.

The query:
cts:and-query((cts:element-attribute-value-query(xs:QName("product"), xs:QName("tmp"), '哈*', "wildcarded"))))/name

returns an empty result, but the query below returns the correct result:
cts:and-query((cts:element-attribute-value-query(xs:QName("product"), xs:QName("dept"), 'A*', "wildcarded"))))/name
The result is:
<name language="en">Floppy Sun Hat</name>

Why does the first query return an empty result?
