[MarkLogic Dev General] How to make the cts::element-value-match search punctuation-insensitive?
Danny Sokolsky
Danny.Sokolsky at marklogic.com
Fri Apr 30 15:12:03 PDT 2010
Yes, under "alternate characters" select "Ignore punctuation". If the collation builder does not let you specify what you want, you can use the rules in the doc to construct the exact URI you need. A good way to test a collation URI is to use cq and write a small program that declares a default collation and then does a string comparison. For example, using the collation builder I built a case- and diacritic-insensitive collation that also ignores punctuation, and then I tried the following:
xquery version "1.0-ml";
declare default collation "http://marklogic.com/collation/en/S1/T00BB/AS";
"foo, bar" eq "foo bar"
(: returns true :)
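The same check can also be written without changing the default collation, by passing the URI directly to fn:compare. This is only a sketch: it reuses the collation URI from the example above and assumes the server accepts it as written.

```xquery
xquery version "1.0-ml";

(: Collation URI taken from the example above :)
let $collation := "http://marklogic.com/collation/en/S1/T00BB/AS"
(: fn:compare returns 0 when the collation treats the two strings as equal :)
return fn:compare("foo, bar", "foo bar", $collation) eq 0
```

This form is handy when comparing several candidate URIs side by side in cq, since each call names its own collation.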
-Danny
From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Tim Meagher
Sent: Friday, April 30, 2010 12:19 PM
To: 'General Mark Logic Developer Discussion'
Subject: Re: [MarkLogic Dev General] How to make the cts::element-value-match search punctuation-insensitive?
There's an explicit option for a punctuation-sensitive index in the collation builder, but not for a punctuation-insensitive index. If not specified, is it implicitly punctuation-insensitive? If not implicit, then does one need to adjust the alternate character settings?
________________________________
From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Danny Sokolsky
Sent: Friday, April 30, 2010 3:05 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] How to make the cts::element-value-match search punctuation-insensitive?
Yes, build a range index with a case-insensitive, diacritic-insensitive, punctuation-insensitive collation. You can look in the "Encodings and Collations" chapter of the Search Developer's Guide to figure out the collation URI, or you can use the little widget in the Admin Interface to build the right collation URI (which will probably be easier...). The widget is in a couple of places (I think on the database config page and on the App Server config page); it is called "collation builder".
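Once such a range index exists, the lexicon call can name the collation through its options. A minimal sketch, assuming a string element range index on a hypothetical element named title that is configured with the same collation URI; the element name and the pattern are illustrative only.

```xquery
xquery version "1.0-ml";

(: Assumption: a string element range index on <title> (hypothetical name)
   is configured with exactly this collation URI :)
let $collation := "http://marklogic.com/collation/en/S1/T00BB/AS"
return cts:element-value-match(
  xs:QName("title"),                    (: element backed by the range index :)
  "foo bar*",                           (: illustrative wildcard pattern :)
  fn:concat("collation=", $collation))  (: select the lexicon by collation :)
```

The "collation=" option picks which lexicon is consulted, so the URI should match the one on the index exactly. How wildcard patterns interact with ignored punctuation is worth checking against your own data.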
-Danny
From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Tim Meagher
Sent: Friday, April 30, 2010 10:59 AM
To: 'General Mark Logic Developer Discussion'
Subject: [MarkLogic Dev General] How to make the cts::element-value-match search punctuation-insensitive?
Hi folks,
I have a query that seems to be kind of slow - it uses cts:element-value-query() with the case-insensitive, diacritic-insensitive, and punctuation-insensitive options. I want to speed up the search by creating an element range index on the elements of interest. I noticed, however, that the corresponding lexicon function, cts:element-value-match(), provides case-insensitive and diacritic-insensitive options but no punctuation-insensitive option. How can I make this lexicon query punctuation-insensitive? Can I do it by building a custom collation with the alternate characters setting set to "avoid punctuation"?
Thank you!
Tim Meagher