[MarkLogic Dev General] Reg: Diacritic-insensitive lexicons
geert.josten at daidalos.nl
Wed Sep 7 23:40:20 PDT 2011
The cts:frequency function accepts only a single value. Function mapping may be active, resulting in an implicit FLWOR loop over each value returned by the value-match call, so your expression returns a sequence of frequencies instead of a single summed frequency count.
Function mapping can be very confusing, so I recommend disabling it by putting the following line in the header of your XQuery:
declare option xdmp:mapping "false";
That will cause MarkLogic Server to complain about your expression. You need to rewrite your expression as follows to make it work:
sum(
  for $value in cts:element-value-match(xs:QName("name"), "Annie*") [1 to 10]
  return cts:frequency($value)
)
The sum function adds up the individual frequency counts, so you get a single total.
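To see how each matching value contributes to the total, a minimal sketch along the same lines is shown here. It assumes a string range index on the element "name", as in the original query; the variable name $values and the output labels are only illustrative, and the counts you get depend on your own data and index settings.

```xquery
xquery version "1.0-ml";

(: Disable function mapping so that passing a sequence to a
   single-value function raises an error instead of looping silently. :)
declare option xdmp:mapping "false";

(: Assumes a string range index exists on the element "name". :)
let $values := cts:element-value-match(xs:QName("name"), "Annie*")[1 to 10]
return (
  (: One line per matching lexicon value with its own frequency. :)
  for $value in $values
  return fn:concat($value, ": ", cts:frequency($value)),

  (: The explicit loop hands cts:frequency one value at a time;
     fn:sum then reduces the per-value counts to a single total. :)
  fn:concat("total: ",
    fn:sum(for $value in $values return cts:frequency($value)))
)
```

Listing the per-value counts next to the total makes it easy to check whether accented variants such as Ánnie show up as separate lexicon values under the collation in use.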
From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of ambika arumugam
Sent: Thursday, September 8, 2011 7:31
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Reg: Diacritic-insensitive lexicons
I am running the cts:element-value-match query, and I understand that an index entry is created for each unique value of the element 'name'. Is it possible to customize the indexing so that Annie, Ánnie and Ànnie share a single index entry? For example, take a query like
cts:element-value-match(xs:QName("name"), "Annie*") [1 to 10]
Suppose the database has '3' matches for Annie in the element 'name',
'1' match for Ánnie in the element 'name', and
'2' matches for Ànnie in the element 'name'.
cts:frequency(cts:element-value-match(xs:QName("name"), "Annie*") [1 to 10])
Then the query above should return 6 (the sum of the individual counts: Annie (3), Ánnie (1), Ànnie (2)). I also tried the options of cts:element-value-match, but without changing the indexes I am not able to achieve this result.
Thanks in advance,
On Tue, Sep 6, 2011 at 11:23 AM, Gajanan Chinchwadkar <Gajanan.Chinchwadkar at marklogic.com> wrote:
You mention that you are trying to count element names. But the query you are using seems to find all the elements named "name" whose value starts with "Annie".
Please clarify: what do you want to count exactly?
Also, do you have a range index of type "string" set on the element named "name"? The function cts:element-value-match() simply reads all the values in the range index that match the pattern "Annie*".
From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of ambika arumugam
Sent: Monday, September 05, 2011 9:55 PM
To: General MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Reg: Diacritic-insensitive lexicons
I am trying to get the count of element names using the query:
cts:element-value-match(xs:QName("name"), "Annie*",("case-insensitive","collation=http://marklogic.com/collation/","diacritic-insensitive") )[1 to 10]
I am using cts:frequency of the above query to get the results.
I want the values Ánnie and Ànnie to match this query as well, which is why I pass the diacritic-insensitive option in the third parameter of cts:element-value-match. However, I am not getting the results I expect.
Should the collation be changed from the root collation to a Unicode collation to get this done?