[MarkLogic Dev General] hyphens and cts:element-value-query

Christopher Hamlin cbhamlin at gmail.com
Tue Feb 28 11:58:20 PST 2017


When you run

xdmp:plan (cts:search (/, cts:element-value-query(xs:QName('ename'),
'value-1', 'exact')))

do you see

element(ename,value("value","1"))

or

element(ename,value("value","-","1"))

?  I'm not sure why, but I see the former in 8.0-6, and the latter in
8.0-5.10 and 8.0-6.1.  Are you in a position to try an upgrade?
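
To look at the tokens themselves, here is a minimal sketch, assuming
default tokenization for the content language (no custom tokenizer
overrides configured):

xquery version "1.0-ml";
(: label each token of the string by its token type :)
for $t in cts:tokenize('value-1')
return
  typeswitch ($t)
    case cts:punctuation return fn:concat('punctuation: ', $t)
    case cts:space return fn:concat('space: ', $t)
    case cts:word return fn:concat('word: ', $t)
    default return ()

With default tokenization I would expect a word, a punctuation, and a
word token, which corresponds to the three-part value(...) form in the
plan output above.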

On Tue, Feb 28, 2017 at 2:49 PM, Gary Larsen <gary.larsen at envisn.com> wrote:

> Learning more than usual today :-)
>
>
>
> "collation=URI" is an option on element-range-query(), but not on
> element-value-query().  Looks like creating a range index would be useful
> for elements which may have spaces or punctuation and need exact matching.
>
>
>
> Thanks, Gary
>
>
>
> *From:* general-bounces at developer.marklogic.com [mailto:general-bounces@
> developer.marklogic.com] *On Behalf Of *Geert Josten
> *Sent:* Tuesday, February 28, 2017 2:28 PM
> *To:* MarkLogic Developer Discussion
>
> *Subject:* Re: [MarkLogic Dev General] hyphens and cts:element-value-query
>
>
>
> In defense of Andreas, Mary does write this too:
>
>
>
> "At the boundary, where you specify exact unstemmed value
> queries or exact range queries *with a codepoint collation*,
> the results will line up. For exact queries there are universal
> index entries for the value that include punctuation and
> whitespace, but we don't index those tokens otherwise."
>
>
>
> E.g. it might work if you select codepoint collation
> ("collation=http://marklogic.com/collation/codepoint") together with the
> "exact" option. MarkLogic defaults to using its own root collation.
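>
> A minimal sketch of the range-index route, assuming a string element
> range index on ename has been configured with the codepoint collation
> (without that index the query raises an error):
>
> xquery version "1.0-ml";
> (: compare whole values through the range index, not tokens :)
> cts:search(fn:doc(),
>   cts:element-range-query(xs:QName('ename'), '=', 'value-1',
>     'collation=http://marklogic.com/collation/codepoint'),
>   'unfiltered')
>
> Because the range index stores whole values rather than tokens, this
> should distinguish 'value-1' from 'value 1' without filtering.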
>
>
>
> *From: *<general-bounces at developer.marklogic.com> on behalf of James Kerr
> <James.Kerr at marklogic.com>
> *Reply-To: *MarkLogic Developer Discussion <general at developer.marklogic.
> com>
> *Date: *Tuesday, February 28, 2017 at 7:34 PM
> *To: *MarkLogic Developer Discussion <general at developer.marklogic.com>
> *Subject: *Re: [MarkLogic Dev General] hyphens and cts:element-value-query
>
>
>
> From Mary’s response: “Word tokens may be stemmed and punctuation and
> space tokens are *not* indexed” (emphasis my own).
>
>
>
> The fact that punctuation and space tokens are not indexed is why you
> cannot do punctuation-sensitive or whitespace-sensitive, unfiltered word or
> value queries.
>
>
>
> Depending on what you are trying to accomplish, custom tokenization (
> https://docs.marklogic.com/guide/search-dev/custom-tokenization) may be a
> good option for you.
>
>
>
> On a side note, can you share what you are doing for your predicate check?
> By adding a check like this, you are essentially just implementing your own
> filtered search so it’s unclear what the benefit would be over just using
> the “filtered” search option.
>
>
>
> -James
>
>
>
>
>
> *From: *<general-bounces at developer.marklogic.com> on behalf of Gary
> Larsen <gary.larsen at envisn.com>
> *Reply-To: *MarkLogic Developer Discussion <general at developer.marklogic.
> com>
> *Date: *Tuesday, February 28, 2017 at 1:12 PM
> *To: *'MarkLogic Developer Discussion' <general at developer.marklogic.com>
> *Subject: *Re: [MarkLogic Dev General] hyphens and cts:element-value-query
>
>
>
> Geert and Andreas,
>
>
>
> Thanks for pointing out tokens vs. values that I wasn’t understanding.
>
>
>
> Using ‘filtered’ in cts:search works, but I’ve always tried to avoid that
> for performance reasons.  In this case I’ve added a predicate check in the
> result instead.
>
>
>
> But to Andreas’s point, it seems that ‘exact’ or ‘punctuation-sensitive’
> should be able to match, or maybe I’m not understanding the documentation
> for cts:element-value-query.  If it did work I guess there would be extra
> work un-tokenizing?
>
>
>
> I am using ML version 8.0-6
>
>
>
> Thanks for any clarification,
>
> Gary
>
>
>
>
>
> *From:* general-bounces at developer.marklogic.com [mailto:general-bounces@
> developer.marklogic.com <general-bounces at developer.marklogic.com>] *On
> Behalf Of *Andreas Hubmer
> *Sent:* Tuesday, February 28, 2017 8:23 AM
> *To:* MarkLogic Developer Discussion
> *Subject:* Re: [MarkLogic Dev General] hyphens and cts:element-value-query
>
>
>
> Hi Geert,
>
>
>
> As far as I know there are index entries for "exact" queries in the
> universal index, that include punctuation and whitespace. Thus, Gary's
> value queries should work unfiltered.
>
>
>
> There is an email by Mary Holstege supporting my assumption:
> http://developer.marklogic.com/pipermail/general/2013-March/012552.html
>
>
>
> Cheers,
>
> Andreas
>
>
>
>
>
>
>
> 2017-02-28 13:58 GMT+01:00 Geert Josten <Geert.Josten at marklogic.com>:
>
> Hi Gary,
>
>
>
> Sounds like you are running an unfiltered search. Either enable filtering
> to get rid of false positives, or switch to using element-range-query
> (which requires a range index). Keep in mind that value-queries don’t use
> range indexes (even if available), but rely on the universal index, which
> contains tokens, not values.
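>
> A minimal sketch for comparing the two modes, assuming documents that
> contain an ename element (the element name is taken from your example):
>
> xquery version "1.0-ml";
> let $q := cts:element-value-query(xs:QName('ename'), 'value-1', 'exact')
> return (
>   (: resolved from the indexes only; may include false positives :)
>   fn:count(cts:search(fn:doc(), $q, 'unfiltered')),
>   (: candidates re-checked against the documents; the default mode :)
>   fn:count(cts:search(fn:doc(), $q, 'filtered'))
> )
>
> If the first count is higher than the second, the difference is the
> number of false positives coming out of index resolution.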
>
>
>
> Cheers,
>
> Geert
>
>
>
> *From: *<general-bounces at developer.marklogic.com> on behalf of Gary
> Larsen <gary.larsen at envisn.com>
> *Reply-To: *MarkLogic Developer Discussion <general at developer.marklogic.
> com>
> *Date: *Monday, February 27, 2017 at 10:01 PM
> *To: *'General MarkLogic Developer Discussion' <
> general at developer.marklogic.com>
> *Subject: *[MarkLogic Dev General] hyphens and cts:element-value-query
>
>
>
> I’m trying to get this cts query to treat hyphens as text:
>
>
>
> cts:element-value-query(xs:QName('ename'), 'value 1', 'exact')
>
> cts:element-value-query(xs:QName('ename'), 'value-1', 'exact')
>
>
>
> Even though no ename element with the value 'value-1' exists, a match
> is found.
>
>
>
> Thanks,
>
> Gary
>
>
> _______________________________________________
> General mailing list
> General at developer.marklogic.com
> Manage your subscription at:
> http://developer.marklogic.com/mailman/listinfo/general
>