[MarkLogic Dev General] Boosting scores based on an attribute

Mike Sokolov sokolov at ifactory.com
Fri Apr 11 05:55:24 PDT 2008


Just out of curiosity:

If a query boils down to (A OR (not A)), does the presence or absence of 
A in a result affect the relevance score (and nothing else)?  
I've been assuming it does, and this discussion seems to bear that out...

I imagine this slightly odd construction might be better expressed using 
something explicitly designed for the purpose, though: say, a 
cts:query-boost() that takes an expression evaluated solely for ranking 
purposes. I don't know if it would amount to anything more than 
syntactic sugar, but it could be more obvious to the uninitiated.
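
To make the (A OR (not A)) shape concrete, here is a rough sketch with W
as the word query and A as the biography test. The element name
'document' is the same guess made in the quoted message below, and I
have not verified how cts:not-query contributes to scoring, so treat
this as an illustration rather than a tested recipe:

```xquery
(: W: the query that decides which documents match :)
let $w := cts:word-query("baseball")
(: A: present only for its effect on ranking :)
let $a := cts:element-attribute-value-query(
  xs:QName("document"), xs:QName("type"), "biography")
(: A OR (not A) is true for every document, so the and-query
   should match exactly what W alone matches :)
return cts:search(
  fn:doc(),
  cts:and-query(($w, cts:or-query(($a, cts:not-query($a)))))
)
```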

-Mike

Michael Blakeley wrote:
> That's one way to look at it. More compactly, the word-query part has 
> been expanded to OR( AND( W, B ), W ): A document will match if it 
> matches the word *and* is a bio, or if it matches the word.
>
> -- Mike
>
> Mattio Valentino wrote:
>> Michael,
>>
>> So, in both cts:and-queries, am I right to assume that the second leaf
>> in each is there as a placeholder of sorts ... something that just
>> enables us to call out the hit on the attribute?
>>
>> On Thu, Apr 10, 2008 at 4:19 PM, Michael Blakeley
>> <michael.blakeley at marklogic.com> wrote:
>>> Mattio,
>>>
>>>  Not sure if it helps, but you could consider permanently boosting the
>>> quality of biographies, via xdmp:document-set-quality() or a related
>>> function.
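>>>
>>>  As a rough sketch of that first option: the built-in in question is
>>> xdmp:document-set-quality, which takes a document URI and an integer
>>> quality. The URI and the value 10 below are made up for illustration.
>>>
>>>  ```xquery
>>>  (: hypothetical URI and quality value; run once per biography :)
>>>  xdmp:document-set-quality("/bios/example-biography.xml", 10)
>>>  ```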
>>>
>>>  If I understand what you're trying to do correctly, you could also make
>>> your query test for a matrix of possibilities at runtime. You didn't say
>>> what the element name is for your type attribute, but let's say it's
>>> 'document'.
>>>
>>>  let $user-query := 'baseball'
>>>  let $query := cts:or-query((
>>>   let $booster := cts:element-attribute-value-query(
>>>     xs:QName('document'), xs:QName('type'), 'biography'
>>>   )
>>>   for $i in (
>>>     cts:word-query($user-query),
>>>     cts:element-word-query(xs:QName('title'), $user-query, (), 16)
>>>   )
>>>   return (
>>>     $i,
>>>     cts:and-query(($booster, $i))
>>>   )
>>>  ))
>>>  return $query
>>>
>>>  =>
>>>  cts:or-query((
>>>   cts:word-query("baseball", ("lang=en"), 1),
>>>   cts:and-query((
>>>     cts:element-attribute-value-query(
>>>       xs:QName("document"), xs:QName("type"), "biography",
>>>       ("lang=en"), 1
>>>     ),
>>>     cts:word-query("baseball", ("lang=en"), 1)
>>>   ), ()),
>>>   cts:element-word-query(
>>>     xs:QName("title"), "baseball", ("lang=en"), 16),
>>>   cts:and-query((
>>>     cts:element-attribute-value-query(
>>>       xs:QName("document"), xs:QName("type"), "biography",
>>>       ("lang=en"), 1
>>>     ),
>>>     cts:element-word-query(
>>>       xs:QName("title"), "baseball", ("lang=en"), 16)
>>>   ), ())
>>>  ))
>>>
>>>  You can probably see why I wrote a query to generate that query :-).
>>>
>>>  Note that I didn't actually boost the score for the biography matches:
>>> they'll be boosted by TF/IDF naturally, as will the title matches
>>> (assuming title matches are less frequent, in TF/IDF terms, than
>>> word-query matches are). Very often, there isn't much reason to
>>> explicitly boost query terms.
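>>>
>>>  If an explicit boost were wanted anyway, a rough sketch would be to
>>> pass a weight as the last argument of the attribute query. The weight
>>> of 8 is an arbitrary value chosen for illustration, and 'document' is
>>> still an assumed element name.
>>>
>>>  ```xquery
>>>  let $user-query := 'baseball'
>>>  (: same booster as above, but with an explicit weight of 8 :)
>>>  let $booster := cts:element-attribute-value-query(
>>>    xs:QName('document'), xs:QName('type'), 'biography', (), 8
>>>  )
>>>  return cts:or-query((
>>>    cts:word-query($user-query),
>>>    cts:and-query(($booster, cts:word-query($user-query)))
>>>  ))
>>>  ```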
>>>
>>>  You might also think about creating a field for this query, if it's a
>>> frequently-used search strategy for your application.
>>>
>>>  -- Mike
>>>
>>>  Mattio Valentino wrote:
>>>
>>>> I'm not sure if I'm going to express this correctly but I hope it's 
>>>> clear.
>>>>
>>>> I have a query that searches for a term.  If the term occurs in a
>>>> title or head element, the score is boosted.
>>>>
>>>> cts:or-query((
>>>>  cts:word-query("baseball", (), 1),
>>>>  cts:element-word-query(
>>>>    (xs:QName("title"), xs:QName("head")), "baseball", (), 16
>>>>  )
>>>> ))
>>>>
>>>> If I have two documents where one has an attribute type="biography",
>>>> can I form the query to return *both* documents but boost the score up
>>>> further on the "biography" one?
>>>>
>>>> This has been stewing for a few days now and I've been trying
>>>> different versions in cq, but I can't see how to put the cts:query
>>>> constructors together to do it.
>>>>
>>>> Thanks for any help,
>>>> Matt
>>>>
>>>> _______________________________________________
>>>> General mailing list
>>>> General at developer.marklogic.com
>>>> http://xqzone.com/mailman/listinfo/general