[MarkLogic Dev General] Stemmed relevance scoring

Danny Sokolsky Danny.Sokolsky at marklogic.com
Mon Jun 7 09:17:10 PDT 2010


Hi Cara,

If you need to do this, you can enable the "word searches" setting on the database and run an or-query that combines the stemmed search with the unstemmed search (you will have to specify the "stemmed" and "unstemmed" options in the respective cts:query constructors).  That should let the exact match contribute to the score.
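
The approach described above might look roughly like the following. This is only a sketch: it assumes the "word searches" database setting is enabled, reuses the <test> element and the term "ran" from the question below, and the variable names and the <result> wrapper are just for illustration.

```xquery
xquery version "1.0-ml";

(: Sketch only. Assumes the "word searches" database setting is enabled,
   since the unstemmed clause needs the unstemmed word index. :)
let $term := "ran"
let $query :=
  cts:element-query(
    xs:QName("test"),
    cts:or-query((
      (: matches "run", "ran", "running", ... :)
      cts:word-query($term, "stemmed"),
      (: matches only the exact word "ran" :)
      cts:word-query($term, "unstemmed")
    )))
(: cts:search returns results in relevance order by default;
   cts:score exposes the score of each returned node. :)
for $doc in cts:search(/test, $query)
return <result score="{cts:score($doc)}">{$doc/text()}</result>
```

A document containing the exact word matches both clauses of the or-query, while a document containing only another form of the stem matches just the stemmed clause. If the resulting difference in score turns out to be too small, cts:word-query also accepts an optional weight as its third argument, which could be raised on the unstemmed clause.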

Think about whether that is really what you want to do, though.  Especially once you have a large corpus of documents, I am not sure how much this will change the score, and stemming is really about increasing search recall.  So question your assumption that a document that contains "ran" is more relevant than one that contains "run".  In many cases, that is not true.

-Danny

From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Caraliza Fonseca-Ensor
Sent: Monday, June 07, 2010 5:30 AM
To: general at developer.marklogic.com
Subject: [MarkLogic Dev General] Stemmed relevance scoring

Hi

Is it possible to run a search that assigns lower scores to documents containing other forms of a search term's stem, so that they rank below documents that contain the exact search term?
For example:
Document 1:
<test>the word is run</test>

Document 2:
<test>the word is ran</test>

Document 3:
<test>the word is running</test>

Query:
cts:search(/test, cts:element-query(xs:QName("test"), "ran"))

Is there a way to ensure that Document 2 is given the highest relevance score, as it is the only document containing the exact search term?

Thanks,
Cara