[MarkLogic Dev General] Facet using cts:search() result

Justin Makeig Justin.Makeig at marklogic.com
Tue Aug 24 08:49:41 PDT 2010


Jonna,
cts:search gives you a list of results based on a cts:query. You can also use a cts:query to limit the scope of a facet, but you don’t use cts:search to generate the values and counts. You’ll need a lexicon for that. The first thing to do is create a range index on the element (or element-attribute combination) that contains your data, for example “color” below. A range index keeps the values of a particular element (or element-attribute) in memory, so it’s very fast for look-ups, counts, and aggregates. To access the lexicon, use cts:element-values to enumerate the values and cts:frequency to get the counts. For example:

xquery version "1.0-ml";

(: Only documents matching this query contribute to the counts :)
let $query := cts:and-query(("dog", "cat"))
let $options := (
  'frequency-order',
  'collation=http://marklogic.com/collation/codepoint'
)

(: Take the 10 most frequent values of the "color" element :)
for $v in cts:element-values(QName("", "color"), "", $options, $query)[1 to 10]
return
  <facet frequency="{cts:frequency($v)}">{$v}</facet>


The above gets up to 10 of the most frequently occurring values for the “color” element, counted only across documents that match the query, and creates output of the form:

<facet frequency="16828">black</facet>
<facet frequency="16752">orange</facet>
<facet frequency="16659">green</facet>
<facet frequency="16629">blue</facet>
<facet frequency="16613">voilet</facet>
<facet frequency="16519">red</facet>

This assumes you have a string range index on the color element that uses the Unicode Codepoint collation. For more about range indexes, please see section 21 of the “Administrator’s Guide” <http://developer.marklogic.com/pubs/4.1/books/admin.pdf>.
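
If the index does not exist yet, it can be created in the Admin interface or scripted. The following is a minimal sketch using the Admin API, not a definitive setup: the database name "Documents" is an assumption, so substitute the database that actually holds your content. Adding a range index causes the affected content to be reindexed if reindexing is enabled.

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Assumption: the content lives in a database named "Documents" :)
let $db := xdmp:database("Documents")
let $config := admin:get-configuration()
(: String range index on <color> (no namespace), codepoint collation,
   range value positions turned off :)
let $index := admin:database-range-element-index(
  "string", "", "color",
  "http://marklogic.com/collation/codepoint", fn:false())
let $config := admin:database-add-range-element-index($config, $db, $index)
return admin:save-configuration($config)
```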

Here’s how you’d do roughly the same facet in the Search API:

xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search" at "/MarkLogic/appservices/search/search.xqy";

let $options := <options xmlns="http://marklogic.com/appservices/search">
  <constraint name="color">
    <range type="xs:string" collation="http://marklogic.com/collation/codepoint" facet="true">
      <facet-option>limit=10</facet-option>
      <element name="color"/>
      <facet-option>frequency-order</facet-option>
      <facet-option>descending</facet-option>
    </range>
  </constraint>
  <return-results>false</return-results>
  <return-facets>true</return-facets>
</options>

return search:search("dog cat", $options)

which will output something like:

<search:response total="100000" start="1" page-length="10" xmlns:search="http://marklogic.com/appservices/search">
  <search:facet name="color">
    <search:facet-value name="black" count="16828">black</search:facet-value>
    <search:facet-value name="orange" count="16752">orange</search:facet-value>
    <search:facet-value name="green" count="16659">green</search:facet-value>
    <search:facet-value name="blue" count="16629">blue</search:facet-value>
    <search:facet-value name="voilet" count="16613">voilet</search:facet-value>
    <search:facet-value name="red" count="16519">red</search:facet-value>
  </search:facet>
  <search:qtext/>
  <search:metrics>
    <search:facet-resolution-time>PT0.005495S</search:facet-resolution-time>
    <search:snippet-resolution-time>PT0S</search:snippet-resolution-time>
    <search:total-time>PT0.035695S</search:total-time>
  </search:metrics>
</search:response>

For this simple example, there is a little more typing with the Search API, but once you get into things like query parsing, auto-suggest, and bucketed constraints, most users appreciate its level of abstraction.
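
As one illustration of the query parsing, a named range constraint can also be used as a prefix in the query text to drill down on a facet value. This is a minimal sketch reusing the "color" constraint from the options above. The value "black" is only an example, and the ns="" attribute on the element is added here on the assumption that the element is in no namespace.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

let $options := <options xmlns="http://marklogic.com/appservices/search">
  <constraint name="color">
    <range type="xs:string" collation="http://marklogic.com/collation/codepoint" facet="true">
      <element ns="" name="color"/>
    </range>
  </constraint>
</options>

(: "color:black" is parsed against the "color" constraint defined above,
   so results are limited to documents whose <color> is "black" :)
return search:search("dog cat color:black", $options)
```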

Again, feedback and questions are much appreciated.

Justin


If you’re playing along at home, here’s the script that I used to generate my dummy data set:

xquery version "1.0-ml";

let $colors := ('red', 'green', 'blue', 'orange', 'voilet', 'black')
let $len := count($colors)
for $i in (1 to 100000)
let $rand := xdmp:random($len -1) + 1
let $d := <doc>
  <description>This references a dog and a cat.</description>
  <color>{$colors[$rand]}</color>
</doc>
return xdmp:document-insert(string(xdmp:random()), $d)
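
To sanity-check the generated data against the facet counts, one option is to estimate how many documents match the same query plus a single color value. This is a sketch that assumes the string range index on "color" with the codepoint collation is already in place. The value "black" is just an example, and the result would be expected to line up with the frequency reported for that value.

```xquery
xquery version "1.0-ml";

(: Assumption: a string range index on <color> (codepoint collation) exists :)
let $black := cts:element-range-query(
  xs:QName("color"), "=", "black",
  "collation=http://marklogic.com/collation/codepoint")
let $query := cts:and-query((
  cts:word-query("dog"),
  cts:word-query("cat"),
  $black
))
(: xdmp:estimate resolves the count from the indexes without fetching documents :)
return xdmp:estimate(cts:search(fn:doc(), $query))
```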



Justin Makeig
Senior Product Manager
MarkLogic Corporation

email  justin.makeig at marklogic.com
web    www.marklogic.com




On Aug 24, 2010, at 1:25 PM, Jonna Marry wrote:

Hi Justin,

Thanks for your reply.

Can you please give me a sample showing how to get facets with the cts:search() function?

Regards,
Jonna



On Tue, Aug 24, 2010 at 5:22 PM, Justin Makeig <Justin.Makeig at marklogic.com> wrote:
Jonna,
Welcome. The Search API (i.e. the functions that use the “search” namespace prefix by default) is a higher-level abstraction above some of the other built-in APIs, including Core Text Services (“cts”). It provides conveniences like Google-style query parsing, pagination, faceting, and context-sensitive result snippets out-of-the-box. Like the underlying APIs, the Search API is designed for performance at scale. If you’re just getting started, I’d encourage you to begin with the Search API. As your application requirements (and MarkLogic knowledge) grow, you can take advantage of the rich customization and extensibility hooks built into the Search API.
If you haven’t already, I’d encourage you to read the “Search Developer’s Guide” <http://developer.marklogic.com/pubs/4.1/books/search-dev-guide.pdf>. Section 2.2.2, for example, includes a chart comparing the Search API constraints and facets with their equivalent cts:query and lexicon calls.
Again, welcome and please feel free to contact me with questions and/or post them to this discussion list.

Justin


On Aug 24, 2010, at 12:18 PM, Jonna Marry wrote:

Hi,

I am new to MarkLogic, and I am analyzing the difference between the search:search() and cts:search() functions.

Can we get facets in the cts:search() result?

If yes, please let me know how to get facets in the results of cts:search, as we get in search:search.

Thanks in Advance.

Regards,
Jonna


_______________________________________________
General mailing list
General at developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general

