[MarkLogic Dev General] RE: Language support using search:search

Colleen Whitney Colleen.Whitney at marklogic.com
Tue Mar 16 08:26:59 PST 2010


Adam, you can specify the language with a <term-option>, which can appear as a child of either the <term> element or a <constraint> (see the search:search() documentation for a little more information).

Try this:

import module namespace search = "http://marklogic.com/appservices/search" at "/MarkLogic/appservices/search/search.xqy";

let $options :=
 <options xmlns="http://marklogic.com/appservices/search">
  <constraint name="title">
   <word>
    <element ns="" name="title"/>
   </word>
   <term-option>lang=de</term-option>
  </constraint>
  <term apply="term">
   <empty apply="all-results"/>
   <term-option>lang=de</term-option>
  </term>
 </options>

return search:parse("title:foo dog",$options)
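
A minimal sketch of the same idea run through search:search rather than search:parse, so that the generated query can be inspected in the response. It reuses the <term> block from the options above together with the <debug> and <return-query> options from the original question; the query text "pa" is simply the example term from that question. The expectation, not a verified result, is that the word query in the report carries "lang=de" instead of the server default.

```xquery
xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Options: apply lang=de to unconstrained terms and ask for the
   generated query and debug report to be returned with the results. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <term apply="term">
      <empty apply="all-results"/>
      <term-option>lang=de</term-option>
    </term>
    <debug>true</debug>
    <return-query>true</return-query>
  </options>

(: "pa" is only the sample query text from the original question. :)
return search:search("pa", $options)
```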

--Colleen
________________________________________
From: general-bounces at developer.marklogic.com [general-bounces at developer.marklogic.com] On Behalf Of Adam Patterson [aj2patterson at uwaterloo.ca]
Sent: Tuesday, March 16, 2010 9:12 AM
To: general at developer.marklogic.com
Subject: [MarkLogic Dev General] Language support using search:search

Hi,

I’m running MarkLogic Server Standard Edition 4.1-3 on Windows Server 2003.

I have just started exploring MarkLogic’s search API. I’m using search:search to perform full-text searches against a single document in our database. This document contains content in both English (xml:lang="en") and German (xml:lang="de"). The default language of our instance of MarkLogic Server is English. My goal is to use search:search to run searches against our content and return results matching either the English or the German content. Currently, however, I only receive results that match the English content (the default behaviour), and I cannot seem to change this. I have read Section 17 of the MarkLogic Server Search Developer’s Guide, but it only covers multilingual support when constructing your own queries via the cts:search functions, whereas I am interested in the more out-of-the-box functionality provided by search:search.

Here is a simplified version of the code I’m using:


import module namespace search = "http://marklogic.com/appservices/search" at "/MarkLogic/appservices/search/search.xqy";

let $queryString := xdmp:get-request-field("queryString"),
    $options :=
      <options xmlns="http://marklogic.com/appservices/search">
        <constraint name="languages">
          <annotation>
            <term-option>lang=de</term-option>
          </annotation>
        </constraint>
        <debug>{fn:true()}</debug>
        <return-query>{fn:true()}</return-query>
        <additional-query>{cts:document-query("/C/Documents and Settings/Administrator/Desktop/diaries_schema/BreithauptDiaries.xml")}</additional-query>
        <searchable-expression xmlns:tei="http://www.tei-c.org/ns/1.0">/tei:teiCorpus/tei:TEI[@xml:id=&quot;diary-ljb-1879-1881-1&quot;]/tei:text[@type=&quot;diary&quot;]/tei:body/tei:div</searchable-expression>
      </options>,
    $results := search:search($queryString, $options)
return $results

When I do a search with query text “pa” and look at the search:report returned I get the following:

<search:report id="SEARCH-FLWOR">
                    (cts:search(/tei:teiCorpus/tei:TEI[@xml:id="diary-ljb-1879-1881-1"]/tei:text[@type="diary"]/tei:body/tei:div, cts:and-query((cts:word-query("pa", ("lang=en"), 1), cts:document-query("/C/Documents and Settings/Administrator/Desktop/diaries_schema/BreithauptDiaries.xml")), ()), ("score-logtfidf"), 1))[1 to 10]
</search:report>

This SEARCH-FLWOR report shows that search:search is constructing the underlying cts:word-query with “lang=en”, the server’s default language. I’m trying to determine how to get search:search to construct the underlying cts queries with a language that I choose (see the term-option constraint I’ve included in the options argument). Ideally I would be able to build the underlying queries once with “lang=de” and once with “lang=en”, and then combine the results with cts:or-query.
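
For illustration, the combined query described here might be written directly with the cts constructors roughly as follows. This is only a sketch of the intended result, not something search:search is known to generate: the variable names $text and $langs are hypothetical, and "pa" is the sample query text used above.

```xquery
xquery version "1.0-ml";

(: Hypothetical inputs: the search text and the languages to cover. :)
let $text  := "pa"
let $langs := ("en", "de")

(: Build one word query per language, each tagged with its own
   "lang=..." option, and OR them together so that a match in
   either language satisfies the query. :)
return
  cts:or-query(
    for $lang in $langs
    return cts:word-query($text, fn:concat("lang=", $lang))
  )
```

The resulting cts:or-query could then be passed to cts:search in place of the single cts:word-query shown in the report.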

If anyone has any thoughts or advice for me I’d greatly appreciate it. Please let me know if it’s possible to conduct multilingual searches using the search:search method and the approach I’ve outlined here.

Thank you,

Adam Patterson
Library Systems Development
University of Waterloo



