[MarkLogic Dev General] search:search() - why only 1 search match per XML doc

Danny Sinang d.sinang at gmail.com
Fri Mar 23 09:46:09 PDT 2012


Hi Will,

Someone else wrote our search module (that uses search:search) and we
discovered today that it returns only 1 search match even if the word being
looked up occurs several times in the 'htmlBody' element of a particular
XML document.

We were hoping search:search would return all matches within the
'htmlBody' element.

But it looks like search:search won't do so because 'htmlBody' contains
escaped html. If it were unescaped, then search:search would return all the
matches for the word we're looking for.

I don't know if search:search can be told to treat the contents of
'htmlBody' as unescaped.

So what I'm trying to do is get search:search to return all the matches
within the escaped string content of 'htmlBody'.
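
To make the goal concrete, here is a rough sketch of the behaviour I am after, written in Python only as an illustration. It is not the MarkLogic Search API; the function name `all_matches` and the `context` width are made up for this example, and unlike a MarkLogic word query it does no stemming. It unescapes the string, drops the tags, and returns one snippet per occurrence of the word.

```python
import html
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def all_matches(escaped_body, word, context=30):
    """Return one snippet per occurrence of `word` in an escaped HTML string.

    Hypothetical helper for illustration; `context` is the number of
    characters kept on each side of a match.
    """
    # Step 1: turn "&lt;p&gt;..." back into "<p>...".
    markup = html.unescape(escaped_body)

    # Step 2: keep only the text content and normalise whitespace.
    extractor = _TextExtractor()
    extractor.feed(markup)
    extractor.close()
    text = " ".join(" ".join(extractor.chunks).split())

    # Step 3: collect every whole-word, case-insensitive match.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    snippets = []
    for match in pattern.finditer(text):
        start = max(match.start() - context, 0)
        end = min(match.end() + context, len(text))
        snippets.append(text[start:end])
    return snippets
```

Calling `all_matches(body, "populations")` on the string content of 'htmlBody' would give a list with one entry per occurrence, which is the result I would like to get from search:search.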

Regards,
Danny

On Fri, Mar 23, 2012 at 12:37 PM, Will Thompson
<wthompson at jonesmcclure.com> wrote:

> Danny,
>
> Without more details I’m not sure what you’re trying to do exactly, but it
> sounds like you may need to write your own snippet module.
>
> -Will
>
> From: general-bounces at developer.marklogic.com [mailto:
> general-bounces at developer.marklogic.com] On Behalf Of Danny Sinang
> Sent: Friday, March 23, 2012 8:22 AM
> To: general
> Subject: [MarkLogic Dev General] search:search() - why only 1 search
> match per XML doc
>
> Hello.
>
> I am trying to search for the word '*populations*' in an XML doc which
> mentions that word around 5 times in its htmlBody element.
>
> search:search() returns only the first occurrence of that word in that
> element.
>
> Is there an option or way to make search:search return matches for the
> other occurrences of 'populations'?
>
> Note that the content of the htmlBody element (shown below) is a string.
>
> Regards,
> Danny
>
> <htmlBody>&lt;body xmlns="http://www.w3.org/1999/xhtml"&gt;
>  &lt;div&gt;
>   &lt;div&gt;
>    &lt;h5&gt;Control of Bacterial Populations&lt;/h5&gt;
>    &lt;p class="Indent00" id="xpp-2014582732321794086-1"&gt;The diseases and many kinds of environmental problems caused by bacteria are actually population control problems. Small numbers of bacteria cause little harm. However, when the population increases, their negative effects are multiplied. Despite large investments of time and money, scientists have found it difficult to control bacterial *populations*. Three factors operate in favor of the bacteria: their reproductive rate, their ability to form resistant stages, and their ability to mutate and produce strains that resist antibiotics and other control agents.&lt;/p&gt;
>    &lt;p class="Indent01" id="xpp-2014582732321794086-2"&gt;Under ideal conditions, some bacteria can grow and divide every 20 minutes. If one bacterial cell and all its offspring were to reproduce at this ideal rate, in 48 hours there would be 2.2 &amp;times; 10 43 cells. In reality, bacteria cannot achieve such incredibly large *populations*, because ... </htmlBody>