[XQZone General] cts:google-element-query
Travis Raybold
travis at raybold.com
Thu Mar 2 12:34:29 PST 2006
that's perfect Danny, thanks!
the NOT functionality should be relatively simple for me to add.
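
As a rough sketch of how that could look (an assumption on my part, not tested code from this thread): the hypothetical helper below, tokens-to-query-with-not, takes the <tokens> element that get-query-tokens returns, ANDs the terms together, and wraps any term that directly follows a literal NOT token in cts:not-query. The function name and the treatment of NOT as an uppercase keyword are illustrative choices.

```xquery
(: Hypothetical helper, not part of the original post.
   Assumes $tokens is the <tokens> element built by get-query-tokens.
   Every token becomes a cts:word-query; a token that directly follows
   a literal NOT token is negated with cts:not-query; the NOT tokens
   themselves are dropped. All resulting queries are ANDed together. :)
define function tokens-to-query-with-not($tokens as element())
{
  cts:and-query(
    for $t at $pos in $tokens/token
    let $text := fn:string($t)
    (: For the first token there is no predecessor, so $prev is "". :)
    let $prev := fn:string($tokens/token[$pos - 1])
    where $text ne "NOT"
    return
      if ($prev eq "NOT")
      then cts:not-query(cts:word-query($text))
      else cts:word-query($text)
  )
}
```

The result could then be passed as the query argument of cts:search. One limitation of this sketch: because the tokens are plain text, a quoted "NOT" cannot be told apart from the operator, and a trailing NOT with no term after it is silently ignored.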
--Travis
Danny Sokolsky wrote:
>Hi Travis,
>
>Here is a function that is a simplified version of some code submitted
>by our friends at O'Reilly. This does the first part of what you are
>looking for--turns quoted phrases into phrase terms, and then tokenizes
>the rest on spaces.
>
>You should be able to expand this technique to handle NOTs or anything
>else you want. The idea is to create a simple XML structure that
>contains all of your tokenized terms, like the following:
>
>hello goodbye "hello goodbye"
>
>=> becomes this XML structure
>
><tokens>
> <token>hello</token>
> <token>goodbye</token>
> <token>hello goodbye</token>
></tokens>
>
>You can then use that to create your cts:query expression.
>
>Here is the function to get your query string and turn it into a simple
>xml structure:
>
>define function get-query-tokens($input as xs:string?) as element() {
>  (: This parses the quotes to be exact matches.
>     The idea for this comes from our friends at O'Reilly:
>     /xqzone/search/trunk/search/query-xml/query-xml.xqy :)
>  <tokens>{
>    let $newInput := fn:string-join(
>      (: Check whether there is more than one double-quotation mark. If so,
>         tokenize on the double-quotation mark ("), then change the spaces
>         in the even tokens to the string "!+!". This allows later
>         tokenization on spaces while preserving quoted phrases as phrase
>         searches (after replacing the "!+!" strings with spaces again). :)
>      if ( fn:count(fn:tokenize($input, '"')) > 2 )
>      then (
>        for $i at $count in fn:tokenize($input, '"')
>        return
>          if ($count mod 2 = 0)
>          then fn:replace($i, "\s+", "!+!")
>          else $i
>      )
>      else ( $input ),
>      " ")
>    let $tokenInput := fn:tokenize($newInput, "\s+")
>    return (
>      (: Skip empty strings left over from leading or trailing whitespace. :)
>      for $x in $tokenInput
>      where $x ne ""
>      return <token>{fn:replace($x, "!\+!", " ")}</token>
>    )
>  }</tokens>
>}
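>
>A minimal sketch of that last step, assuming a hypothetical helper named
>tokens-to-and-query (the name is illustrative and not part of the code
>above). It turns each <token> into a cts:word-query and ANDs them
>together; a token that contains several words is matched as a phrase.
>
>```xquery
>(: Hypothetical helper: builds an AND query from the <tokens> element
>   returned by get-query-tokens. :)
>define function tokens-to-and-query($tokens as element())
>{
>  cts:and-query(
>    for $t in $tokens/token
>    return cts:word-query(fn:string($t))
>  )
>}
>```
>
>The result could then be used as the query argument of cts:search.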
>
>
>Hope that helps get you on the right path.
>-Danny
>
>-----Original Message-----
>From: general-bounces at xqzone.marklogic.com
>[mailto:general-bounces at xqzone.marklogic.com] On Behalf Of Travis
>Raybold
>Sent: Thursday, March 02, 2006 11:45 AM
>To: general at xqzone.marklogic.com
>Subject: [XQZone General] cts:google-element-query
>
>
>I'm creating a search interface, and I'm looking to tokenize the search
>string, keep anything in quotes as a single term, default to AND for the
>rest of the terms, and be able to handle NOT before a term. I think I
>can work this out, but surely it must be a common type of functionality,
>and a sample would probably save me hours of trudging... does anyone
>have a sample they'd be willing to post of something similar to this? If
>not, I'll develop it and post it back here when I'm done.
>
>Somehow the server didn't recognize cts:google-element-query... ;)
>
>Thanks,
>
>--Travis
>
>_______________________________________________
>General mailing list
>General at xqzone.marklogic.com http://xqzone.com/mailman/listinfo/general