|
|
cts:element-attribute-value-match(
  $element-names as xs:QName*,
  $attribute-names as xs:QName*,
  $pattern as xs:string,
  [$options as xs:string*],
  [$query as cts:query?]
) as xs:string*
|
 |
Summary:
Returns values from the specified element-attribute value lexicon(s)
that match the specified wildcard pattern. Element-attribute value
lexicons are implemented using string range indexes; consequently this
function requires an attribute range index of type xs:string
for each of the element/attribute pairs specified
in the function. If there is not a string range index configured
for any of the specified element/attribute pairs, then an
exception is thrown.
|
Parameters:
$element-names
:
The element QNames.
|
$attribute-names
:
The attribute QNames.
|
$pattern
:
Wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- Specifies a case-sensitive match.
- "case-insensitive"
- Specifies a case-insensitive match.
- "diacritic-sensitive"
- Specifies a diacritic-sensitive match.
- "diacritic-insensitive"
- Specifies a diacritic-insensitive match.
- "ascending"
- Specifies that values should be returned in ascending order.
- "descending"
- Specifies that values should be returned in descending order.
- "any"
- Specifies that values in any fragment should be returned.
- "document"
- Specifies that values in document fragments should be returned.
- "properties"
- Specifies that values in properties fragments should be returned.
- "locks"
- Specifies that values in locks fragments should be returned.
|
$query
(optional):
Only return values that exist in fragments selected by the
cts:query. That is, the values do not need
to match the query, but they must occur in fragments that match
the query.
|
|
Usage Notes:
If multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching values.
Note the following interactions about the options:
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
If the options parameter contains neither "ascending" nor "descending",
then the default is "ascending".
At most one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If the options parameter contains none of "any", "document",
"properties", or "locks", then the default is "any".
If the $query parameter is not present, and the current user is assigned
the admin role, no value filtering is performed. Values may be returned that
only appear in deleted fragments not yet expunged by a merge.
To filter deleted values in this case, use a $query parameter.
For example:
cts:and-query( () )
|
Example:
cts:element-attribute-value-match(xs:QName("animals"),
xs:QName("name"),"aardvark*")
=> ("aardvark","aardvarks")
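The following sketch illustrates the option interactions described in the usage notes. It assumes a string attribute range index on animals/@name; the QNames, the pattern, and the choice of options are illustrative only. Because the pattern contains an uppercase letter, it would default to a case-sensitive match, so "case-insensitive" is passed explicitly. The cts:and-query(()) argument filters out values that occur only in deleted fragments.

```xquery
xquery version "1.0-ml";

(: Assumption: a string attribute range index exists on animals/@name. :)
cts:element-attribute-value-match(
  xs:QName("animals"),
  xs:QName("name"),
  "Aard*",
  ("case-insensitive", "descending", "document"),
  cts:and-query(())   (: selects all live fragments, so deleted values are filtered :)
)
```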
|
|
|
|
cts:element-attribute-values(
  $element-names as xs:QName*,
  $attribute-names as xs:QName*,
  [$start as xs:string],
  [$options as xs:string*],
  [$query as cts:query?]
) as xs:string*
|
 |
Summary:
Returns values from the specified element-attribute value lexicon(s).
Element-attribute value lexicons are implemented using string range
indexes; consequently this function requires an attribute range index
of type xs:string or xs:anyURI for each of
the element/attribute pairs specified in the function. If there is not
a string range index configured for any of the specified
element/attribute pairs, then an exception is thrown. The values are
returned in collation order.
|
Parameters:
$element-names
:
The element QNames.
|
$attribute-names
:
The attribute QNames.
|
$start
(optional):
A starting value. Return only this value and following values. If
the empty string, return all values. If the parameter is not in
the lexicon, then it returns the values beginning with the next
value.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Specifies that values should be returned in ascending order.
- "descending"
- Specifies that values should be returned in descending order.
- "any"
- Specifies that values in any fragment should be returned.
- "document"
- Specifies that values in document fragments should be returned.
- "properties"
- Specifies that values in properties fragments should be returned.
- "locks"
- Specifies that values in locks fragments should be returned.
|
$query
(optional):
Only return values that exist in fragments selected by the
cts:query. That is, the values do not need
to match the query, but they must occur in fragments that match
the query.
|
|
Usage Notes:
If multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching values.
If the options parameter contains neither "ascending" nor "descending",
then the default is "ascending".
At most one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If the options parameter contains none of "any", "document",
"properties", or "locks", then the default is "any".
If the $query parameter is not present, and the current user is assigned
the admin role, no value filtering is performed. Values may be returned that
only appear in deleted fragments not yet expunged by a merge.
To filter deleted values in this case, use a $query parameter.
For example:
cts:and-query( () )
|
Example:
cts:element-attribute-values(xs:QName("animal"),
xs:QName("name"),
"aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
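As a further sketch, the call below passes the empty string as $start to read from the whole lexicon, asks for descending order, and keeps only the first ten values. The QNames are illustrative and assume a string attribute range index on animal/@name.

```xquery
xquery version "1.0-ml";

(: Assumption: a string attribute range index exists on animal/@name. :)
let $values :=
  cts:element-attribute-values(
    xs:QName("animal"),
    xs:QName("name"),
    "",            (: empty string: do not skip any values :)
    "descending")
return $values[1 to 10]
```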
|
|
|
|
cts:element-attribute-word-match(
  $element-names as xs:QName*,
  $attribute-names as xs:QName*,
  $pattern as xs:string,
  [$options as xs:string*],
  [$query as cts:query?]
) as xs:string*
|
 |
Summary:
Returns words from the specified element-attribute word lexicon(s) that
match a wildcard pattern. This function requires an element-attribute
word lexicon for each of the element/attribute pairs specified in the
function. If there is not an element-attribute word lexicon
configured for any of the specified element/attribute pairs, then
an exception is thrown.
|
Parameters:
$element-names
:
The element QNames.
|
$attribute-names
:
The attribute QNames.
|
$pattern
:
Wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- Specifies a case-sensitive match.
- "case-insensitive"
- Specifies a case-insensitive match.
- "diacritic-sensitive"
- Specifies a diacritic-sensitive match.
- "diacritic-insensitive"
- Specifies a diacritic-insensitive match.
- "ascending"
- Specifies that words should be returned in ascending order.
- "descending"
- Specifies that words should be returned in descending order.
- "any"
- Specifies that words in any fragment should be returned.
- "document"
- Specifies that words in document fragments should be returned.
- "properties"
- Specifies that words in properties fragments should be returned.
- "locks"
- Specifies that words in locks fragments should be returned.
|
$query
(optional):
Only return words that exist in fragments selected by the
cts:query. That is, the words do not need
to match the query, but the words must occur in fragments that match
the query.
|
|
Usage Notes:
If multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching words.
Note the following interactions about the options:
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
If the options parameter contains neither "ascending" nor "descending",
then the default is "ascending".
At most one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If the options parameter contains none of "any", "document",
"properties", or "locks", then the default is "any".
If the $query parameter is not present, and the current user is assigned
the admin role, no word filtering is performed. Words may be returned that
only appear in deleted fragments not yet expunged by a merge.
To filter deleted words in this case, use a $query parameter.
For example:
cts:and-query( () )
|
Example:
cts:element-attribute-word-match(xs:QName("animals"),
xs:QName("name"),"aardvark*")
=> ("aardvark","aardvarks")
|
|
|
|
cts:element-attribute-words(
  $element-names as xs:QName*,
  $attribute-names as xs:QName*,
  [$start as xs:string],
  [$options as xs:string*],
  [$query as cts:query?]
) as xs:string*
|
 |
Summary:
Returns words from the specified element-attribute word lexicon(s).
This function requires an element-attribute word lexicon for each of the
element/attribute pairs specified in the function. If there is not an
element-attribute word lexicon configured for any of the specified
element/attribute pairs, then an exception is thrown. The words are
returned in collation order.
|
Parameters:
$element-names
:
The element QNames.
|
$attribute-names
:
The attribute QNames.
|
$start
(optional):
A starting word. Returns only this word and any following words
from the lexicon. If this parameter is the empty string, it returns
all of the words. If the parameter is not in the lexicon, then it
returns the words beginning with the next word.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Specifies that words should be returned in ascending order.
- "descending"
- Specifies that words should be returned in descending order.
- "any"
- Specifies that words in any fragment should be returned.
- "document"
- Specifies that words in document fragments should be returned.
- "properties"
- Specifies that words in properties fragments should be returned.
- "locks"
- Specifies that words in locks fragments should be returned.
|
$query
(optional):
Only return words that exist in fragments selected by the
cts:query. That is, the words do not need
to match the query, but the words must occur in fragments that match
the query.
|
|
Usage Notes:
If multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching words.
If the options parameter contains neither "ascending" nor "descending",
then the default is "ascending".
At most one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If the options parameter contains none of "any", "document",
"properties", or "locks", then the default is "any".
If the $query parameter is not present, and the current user is assigned
the admin role, no word filtering is performed. Words may be returned that
only appear in deleted fragments not yet expunged by a merge.
To filter deleted words in this case, use a $query parameter.
For example:
cts:and-query( () )
|
Example:
cts:element-attribute-words(xs:QName("animal"),
xs:QName("name"),
"aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
|
|
|
|
cts:element-value-match(
  $element-names as xs:QName*,
  $pattern as xs:string,
  [$options as xs:string*],
  [$query as cts:query?]
) as xs:string*
|
 |
Summary:
Returns values from the specified element value lexicon(s)
that match the specified wildcard pattern. Element value lexicons
are implemented using string range indexes; consequently this function
requires an element range index of type xs:string for
each element specified in the function. If there is not a string range
index configured for any of the specified elements, then an exception
is thrown.
|
Parameters:
$element-names
:
The element QNames.
|
$pattern
:
Wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- Specifies a case-sensitive match.
- "case-insensitive"
- Specifies a case-insensitive match.
- "diacritic-sensitive"
- Specifies a diacritic-sensitive match.
- "diacritic-insensitive"
- Specifies a diacritic-insensitive match.
- "ascending"
- Specifies that values should be returned in ascending order.
- "descending"
- Specifies that values should be returned in descending order.
- "any"
- Specifies that values in any fragment should be returned.
- "document"
- Specifies that values in document fragments should be returned.
- "properties"
- Specifies that values in properties fragments should be returned.
- "locks"
- Specifies that values in locks fragments should be returned.
|
$query
(optional):
Only return values that exist in fragments selected by the
cts:query. That is, the values do not need
to match the query, but they must occur in fragments that match
the query.
|
|
Usage Notes:
Note the following interactions about the options:
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
If the options parameter contains neither "ascending" nor "descending",
then the default is "ascending".
At most one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If the options parameter contains none of "any", "document",
"properties", or "locks", then the default is "any".
If the $query parameter is not present, and the current user is assigned
the admin role, no value filtering is performed. Values may be returned that
only appear in deleted fragments not yet expunged by a merge.
To filter deleted values in this case, use a $query parameter.
For example:
cts:and-query( () )
|
Example:
cts:element-value-match(xs:QName("animal"),"aardvark*")
=> ("aardvark","aardvarks")
|
|
|
|
cts:element-values(
  $element-names as xs:QName*,
  [$start as xs:string],
  [$options as xs:string*],
  [$query as cts:query?]
) as xs:string*
|
 |
Summary:
Returns values from the specified element value lexicon(s). Element
value lexicons are implemented using string range indexes;
consequently this function requires an element range index of type
xs:string or xs:anyURI for each element
specified in the function. If there is not a string range index
configured for any of the specified elements, an exception is thrown.
The values are returned in collation order.
|
Parameters:
$element-names
:
The element QNames.
|
$start
(optional):
A starting value. Return only this value and following values. If
the empty string, return all values. If the parameter is not in
the lexicon, then it returns the values beginning with the next
value.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Specifies that values should be returned in ascending order.
- "descending"
- Specifies that values should be returned in descending order.
- "any"
- Specifies that values in any fragment should be returned.
- "document"
- Specifies that values in document fragments should be returned.
- "properties"
- Specifies that values in properties fragments should be returned.
- "locks"
- Specifies that values in locks fragments should be returned.
|
$query
(optional):
Only return values that exist in fragments selected by the
cts:query. That is, the values do not need
to match the query, but they must occur in fragments that match
the query.
|
|
Usage Notes:
If the options parameter contains neither "ascending" nor "descending",
then the default is "ascending".
At most one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If the options parameter contains none of "any", "document",
"properties", or "locks", then the default is "any".
If the $query parameter is not present, and the current user is assigned
the admin role, no value filtering is performed. Values may be returned that
only appear in deleted fragments not yet expunged by a merge.
To filter deleted values in this case, use a $query parameter.
For example:
cts:and-query( () )
|
Example:
cts:element-values(xs:QName("animal"),"aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
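The $query parameter restricts the lexicon to fragments selected by a cts:query. A minimal sketch, assuming a string element range index on animal and using an illustrative search term: the returned values need not contain "nocturnal" themselves, they only need to occur in fragments that match it.

```xquery
xquery version "1.0-ml";

(: Assumption: a string element range index exists on <animal>. :)
cts:element-values(
  xs:QName("animal"),
  "",                            (: empty string: return all values :)
  "document",                    (: only values in document fragments :)
  cts:word-query("nocturnal"))   (: only fragments matching this query :)
```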
|
|
|
|
cts:element-word-match(
  $element-names as xs:QName*,
  $pattern as xs:string,
  [$options as xs:string*],
  [$query as cts:query?]
) as xs:string*
|
 |
Summary:
Returns words from the specified element word lexicon(s) that match
a wildcard pattern. This function requires an element word lexicon
configured for each of the specified elements in the function. If there
is not an element word lexicon configured for any of the specified
elements, an exception is thrown.
|
Parameters:
$element-names
:
The element QNames.
|
$pattern
:
Wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- Specifies a case-sensitive match.
- "case-insensitive"
- Specifies a case-insensitive match.
- "diacritic-sensitive"
- Specifies a diacritic-sensitive match.
- "diacritic-insensitive"
- Specifies a diacritic-insensitive match.
- "ascending"
- Specifies that words should be returned in ascending order.
- "descending"
- Specifies that words should be returned in descending order.
- "any"
- Specifies that words in any fragment should be returned.
- "document"
- Specifies that words in document fragments should be returned.
- "properties"
- Specifies that words in properties fragments should be returned.
- "locks"
- Specifies that words in locks fragments should be returned.
|
$query
(optional):
Only return words that exist in fragments selected by the
cts:query. That is, the words do not need
to match the query, but the words must occur in fragments that match
the query.
|
|
Usage Notes:
Note the following interactions about the options:
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
If the options parameter contains neither "ascending" nor "descending",
then the default is "ascending".
At most one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If the options parameter contains none of "any", "document",
"properties", or "locks", then the default is "any".
If the $query parameter is not present, and the current user is assigned
the admin role, no word filtering is performed. Words may be returned that
only appear in deleted fragments not yet expunged by a merge.
To filter deleted words in this case, use a $query parameter.
For example:
cts:and-query( () )
Only words that can be matched with element-word-query are returned.
That is, only words present in immediate text node children of the
specified element as well as any text node children of child elements
defined in the Admin Interface as element-word-query-throughs or
phrase-throughs.
|
Example:
cts:element-word-match(xs:QName("animal"),"aardvark*")
=> ("aardvark","aardvarks")
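The case-sensitivity rules in the usage notes can be sketched with three calls that differ only in the pattern and options. The element name is illustrative and assumes an element word lexicon on animal; the results depend on the contents of the lexicon, so none are shown.

```xquery
xquery version "1.0-ml";

(: Assumption: an element word lexicon is configured on <animal>. :)
(
  (: No uppercase and no case option: treated as case-insensitive. :)
  cts:element-word-match(xs:QName("animal"), "aard*"),

  (: Uppercase in the pattern and no case option: treated as case-sensitive. :)
  cts:element-word-match(xs:QName("animal"), "Aard*"),

  (: An explicit option takes the place of the inference from the pattern. :)
  cts:element-word-match(xs:QName("animal"), "Aard*", "case-insensitive")
)
```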
|
|
|
|
cts:element-words(
  $element-names as xs:QName*,
  [$start as xs:string],
  [$options as xs:string*],
  [$query as cts:query?]
) as xs:string*
|
 |
Summary:
Returns words from the specified element word lexicon(s). This function
requires an element word lexicon for each of the elements specified in the
function. If there is not an element word lexicon configured for any
of the specified elements, an exception is thrown. The words are
returned in collation order.
|
Parameters:
$element-names
:
The element QNames.
|
$start
(optional):
A starting word. Returns only this word and any following words
from the lexicon. If this parameter is the empty string, it returns
all of the words. If the parameter is not in the lexicon, then it
returns the words beginning with the next word.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Specifies that words should be returned in ascending order.
- "descending"
- Specifies that words should be returned in descending order.
- "any"
- Specifies that words in any fragment should be returned.
- "document"
- Specifies that words in document fragments should be returned.
- "properties"
- Specifies that words in properties fragments should be returned.
- "locks"
- Specifies that words in locks fragments should be returned.
|
$query
(optional):
Only return words that exist in fragments selected by the
cts:query. That is, the words do not need
to match the query, but the words must occur in fragments that match
the query.
|
|
Usage Notes:
If the options parameter contains neither "ascending" nor "descending",
then the default is "ascending".
At most one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If the options parameter contains none of "any", "document",
"properties", or "locks", then the default is "any".
If the $query parameter is not present, and the current user is assigned
the admin role, no word filtering is performed. Words may be returned that
only appear in deleted fragments not yet expunged by a merge.
To filter deleted words in this case, use a $query parameter.
For example:
cts:and-query( () )
Only words that can be matched with element-word-query are returned.
That is, only words present in immediate text node children of the
specified element as well as any text node children of child elements
defined in the Admin Interface as element-word-query-throughs or
phrase-throughs.
|
Example:
cts:element-words(xs:QName("animal"),"aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
|
|
|
|
cts:highlight(
  $node as node(),
  $query as cts:query,
  $expr as item()*
) as node()
|
 |
Summary:
Returns a copy of the node, replacing any text matching the query
with the specified expression. You can use this function
to easily highlight any text found in a query. Unlike
fn:replace and other XQuery string functions that match
literal text, cts:highlight matches every term that
matches the search, including stemmed matches or matches with
different capitalization.
|
Parameters:
$node
:
The node to highlight. The node must be either a document node
or an element node; it cannot be a text node.
|
$query
:
The query specifying the text to highlight.
|
$expr
:
The expression with which to replace each match. You can use the
variables $cts:text, $cts:node, and
$cts:queries (described below)
in the expression.
|
|
Usage Notes:
There are three built-in variables to represent a query match.
These variables can be used inline in the expression parameter.
$cts:text as xs:string
- the matched text
$cts:node as text()
- the node containing the matched text
$cts:queries as cts:query*
- the matching queries
You cannot use cts:highlight to highlight results from
cts:similar-query and cts:element-attribute-*-query
items. Using cts:highlight with these queries will
return the nodes without any highlighting.
You can also use cts:highlight as a general search
and replace function. The specified expression will replace any matching
text. For example, you could replace the word "hello" with "goodbye"
in a query similar to the following:
cts:highlight($node, "hello", "goodbye")
Because the expressions can be any XQuery expression, they can be very
simple like the above example or they can be extremely complex.
|
Example:
To highlight "MarkLogic" with bold in the following paragraph:
let $x := <p>MarkLogic Server is an enterprise-class
database specifically built for content.</p>
return
cts:highlight($x, "MarkLogic", <b>{$cts:text}</b>)
Returns:
<p><b>MarkLogic</b> Server is an enterprise-class
database specifically built for content.</p>
|
Example:
Given the following document with the URI "hellogoodbye.xml":
<root>
<a>It starts with hello and ends with goodbye.</a>
</root>
The following query will highlight the word "hello" in
blue, and everything else in red.
cts:highlight(doc("hellogoodbye.xml"),
cts:and-query((cts:word-query("hello"),
cts:word-query("goodbye"))),
if ( cts:word-query-text($cts:queries) eq "hello" )
then ( <font color="blue">{$cts:text}</font> )
else ( <font color="red">{$cts:text}</font> )
)
returns:
<root>
<a>It starts with <font color="blue">hello</font>
and ends with <font color="red">goodbye</font>.</a>
</root>
|
Example:
for $x in cts:search(collection(), "MarkLogic")
return
cts:highlight($x, "MarkLogic", <b>{$cts:text}</b>)
returns all of the nodes that contain "MarkLogic",
placing bold markup around the matched words.
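Because cts:highlight matches terms the way a search does, rather than matching literal text, a single query can highlight differently capitalized forms. The sketch below uses an invented sample paragraph and an explicit "case-insensitive" option on cts:word-query; each match is wrapped using its original text from $cts:text.

```xquery
xquery version "1.0-ml";

(: The paragraph content is made up for illustration. :)
let $x := <p>MarkLogic Server stores content. Some write Marklogic
or MARKLOGIC instead.</p>
return
  cts:highlight(
    $x,
    cts:word-query("marklogic", "case-insensitive"),
    <b>{$cts:text}</b>)   (: $cts:text keeps the capitalization found in the node :)
```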
|
|
|
|
cts:remainder(
  [$node as node()]
) as xs:integer
|
 |
Summary:
Returns an estimated search result size for a node,
or for the context node if no node is provided.
The search result size for a node is the number of fragments remaining
(including the current node) in the result sequence containing the node.
This is useful to quickly estimate the size of a search result sequence,
without using fn:count() or xdmp:estimate().
|
Parameters:
$node
(optional):
The node. Typically this is an item in the result sequence of a
cts:search operation. If you specify the first item
from a cts:search expression,
then cts:remainder will return an estimate of the number
of fragments that match that expression.
|
|
Usage Notes:
This function makes it efficient to estimate the size of a search result
and execute that search in the same query. If you only need an estimate of
the size of a search but do not need to run the search, then
xdmp:estimate is more efficient.
To return the estimated size of a search with cts:remainder,
use the first item of a cts:search result sequence as the
parameter to cts:remainder. For example, the following
query returns the estimated number of fragments that contain the word
"dog":
cts:remainder(cts:search(collection(), "dog")[1])
When you put the position predicate on the cts:search result
sequence, MarkLogic Server will filter all of the false positives up to the
specified position, but not the false positives beyond the specified
position. Because of this, when you increase the position number in the
parameter, the result from cts:remainder might decrease
by a larger number than the increase in position number, or it might not
decrease at all. For example, if
the query above returned 10, then the following query might return 9, it
might return 10, or it might return less than 9, depending on how the
results are dispersed throughout different fragments:
cts:remainder(cts:search(collection(), "dog")[2])
If you run cts:remainder on a constructed node, it always
returns 0; it is primarily intended to run on nodes that are retrieved
from the database (an item from a cts:search result or an
item from the result of an XPath expression that searches through the
database).
|
Example:
let $x := cts:search(collection(), "dog")
return
( cts:remainder($x[1]), $x )
=> Returns the estimated number of items in the search
for "dog" followed by the results of the search.
|
Example:
xdmp:document-insert("/test.xml", <a>my test</a>);
for $x in cts:search(collection(),"my test")
return cts:remainder($x) => 1
|
Example:
for $a in cts:search(collection(),"my test")
where $a[cts:remainder() eq 1]
return base-uri($a) => /test.xml
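One possible use is a paginated result with a rough total. The sketch below is an assumption-laden illustration, not a prescribed pattern: the variable names, the page element, and the formula (items skipped before the page plus the remainder at the first item on the page) are all hypothetical, and as noted above the estimate can change as the start position changes.

```xquery
xquery version "1.0-ml";

let $start := 11       (: position of the first item on the page :)
let $page-size := 10
let $results := cts:search(fn:collection(), cts:word-query("dog"))
let $page := $results[$start to $start + $page-size - 1]
let $estimate :=
  if (fn:exists($page))
  then ($start - 1) + cts:remainder($page[1])   (: skipped items + fragments remaining :)
  else ()
return
  <page start="{$start}" estimated-total="{$estimate}">{
    for $doc in $page
    return <uri>{fn:base-uri($doc)}</uri>
  }</page>
```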
|
|
|
|
cts:search(
  $expression as node()*,
  $query as cts:query?,
  [$options as xs:string*],
  [$quality-weight as xs:double]
) as node()*
|
 |
Summary:
Returns a relevance-ordered sequence of nodes specified by a given query.
|
Parameters:
$expression
:
The expression to be searched.
This must be an inline fully searchable path expression.
|
$query
:
The query specifying the search to perform.
|
$options
(optional):
The options to this search. The default is ().
Options include:
"filtered"
Specifies a filtered search (the default). Filtered searches
eliminate any false positives and properly resolve cases where
there are
multiple candidate matches within the same fragment, thereby
guaranteeing that the results fully satisfy the specified
cts:query.
"unfiltered"
Specifies an unfiltered search. An unfiltered search
selects fragments from the indexes that are candidates to satisfy
the specified cts:query, and then it returns
a single node from within each fragment that satisfies the specified
searchable path expression. Unfiltered searches are useful because
of the performance they afford when jumping deep into the
result set (for example, when paginating a long result set and
jumping to the 1,000,000th result). However, depending on the
searchable path expression, the
cts:query specified, the structure of the documents in
the database, and the configuration of the database, unfiltered
searches may result in false positives being included in the
search results. Unfiltered searches may also result in missed
matches or in incorrect matches, especially when there are
multiple candidate matches within a single fragment.
To avoid these problems, you should only use unfiltered searches
on top-level XPath expressions (for example, document nodes,
collections, directories) or on fragment roots. Using unfiltered
searches on complex XPath expressions or on XPath expressions that
traverse below a fragment root can result in unexpected results.
"score-logtfidf"
Compute scores using the logtfidf method (the default scoring
method). This uses the formula:
log(term frequency) * (inverse document frequency)
"score-logtf"
Compute scores using the logtf method. This does not take into
account how many documents have the term and uses the formula:
log(term frequency)
"score-simple"
Compute scores using the simple method. The score-simple
method gives a score of 8*weight for each matching term in the
cts:query expression. It does not matter how
many times a given term matches (that is, the term
frequency does not matter); each match contributes 8*weight
to the score. For example, the following query (assume the
default weight of 1) would give a score of 8 for
any fragment with one or more matches for "hello", a score of 16
for any fragment that also has one or more matches for "goodbye",
or a score of zero for fragments that have no matches for
either term:
cts:or-query(("hello", "goodbye"))
|
$quality-weight
(optional):
The document quality weight to use when computing scores.
The default is 1.0.
|
|
Usage Notes:
Queries that use cts:search require that the XPath expression
searched is fully searchable. A fully searchable path is one that
has no steps that are unsearchable. You can use the
xdmp:query-trace() function to see if the path is fully
searchable. If there are no entries in the xdmp:query-trace()
output indicating that a step is unsearchable, then that path is fully
searchable. Queries that use cts:search on unsearchable
XPath expressions will fail with an error message. You can often make
the path expressions fully searchable by rewriting the query or adding
new indexes.
Each node that cts:search returns has a score with which
it is associated. To access the score, use the cts:score
function. The nodes are returned in relevance order (most relevant to least
relevant), where more relevant nodes have a higher score.
If the options parameter contains neither "filtered" nor "unfiltered",
then the default is "filtered".
If the options parameter contains none of "score-logtfidf", "score-logtf",
or "score-simple", then the default is "score-logtfidf".
If the cts:query specified is the empty string (equivalent
to cts:word-query("")), then the search returns the empty
sequence.
|
Example:
cts:search(//SPEECH,
cts:word-query("with flowers"))
=> a sequence of 'SPEECH' elements that are ancestors (or self)
of any node containing the phrase 'with flowers'.
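The scoring options and cts:score described above can be combined as in the following sketch. It assumes the same SPEECH documents as the previous example. The explicit cts:word-query wrappers and the hit wrapper element are illustrative choices, not part of the original example.

```xquery
(: Return each matching SPEECH together with its score.
   Per the description of "score-simple", each matching term
   contributes 8*weight, regardless of term frequency. :)
for $speech in cts:search(//SPEECH,
    cts:or-query((cts:word-query("hello"),
                  cts:word-query("goodbye"))),
    "score-simple")
return
  (: "hit" is a hypothetical wrapper element used only for display :)
  <hit score="{cts:score($speech)}">{$speech}</hit>
```

Because neither "filtered" nor "unfiltered" is given, the search runs filtered by default.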
|
|
|
|
cts:tokenize(
|
|
$text as xs:string
|
| ) as cts:token* |
|
 |
Summary:
Tokenizes text into words, punctuation, and spaces. Returns output in
the type cts:token, which has subtypes
cts:word, cts:punctuation, and
cts:space, all of which are subtypes of
xs:string.
|
Parameters:
$text
:
The word or phrase to tokenize.
|
|
Usage Notes:
When you tokenize a string with cts:tokenize, each word is
represented by an instance of
cts:word, each set of adjacent punctuation characters
is represented by an instance of cts:punctuation,
each set of adjacent spaces is represented by an instance of
cts:space, and each set of adjacent line breaks
is represented by an instance of cts:space.
Unlike the standard XQuery function fn:tokenize,
cts:tokenize returns words, punctuation, and spaces
as different types. You can therefore use a typeswitch to handle each type
differently. For example, you can use cts:tokenize to remove
all punctuation from a string, or create logic to test for the type and
return different things for different types, as shown in the first
two examples below.
You can use xdmp:describe to show how a given string will be
tokenized. When run on the results of cts:tokenize, the
xdmp:describe function returns the types and the values
for each token. For a sample of this pattern, see the third example below.
|
Example:
(: Remove all punctuation :)
let $string := "The red, blue, green, and orange
balloons were launched!"
let $noPunctuation :=
for $token in cts:tokenize($string)
return
typeswitch ($token)
case $token as cts:punctuation return ""
case $token as cts:word return $token
case $token as cts:space return $token
default return ()
return string-join($noPunctuation, "")
=> The red blue green and orange
balloons were launched
|
Example:
(: Insert the string "XX" before and after
all punctuation tokens :)
let $string := "The red, blue, green, and orange
balloons were launched!"
let $tokens := cts:tokenize($string)
return string-join(
for $x in $tokens
return if ( $x instance of cts:punctuation )
then ( concat("XX",
$x, "XX") )
else ( $x ) , "")
=> The redXX,XX blueXX,XX greenXX,XX and orange
balloons were launchedXX!XX
|
Example:
(: show the types and tokens for a string :)
xdmp:describe(cts:tokenize("blue, green"))
=> (cts:word("blue"), cts:punctuation(","),
cts:space(" "), cts:word("green"))
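Because each token carries its own type, the same approach can count the words in a string. The following is a minimal sketch that reuses the sample sentence from the examples above (written on one line here) and keeps only the cts:word tokens.

```xquery
(: Count the word tokens in a string, ignoring
   punctuation and spaces :)
let $string := "The red, blue, green, and orange balloons were launched!"
return count(
  for $token in cts:tokenize($string)
  where $token instance of cts:word
  return $token)
```

The where clause plays the same role as the typeswitch in the first example, but discards every token that is not a word.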
|
|
|
|
cts:word-match(
|
|
$pattern as xs:string,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the word lexicon that match the wildcard pattern.
This function requires the word lexicon to be enabled. If the word
lexicon is not enabled, an exception is thrown.
|
Parameters:
$pattern
:
The wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- Specifies a case-sensitive match.
- "case-insensitive"
- Specifies a case-insensitive match.
- "diacritic-sensitive"
- Specifies a diacritic-sensitive match.
- "diacritic-insensitive"
- Specifies a diacritic-insensitive match.
- "ascending"
- Specifies that words should be returned in ascending order.
- "descending"
- Specifies that words should be returned in descending order.
- "any"
- Specifies that words in any fragment should be returned.
- "document"
- Specifies that words in document fragments should be returned.
- "properties"
- Specifies that words in properties fragments should be
returned.
- "locks"
- Specifies that words in locks fragments should be returned.
|
$query
(optional):
Only return words that exist in fragments selected by the
cts:query. That is, the words do not need
to match the query, but the words must occur in fragments that match
the query.
|
|
Usage Notes:
Note the following interactions among the options:
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
If the options parameter contains neither "ascending" nor "descending",
then the default is "ascending".
At most one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If the options parameter contains none of "any", "document",
"properties", or "locks", then the default is "any".
If the $query parameter is not present and the current user is assigned
the admin role, no word filtering is performed, so the results may include
words that appear only in deleted fragments not yet expunged by a merge.
To filter out such words, specify a $query parameter.
For example:
cts:and-query( () )
|
Example:
cts:word-match("aardvark*")
=> ("aardvark","aardvarks")
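The options and the $query parameter described above can be combined in one call. The following sketch reuses the pattern from the previous example. The particular combination of options is an illustrative choice, and the results depend on the contents of the word lexicon.

```xquery
(: Match regardless of case, return the words in descending
   order, consider only document fragments, and pass an empty
   cts:and-query so that words found only in deleted fragments
   are filtered out :)
cts:word-match("aardvark*",
    ("case-insensitive", "descending", "document"),
    cts:and-query( () ))
```

Specifying "case-insensitive" explicitly overrides the default of inferring case sensitivity from $pattern.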
|
|
|
|
cts:words(
|
|
[$start as xs:string],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the word lexicon. This function requires the word
lexicon to be enabled. If the word lexicon is not enabled, an
exception is thrown. The words are returned in collation order.
|
Parameters:
$start
(optional):
A starting word. The function returns only this word and the words
that follow it in the lexicon. If this parameter is the empty string,
all of the words are returned. If the starting word is not in the
lexicon, the returned words begin with the next word in the lexicon.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Specifies that words should be returned in ascending order.
- "descending"
- Specifies that words should be returned in descending order.
- "any"
- Specifies that words in any fragment should be returned.
- "document"
- Specifies that words in document fragments should be returned.
- "properties"
- Specifies that words in properties fragments should be
returned.
- "locks"
- Specifies that words in locks fragments should be returned.
|
$query
(optional):
Only return words that exist in fragments selected by the
cts:query. That is, the words do not need
to match the query, but the words must occur in fragments that match
the query.
|
|
Usage Notes:
If the options parameter contains neither "ascending" nor "descending",
then the default is "ascending".
At most one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If the options parameter contains none of "any", "document",
"properties", or "locks", then the default is "any".
If the $query parameter is not present and the current user is assigned
the admin role, no word filtering is performed, so the results may include
words that appear only in deleted fragments not yet expunged by a merge.
To filter out such words, specify a $query parameter.
For example:
cts:words("abc", (), cts:and-query( () ) )
|
Example:
cts:words("aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
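Because the lexicon can be large, it is often useful to take only the first few words after a starting point. The following is a minimal sketch. The starting string "aa" and the limit of 10 are arbitrary illustrative values, and the results depend on the contents of the word lexicon.

```xquery
(: Return at most the first 10 words at or after "aa" in
   ascending order, from document fragments only. The empty
   cts:and-query filters out words that appear only in
   deleted fragments. :)
cts:words("aa",
    ("ascending", "document"),
    cts:and-query( () ))[1 to 10]
```

As described for the $start parameter, "aa" does not need to be in the lexicon. If it is not, the results begin with the next word.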
|
|
|