[MarkLogic Dev General] en/em dashes punctuation?
Danny.Sokolsky at marklogic.com
Thu Jan 26 17:35:09 PST 2012
One thing you can do is change your search grammar so that negation uses a token other than the minus sign.
Here is the default grammar:
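As a rough illustration of that change (a cut-down sketch, not the verbatim default grammar), the options below define a custom grammar in which negation is triggered by "~" instead of "-". The "~" character is an arbitrary, hypothetical choice, and the strengths and attributes are written from memory of the Search API options schema, so check them against the documentation for your MarkLogic version before relying on them.

```xquery
xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Sketch only. Supplying a <grammar> replaces the default one, so anything
   left out here (NEAR, comparison joiners, and so on) is no longer
   recognized. The "~" negation starter is an assumption for illustration;
   with it, "-" should no longer be read as "not". :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <term>
      <term-option>punctuation-insensitive</term-option>
    </term>
    <grammar>
      <quotation>"</quotation>
      <implicit>
        <cts:and-query strength="20" xmlns:cts="http://marklogic.com/cts"/>
      </implicit>
      <starter strength="30" apply="grouping" delimiter=")">(</starter>
      <starter strength="40" apply="prefix" element="cts:not-query">~</starter>
      <joiner strength="10" apply="infix" element="cts:or-query" tokenize="word">OR</joiner>
      <joiner strength="20" apply="infix" element="cts:and-query" tokenize="word">AND</joiner>
      <joiner strength="50" apply="constraint">:</joiner>
    </grammar>
  </options>
return search:parse("Venue - Motion to Transfer", $options)
```

Inspecting the cts:query that search:parse returns is the quickest way to see whether the dash is still being treated specially under the modified grammar.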
From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Will Thompson
Sent: Thursday, January 26, 2012 4:34 PM
To: General MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] en/em dashes punctuation?
Our search autocomplete pulls from doc titles, some of which contain en or em dashes. However, if the dash is "floating" (e.g., "Venue - Motion to Transfer"), search:parse parses it into the query, even though <term-option>punctuation-insensitive</term-option> is included in the <term> section of the search options node. I thought it might just be ignored when the query is evaluated, but it is definitely limiting the query.
I can confirm they are punctuation: cts:tokenize("hyphen-en–em—bar―")[. instance of cts:punctuation] => "- – — ―"
But is there an exception here (the same way hyphens are always parsed to negate)? Do I just need to remove these from the query string before calling search:parse? If there is a cleaner way, that would be great.
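The pre-processing approach asked about above could look something like the following sketch. The helper name local:strip-floating-dashes is hypothetical, and the regular expression assumes that "floating" means a run of dash characters with whitespace on both sides.

```xquery
xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Hypothetical helper: replace any run of dash punctuation (Unicode
   category Pd, which covers the hyphen, en dash, em dash, and horizontal
   bar) that has whitespace on both sides with a single space. Dashes
   inside a word, and a leading "-term" negation, are left alone. :)
declare function local:strip-floating-dashes($qtext as xs:string) as xs:string
{
  fn:replace($qtext, "\s+\p{Pd}+\s+", " ")
};

(: "Venue - Motion to Transfer" becomes "Venue Motion to Transfer" :)
let $qtext := local:strip-floating-dashes("Venue - Motion to Transfer")
return search:parse($qtext)
```

Cleaning the string this way leaves the search grammar untouched, at the cost of having to apply the same helper everywhere a query string is parsed.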