[MarkLogic Dev General] en/em dashes punctuation?

Danny Sokolsky Danny.Sokolsky at marklogic.com
Thu Jan 26 17:35:09 PST 2012


Hi Will,

One thing you can do is change your search grammar so that negation uses a token other than the minus sign (in the default grammar, "-" is the starter that produces a cts:not-query).

Here is the default grammar:

http://docs.marklogic.com/5.0doc/docapp.xqy#display.xqy?fname=http://pubs/5.0doc/xml/search-dev-guide/search-api.xml%2344520
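
As a rough illustration of that idea, here is a sketch of an options node whose grammar is modeled on the default one, but with the negation token changed. The replacement token "!" is an arbitrary choice made for this example, and the strength values and the reduced set of joiners are assumptions that should be checked against the default grammar linked above before use.

```xml
<!-- Sketch only: element and attribute values are modeled on the default
     grammar and should be verified against the documentation linked above. -->
<options xmlns="http://marklogic.com/appservices/search">
  <term>
    <term-option>punctuation-insensitive</term-option>
  </term>
  <grammar>
    <quotation>"</quotation>
    <implicit>
      <cts:and-query strength="20" xmlns:cts="http://marklogic.com/cts"/>
    </implicit>
    <starter strength="30" apply="grouping" delimiter=")">(</starter>
    <!-- Assumption: "!" is an arbitrary replacement for the default "-" -->
    <starter strength="40" apply="prefix" element="cts:not-query">!</starter>
    <joiner strength="10" apply="infix" element="cts:or-query" tokenize="word">OR</joiner>
    <joiner strength="20" apply="infix" element="cts:and-query" tokenize="word">AND</joiner>
    <joiner strength="50" apply="constraint">:</joiner>
  </grammar>
</options>
```

Passed as the second argument to search:parse, options along these lines should mean that "-" is no longer read as negation, so a string such as "Venue - Motion to Transfer" would be handled by the term options instead. Any grammar features left out of the sketch (NEAR, comparison joiners, and so on) would need to be added back if the application relies on them.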

-Danny

-----Original Message-----
From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Will Thompson
Sent: Thursday, January 26, 2012 4:34 PM
To: General MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] en/em dashes punctuation?

Our search autocomplete pulls from doc titles, some of which contain en or em dashes. However, if the dash is "floating" (e.g. "Venue - Motion to Transfer"), search:parse parses it into the query, even though <term-option>punctuation-insensitive</term-option> is included in the <term> section of the search options node. I thought the dash might simply be ignored when the query is evaluated, but it is definitely limiting the query.

I can confirm they are punctuation: cts:tokenize("hyphen-en-em-bar―")[. instance of cts:punctuation] => "- - - ―"

But is there an exception here (the same way hyphens are always parsed to negate)? Do I just need to remove these from the query string before calling search:parse? If there is a cleaner way, that would be great.


Best,

Will