[MarkLogic Dev General] Trailing Spaces Removed from Attribute Values--Bug or Feature?

Eliot Kimber ekimber at reallysi.com
Tue Mar 11 06:50:19 PST 2008


Mike Sokolov wrote:
>   Agreed; however it's not clear that trailing whitespace needs to be 
> preserved in order to be able to search for DITA tokens, as in the 
> original example.  I guess it might depend on just what the tokens 
> consist of but a word- or phrase-search might be able to make use of the 
> implicit tokenization done by the indexer without the need for the 
> trailing whitespace.
> 
> EG:  cts:attribute-word-search(..."topic/topic") ought to match 
> "topic/topic" and not match "mytopic/topic-foo", I think.
>

It's not just a question of what will work in a MarkLogic query but also 
of what consumers of the elements retrieved from MarkLogic will get. For 
example, the usual XSLT pattern for processing DITA content is:

<xsl:template match="*[contains(@class, ' topic/topic ')]">

If I pull content out of MarkLogic and hand it to an XSLT transform (e.g., 
the DITA Open Toolkit), the above match fails for generic topics, 
because the literal value of class= would be "- topic/topic", not 
"- topic/topic ".
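
To make the failure concrete, here is a minimal Python sketch of the 
substring test that contains() performs. The helper name xpath_contains 
and the sample values are made up for illustration; only the pattern 
string comes from the template above.

```python
def xpath_contains(haystack: str, needle: str) -> bool:
    """Mimic XPath 1.0 contains(): a plain substring test."""
    return needle in haystack


# The token is delimited by a space on BOTH sides in the match pattern.
PATTERN = " topic/topic "

stored = "- topic/topic "                       # value with its trailing space intact
retrieved = "- topic/topic"                     # same value after the trailing space is stripped
specialized = "- topic/topic concept/concept "  # hypothetical specialized element

print(xpath_contains(stored, PATTERN))       # True
print(xpath_contains(retrieved, PATTERN))    # False
print(xpath_contains(specialized, PATTERN))  # True
```

The stripped value fails only because topic/topic is the last token, 
which is exactly the generic-topic case.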

Likewise, editors and other tools that expect the trailing space in 
order to bind behavior to elements would fail.

So even in the best case it would be necessary to pass every extracted 
element through a filter that either removes the class= attributes 
entirely (falling back on the schema- or DTD-defined defaults, assuming 
the DTD or schema association is restored or maintained in the result) 
or adds the missing trailing space to the literal class= values in 
the instance.
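
As a rough sketch of the second option, the following Python function 
uses the standard xml.etree.ElementTree module to append the missing 
space. The function name restore_trailing_space and the sample document 
are hypothetical, and the sketch assumes every class= attribute in the 
input is a DITA class attribute; content that uses class= for other 
purposes would need a narrower test.

```python
import xml.etree.ElementTree as ET


def restore_trailing_space(xml_text: str) -> str:
    """Append a trailing space to any non-empty class= value that lacks one.

    Assumption: every class attribute in the input is a DITA class attribute.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        value = elem.get("class")
        # Leave absent, empty, and already-correct values untouched.
        if value and not value.endswith(" "):
            elem.set("class", value + " ")
    return ET.tostring(root, encoding="unicode")


# Hypothetical extraction result: the first value lost its trailing space.
sample = (
    '<topic class="- topic/topic">'
    '<title class="- topic/title ">Example</title>'
    "</topic>"
)
print(restore_trailing_space(sample))
```

The same logic could equally be written as an identity transform in 
XSLT; Python is used here only to keep the example self-contained.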

Cheers,

Eliot

-- 
Eliot Kimber
Senior Solutions Architect
"Bringing Strategy, Content, and Technology Together"
Main: 610.631.6770
www.reallysi.com
www.rsuitecms.com