[MarkLogic Dev General] Is xdml:unquote appropriate for handling accent characters?

Kari Cowan KCowan at alm.com
Thu Feb 9 11:20:39 PST 2017


We’re delivering an RSS feed (so XML).  Unicode will break it.

From: <general-bounces at developer.marklogic.com> on behalf of Christopher Hamlin <cbhamlin at gmail.com>
Reply-To: MarkLogic <general at developer.marklogic.com>
Date: Thursday, February 9, 2017 at 11:19 AM
To: MarkLogic <general at developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Is xdml:unquote appropriate for handling accent characters?

I still think the goal isn't clear (to me).

What are you actually trying to deliver?  HTML, XML, XHTML?

Why do you want &eacute; instead of the Unicode character?

I'd go back to basics:  break the problem down to steps, define what's needed at each step, and see where it goes wrong.


On Thu, Feb 9, 2017 at 2:05 PM, Kari Cowan <KCowan at alm.com> wrote:
I guess I could do a function with a series of replacements
>> fn:replace($Str,"&eacute;","&amp;eacute;")

I was hoping there was a better way.
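
One alternative to a chain of fn:replace calls, offered as a sketch under assumptions and not something proposed in this thread: walk the string's codepoints and emit a numeric character reference for anything outside ASCII. The name local:to-ncr is hypothetical; fn:string-to-codepoints, fn:codepoints-to-string, fn:string-join and fn:concat are standard XQuery functions. Numeric references are used instead of names like &eacute; because a plain XML parser only knows named entities that a DTD declares.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: replaces every non-ASCII character with a
   numeric character reference and returns the result as a string. :)
declare function local:to-ncr($s as xs:string) as xs:string
{
  fn:string-join(
    for $cp in fn:string-to-codepoints($s)
    return
      if ($cp gt 127)
      then fn:concat("&amp;#", xs:string($cp), ";")  (: "&amp;#" is the two characters &# :)
      else fn:codepoints-to-string($cp),
    "")
};

(: The input is written with a character reference so the example
   does not depend on the encoding of the query source. :)
local:to-ncr("Pok&#233;mon")
(: string value: Pok&#233;mon :)
```

The result is an ordinary string. If it is placed inside a constructed element, the serializer will escape the ampersand again, so this approach only helps where the feed text is assembled as a string.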


From: <general-bounces at developer.marklogic.com> on behalf of Kari Cowan <KCowan at alm.com>
Reply-To: MarkLogic <general at developer.marklogic.com>
Date: Thursday, February 9, 2017 at 10:53 AM
To: MarkLogic <general at developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Is xdml:unquote appropriate for handling accent characters?

We stored in the doc as:  Pok&eacute;mon

In XQuery I retrieve it with:
let $theTitle := $doc//ir:HEADLINE/text()

That returns as:
Pokémon

How can I return it as Pok&eacute;mon instead of Pokémon?
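
A quick way to see what the text node presumably holds after parsing is to look at its codepoints. This is a minimal sketch: the literal below is a stand-in for the stored headline, not the actual document, and it assumes the entity was resolved to a single character when the document was loaded.

```xquery
xquery version "1.0-ml";

(: Stand-in for the value of $theTitle; &#233; is the character
   reference for the accented e (U+00E9). :)
let $theTitle := "Pok&#233;mon"
return fn:string-to-codepoints($theTitle)
(: returns 80, 111, 107, 233, 109, 111, 110 :)
```

A single codepoint 233 in the fourth position means the value is one character, so there is no entity left to "return"; how it appears on output is a matter of serialization.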




From: <general-bounces at developer.marklogic.com> on behalf of Christopher Hamlin <cbhamlin at gmail.com>
Reply-To: MarkLogic <general at developer.marklogic.com>
Date: Thursday, February 9, 2017 at 9:44 AM
To: MarkLogic <general at developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Is xdml:unquote appropriate for handling accent characters?


It's still unclear (to me) what is going on.  Here's some stuff I'd try:

Is the title in the ML db?  If so, it's been parsed and stored as UTF-8.

The query console can be misleading, since its output goes through several layers and is formatted for friendly display.

You can check the data by dumping the XML out to disk and inspecting it with whatever tool you normally use for that.  Just use xdmp:save.  Then you know how things are in the db.
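
The xdmp:save check could look like the following minimal sketch. The document URI and the output path are placeholders, not values from this thread, and the file is written on the filesystem of the MarkLogic host.

```xquery
xquery version "1.0-ml";

(: Placeholder URI and path: substitute the real document URI and a
   directory the MarkLogic server process is allowed to write to. :)
xdmp:save("/tmp/headline-check.xml", fn:doc("/feeds/example.xml"))
```

If the saved file is UTF-8, the accented e should show up in a hex viewer as the two bytes C3 A9.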

Then, if you are going through an appserver, just do the request and store the result.  Again, check things out on disk.  Look at the headers returned and the payload.  Is the returned 'stuff' OK in an XML editor?

If that's OK, then what is parsing the return and choking, and why is it choking?

