[MarkLogic Dev General] Script Detection

Michael Blakeley mike at blakeley.com
Wed Jun 13 11:50:24 PDT 2012


I believe the detection uses ICU, if that helps. http://userguide.icu-project.org/conversion/detection has the original documentation.

If you have a sample document that illustrates the problem, you could try xdmp:encoding-language-detect and see what happens.
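
If no sample document is at hand, one way to build one is sketched below in Python. This is not MarkLogic code and says nothing about how MarkLogic or ICU classify these characters. It relies only on the Unicode fact that the fullwidth forms U+FF01 to U+FF5E mirror ASCII 0x21 to 0x7E at a fixed offset of 0xFEE0. The helper names to_fullwidth and fullwidth_positions are made up for illustration.

```python
import unicodedata

# U+FF01..U+FF5E are the fullwidth counterparts of ASCII 0x21..0x7E.
FULLWIDTH_OFFSET = 0xFEE0


def to_fullwidth(text):
    """Map printable ASCII 0x21..0x7E to fullwidth forms; leave other characters alone."""
    return "".join(
        chr(ord(ch) + FULLWIDTH_OFFSET) if 0x21 <= ord(ch) <= 0x7E else ch
        for ch in text
    )


def fullwidth_positions(text):
    """Return the indexes of characters whose East Asian Width property is 'F'."""
    return [
        i for i, ch in enumerate(text)
        if unicodedata.east_asian_width(ch) == "F"
    ]


# A test string that shifts from ordinary ASCII into fullwidth forms.
sample = "plain ASCII then " + to_fullwidth("fullwidth")
print(sample)
print(fullwidth_positions(sample))

# NFKC normalization folds the fullwidth forms back to ASCII.
print(unicodedata.normalize("NFKC", sample))
```

Saving the sample string as UTF-8 and loading it would give a document to try detection against. Comparing results before and after the NFKC step might also show whether the fullwidth forms are what changes the outcome.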

-- Mike

On 13 Jun 2012, at 11:26 , bek wrote:

> Can anyone speak to how well script detection works with fullwidth characters? I can understand detecting a shift to Russian or even a shift to Korean, but what about the characters that reproduce ASCII 0x21 to 0x7E in fullwidth form? Are these detected by MarkLogic as script shifts?


