This is the mail archive of the xsl-list@mulberrytech.com mailing list.


Re: Escaping a utf-8 string


On Sat, Aug 17, 2002 at 02:05:02PM -0600, Mike Brown wrote:
> Wesley W. Terpstra wrote:
> > Again, all would be simple if only I could extract a hexadecimal encoding
> > of the utf-8 string in some way. Getting the ascii numeric value would work,
> > or in fact, a function like "hexify" that takes the string and just
> > outputs AB87B3AF873E or similar would be fine, since I could then stick
> > =s and %s in the stream appropriately.
> > 
> > Does anyone know of a way to get past this barrier in xsl?
> 
> This isn't the solution to your problem, but it might give you some ideas:
> 
>   http://skew.org/xml/stylesheets/url-encode/

Yes, I have seen this trick.

> It essentially uses string-length() and substring-before() to find the
> index of a character in a string, then applies a little arithmetic on
> that number to arrive at character indices in another string to produce
> the output.

Unfortunately, this supports nothing beyond 7-bit ASCII.
I want to support the full range of characters that UTF-8 can encode.
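
For concreteness, here is the offset trick sketched outside XSLT. This is
only an illustration in Python, not the stylesheet at the URL above: the
names ASCII, HEX_DIGITS, and ascii_hex are made up for this sketch, and
str.partition stands in for substring-before(). It also shows the
limitation: a character that is missing from the lookup string has no
offset, so nothing can be computed for it.

```python
# Lookup string: printable ASCII in code point order, space (0x20) to tilde (0x7E).
ASCII = "".join(chr(c) for c in range(0x20, 0x7F))
HEX_DIGITS = "0123456789ABCDEF"


def ascii_hex(ch):
    """Return the two-digit hex code of one printable ASCII character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # Same idea as string-length(substring-before($ascii, $ch)) in XSLT 1.0:
    # the length of the text before the match is the character's offset.
    before, found, _ = ASCII.partition(ch)
    if not found:
        raise ValueError("character is not in the lookup string: %r" % ch)
    code = len(before) + 0x20  # offset 0 corresponds to the space character
    return HEX_DIGITS[code // 16] + HEX_DIGITS[code % 16]
```

Extending the lookup string beyond ASCII would work in principle, but each
lookup scans the string from the start, so the cost grows with the number
of characters supported.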

> If your goal is to produce UTF-8 sequences, it will be a little trickier,
> and you are going to have to decide what reasonable subset of Unicode is
> worth supporting in this manner, but it should be possible in pure XSLT.

Indeed; however, finding the string offset is not a workable solution.
It takes too long when the set of characters is large.

I have considered using a binary search to find the char, but as far as I
have found there is no "unicode numeric code -> string" function in xsl
either, so that will not work. If such a function did exist, this would be
relatively speedy.

Also, I would then have to handle the annoying unicode number -> utf-8
encoding -> hex conversion myself, but this is ok I suppose.
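
To make that last step concrete, here is a rough sketch of the unicode
number -> utf-8 -> hex arithmetic, written in Python rather than XSLT since
the arithmetic is the same either way. The names utf8_bytes and hexify are
hypothetical (hexify borrows the name from the wish above), and the prefix
parameter is an assumption of this sketch, used to produce the "=" or "%"
escapes. It assumes the numeric code point is already known, which is
exactly the part that is hard to get in xsl.

```python
def utf8_bytes(cp):
    """Encode one Unicode code point as a list of UTF-8 byte values."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %r" % cp)
    if cp < 0x80:        # 1 byte:  0xxxxxxx
        return [cp]
    if cp < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return [0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)]
    if cp < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return [0xE0 | (cp >> 12),
                0x80 | ((cp >> 6) & 0x3F),
                0x80 | (cp & 0x3F)]
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return [0xF0 | (cp >> 18),
            0x80 | ((cp >> 12) & 0x3F),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F)]


def hexify(text, prefix=""):
    """Return the UTF-8 bytes of text as uppercase hex, each byte prefixed."""
    return "".join(prefix + format(b, "02X")
                   for ch in text
                   for b in utf8_bytes(ord(ch)))
```

With an empty prefix this gives the plain hex string; hexify(text, "%")
gives the URL-style form and hexify(text, "=") the quoted-printable-style
form, with every byte escaped.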

-- 
Wesley W. Terpstra <wesley@terpstra.ca>

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

