This is the mail archive of the xsl-list@mulberrytech.com mailing list.
RE: string question
- To: <xsl-list at lists dot mulberrytech dot com>
- Subject: RE: [xsl] string question
- From: "Michael Kay" <mhkay at iclway dot co dot uk>
- Date: Wed, 21 Feb 2001 14:13:46 -0000
- Reply-To: xsl-list at lists dot mulberrytech dot com
> From this xml (those IP persons might recognize this):
> <b310>08/773,384</b310>
>
> I need:
> PR_08/773,384 PR_08 PR_773 PR_384
>
> Basically, all punctuation (and spaces) are "segment"
> delimiters. Starting
> with entire content, each segment is then prefixed with PR_.
For this sort of tokenizing you need to write recursive templates.
Something like:
<xsl:template match="b310">
  <!-- Output the whole value first, prefixed with PR_ -->
  <xsl:value-of select="concat('PR_', ., ' ')"/>
  <!-- Map each delimiter to a space (six delimiters, six spaces),
       collapse runs of spaces, then append one trailing space -->
  <xsl:call-template name="do-segments">
    <xsl:with-param name="s" select="concat(normalize-space(
        translate(., '/_:;,~', '      ')), ' ')"/>
  </xsl:call-template>
</xsl:template>
<xsl:template name="do-segments">
  <xsl:param name="s"/>
  <!-- An empty string tests false, which ends the recursion -->
  <xsl:if test="$s">
    <xsl:value-of select="concat('PR_', substring-before($s, ' '), ' ')"/>
    <xsl:call-template name="do-segments">
      <xsl:with-param name="s" select="substring-after($s, ' ')"/>
    </xsl:call-template>
  </xsl:if>
</xsl:template>
What this does is, first, replace all delimiters with single spaces and add
a single trailing space. It then calls a template that outputs the first
token and calls itself to process any remaining tokens, terminating when
supplied with an empty string. For 08/773,384 the template receives
"08 773 384 " and outputs PR_08, PR_773 and PR_384 in turn.
If you're in a hurry, some processors have an extension function such as
saxon:tokenize().
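A rough sketch of that shortcut, for comparison with the recursive version.
It assumes Saxon 6.x, where the extension namespace is http://icl.com/saxon
and saxon:tokenize() takes an optional second argument listing the delimiter
characters and returns one text node per token. Both of those are assumptions
here, so check your processor's documentation before relying on them:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon">
  <xsl:output method="text"/>
  <xsl:template match="b310">
    <!-- Whole value first, prefixed with PR_ -->
    <xsl:value-of select="concat('PR_', ., ' ')"/>
    <!-- Assumed signature: saxon:tokenize(string, delimiter-characters) -->
    <xsl:for-each select="saxon:tokenize(., '/_:;,~ ')">
      <xsl:value-of select="concat('PR_', ., ' ')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

This is not portable: a processor without the extension function will reject
the stylesheet, so the recursive template remains the safe choice.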
Mike Kay
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list