This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Re: illegal elements must go...
- To: Jukka dot T dot Lehtinen at nokia dot com
- Subject: Re: illegal elements must go...
- From: Jeni Tennison <mail at jenitennison dot com>
- Date: Fri, 25 Aug 2000 12:53:16 +0100
- Cc: xsl-list at mulberrytech dot com
- Reply-To: xsl-list at mulberrytech dot com
Jukka,
>I'm having parent element which can have many childrens (in source). But the
>result side (in DTD) there are fewer possible elements. So, solution I'm
>gonna do is to put those 'illegal' (in result side) elements childrens of
>para element(s).
You can think of this as a grouping problem and apply a grouping solution
to it. You're grouping all the illegal elements and text within a 'para'
element. As with all grouping problems, you have to ask yourself what is
unique about these particular nodes that puts them in a group with each other?
The answer in cases like this is the identity of some following node. In
your example:
<entry>
  <para>sometext</para>
  <jibii>sometext</jibii>
  sometext without tags - illegal
  <zzz>illegal element in result</zzz>
  <xxx>another illegal elem</xxx>
  <jibii>this elem is good</jibii>
  <xxx>another illegal</xxx>
</entry>
'sometext without tags - illegal' and the following 'zzz' and 'xxx'
elements all have the same preceding legal element
(<jibii>sometext</jibii>) and the same following legal element
(<jibii>this elem is good</jibii>). So you can use this fact to group the
nodes together.
As usual I'll use the Muenchian Method and define a key:
<xsl:key name="illegal-nodes"
         match="xxx | zzz | entry/text()[normalize-space(.)]"
         use="generate-id(following-sibling::*[name() = 'para' or
                                               name() = 'jibii'])" />
The key matches the illegal nodes that you know about; change the match
pattern to cover whatever nodes are illegal in your result DTD. Note that
I've selected only text that has some non-whitespace content. The key
value is the unique id of the first legal element (a 'para' or a 'jibii')
that follows the matched illegal node. You could put a more complex
predicate there in order to recognise other legal elements.
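To see which group each illegal node falls into, a throwaway diagnostic stylesheet along the following lines may help. It is only a sketch: the 'key-values', 'node' and 'group' names are made up for illustration, and the ids that generate-id() returns differ from one processor to another.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="xml" indent="yes" />

<!-- Diagnostic only: lists every illegal child of 'entry' together with
     the value that the 'illegal-nodes' key would compute for it -->
<xsl:template match="entry">
  <key-values>
    <xsl:for-each select="xxx | zzz | text()[normalize-space(.)]">
      <!-- 'group' is a hypothetical attribute name; its value is the id of
           the first following legal sibling, or '' if there is none -->
      <node group="{generate-id(following-sibling::*[self::para or
                                                     self::jibii])}">
        <xsl:value-of select="normalize-space(.)" />
      </node>
    </xsl:for-each>
  </key-values>
</xsl:template>

</xsl:stylesheet>
```

Nodes that report the same 'group' value belong in the same 'para'. Nodes with no legal element after them should report an empty 'group'.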
Thus, within a template that matches on a legal element, you can use:
<xsl:variable name="preceding-illegal-nodes"
              select="key('illegal-nodes', generate-id())" />
<xsl:if test="$preceding-illegal-nodes">
  <para><xsl:copy-of select="$preceding-illegal-nodes" /></para>
</xsl:if>
The variable $preceding-illegal-nodes holds the illegal nodes that precede
the current legal element, identifying them through its unique identifier.
If there are such nodes, a copy of them is placed within a 'para' element.
You also need to make sure to copy any illegal nodes that come at the end
of the entry, so within the 'entry'-matching template similarly have:
<xsl:variable name="ending-illegal-nodes"
              select="key('illegal-nodes', '')" />
<xsl:if test="$ending-illegal-nodes">
  <para><xsl:copy-of select="$ending-illegal-nodes" /></para>
</xsl:if>
The key value of '' gets all those nodes that were given a key value that
was the result of calling generate-id() on an empty node set.
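The XSLT 1.0 Recommendation says that generate-id() returns the empty string when its argument is an empty node set. A minimal way to check what a given processor does is sketched below; the 'result' element and 'empty-id' attribute are hypothetical names, and '/..' is used only because the root node never has a parent, so it always selects nothing.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:template match="/">
  <!-- /.. is an empty node set, so a conforming processor should put
       nothing between the square brackets -->
  <result empty-id="[{generate-id(/..)}]" />
</xsl:template>

</xsl:stylesheet>
```

A conforming processor should write 'empty-id="[]"' for any input document.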
With those in place, you just need to be sure that you're only applying
templates to the legal nodes. The final stylesheet is:
----
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:key name="illegal-nodes"
         match="xxx | zzz | entry/text()[normalize-space(.)]"
         use="generate-id(following-sibling::*[name() = 'para' or
                                               name() = 'jibii'])" />

<xsl:template match="entry">
  <entry>
    <xsl:apply-templates select="para | jibii" />
    <xsl:variable name="ending-illegal-nodes"
                  select="key('illegal-nodes', '')" />
    <xsl:if test="$ending-illegal-nodes">
      <para><xsl:copy-of select="$ending-illegal-nodes" /></para>
    </xsl:if>
  </entry>
</xsl:template>

<xsl:template match="*">
  <xsl:variable name="preceding-illegal-nodes"
                select="key('illegal-nodes', generate-id())" />
  <xsl:if test="$preceding-illegal-nodes">
    <para><xsl:copy-of select="$preceding-illegal-nodes" /></para>
  </xsl:if>
  <xsl:copy-of select="."/>
</xsl:template>

</xsl:stylesheet>
----
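Run against the sample 'entry' from the question (as well-formed XML), the stylesheet should produce something like the result below. This is worked out by reading the stylesheet rather than copied from a processor run, and the indentation is added for readability, so the exact whitespace will differ.

```xml
<entry>
  <para>sometext</para>
  <jibii>sometext</jibii>
  <para>
    sometext without tags - illegal
    <zzz>illegal element in result</zzz>
    <xxx>another illegal elem</xxx>
  </para>
  <jibii>this elem is good</jibii>
  <para><xxx>another illegal</xxx></para>
</entry>
```

The first new 'para' holds the three illegal nodes whose next legal sibling is the second 'jibii'. The last one holds the trailing 'xxx', which was keyed on the empty string.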
This works in SAXON. It doesn't work in Xalan, which ignores the illegal
nodes that occur at the end of the 'entry': either Xalan doesn't produce an
empty string when generate-id() is called on an empty node set, or it
doesn't like having key values that are empty strings.
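If empty-string key values turn out to be the problem, one possible workaround is to select the trailing illegal nodes directly in the 'entry' template instead of going through the key. The following is a sketch only, not tested in Xalan, and it assumes the same illegal and legal element names as above:

```xml
<xsl:template match="entry">
  <entry>
    <xsl:apply-templates select="para | jibii" />
    <!-- Illegal children with no legal element after them, selected
         directly so that no key lookup on '' is involved -->
    <xsl:variable name="ending-illegal-nodes"
                  select="(xxx | zzz | text()[normalize-space(.)])
                          [not(following-sibling::*[self::para or
                                                    self::jibii])]" />
    <xsl:if test="$ending-illegal-nodes">
      <para><xsl:copy-of select="$ending-illegal-nodes" /></para>
    </xsl:if>
  </entry>
</xsl:template>
```

This selection is also limited to the children of the current 'entry'. By contrast, key('illegal-nodes', '') returns every node in the document that has that key value, which matters if the source contains more than one 'entry' with trailing illegal nodes.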
I think that this solution sits fairly well alongside the other solutions
that were proposed.
Cheers,
Jeni
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list