This is the mail archive of the xsl-list@mulberrytech.com mailing list.



Re: illegal elements must go...


Jukka,

>I'm having parent element which can have many childrens (in source). But the
>result side (in DTD) there are fewer possible elements. So, solution I'm
>gonna do is to put those 'illegal' (in result side) elements childrens of
>para element(s).

You can think of this as a grouping problem and apply a grouping solution
to it.  You're grouping all the illegal elements and text within a 'para'
element.  As with all grouping problems, you have to ask yourself what is
unique about these particular nodes that puts them in a group with each other?

The answer in cases like this is the identity of some following node.  In
your example:

<entry>
  <para>sometext</para>
  <jibii>sometext</jibii>
  sometext without tags - illegal
  <zzz>illegal element in result</zzz>
  <xxx>another illegal elem</xxx>
  <jibii>this elem is good</jibii>
  <xxx>another illegal</xxx>
</entry>

'sometext without tags - illegal' and the following 'zzz' and 'xxx'
elements all have the same preceding legal element
(<jibii>sometext</jibii>) and the same following legal element
(<jibii>this elem is good</jibii>).  So you can use this fact to group the
nodes together.

As usual I'll use the Muenchian Method and define a key:

<xsl:key name="illegal-nodes"
         match="xxx | zzz | entry/text()[normalize-space(.)]"
         use="generate-id(following-sibling::*[name() = 'para' or
                                               name() = 'jibii'])" />

The key matches the illegal nodes that you know about; change the match
pattern to cover whatever nodes are illegal in your result.  Note that I've
selected only text that actually has some non-whitespace content.  The key
value is the unique id of the first legal element (a 'para' or a 'jibii')
that follows the matched illegal node: generate-id() applied to a node set
uses the node that comes first in document order.  You could put something
more complex there in order to match other legal nodes.
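
As a rough sketch of that generalisation, the key could match every child
of 'entry' that is not legal, instead of listing the illegal names one by
one.  This assumes that 'para' and 'jibii' are the only legal children of
'entry', which is a reading of the example above and not something taken
from your DTD:

<xsl:key name="illegal-nodes"
         match="entry/*[not(self::para or self::jibii)] |
                entry/text()[normalize-space(.)]"
         use="generate-id(following-sibling::*[self::para or
                                               self::jibii][1])" />

The [1] only makes explicit that the nearest following legal sibling is
wanted; generate-id() would pick that node anyway.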

Thus, within a template that matches on a legal element, you can use:

  <xsl:variable name="preceding-illegal-nodes"
                select="key('illegal-nodes', generate-id())" />
  <xsl:if test="$preceding-illegal-nodes">
    <para><xsl:copy-of select="$preceding-illegal-nodes" /></para>
  </xsl:if>

The variable $preceding-illegal-nodes holds the illegal nodes that precede
the current legal element, identified through its unique identifier.  If
there are such nodes, a copy of them is placed within a 'para' element.

You also need to make sure to copy any illegal nodes that come at the end
of the entry, so the 'entry'-matching template similarly needs:

    <xsl:variable name="ending-illegal-nodes"
                  select="key('illegal-nodes', '')" />
    <xsl:if test="$ending-illegal-nodes">
      <para><xsl:copy-of select="$ending-illegal-nodes" /></para>
    </xsl:if>

The key value of '' gets all those nodes that were given a key value that
was the result of calling generate-id() on an empty node set.
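
One thing to watch if your document holds more than one 'entry': key()
looks across the whole document, so key('illegal-nodes', '') returns the
trailing illegal nodes of every entry, not just the current one.  A
possible fix, sketched here and not tested against your real data, is to
keep only those nodes whose parent is the current 'entry':

    <xsl:variable name="ending-illegal-nodes"
                  select="key('illegal-nodes', '')
                            [generate-id(..) = generate-id(current())]" />

With a single 'entry', as in your example, the extra predicate makes no
difference.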

With those in place, you just need to be sure that you're only applying
templates to the legal nodes.  The final stylesheet is:

----
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:key name="illegal-nodes"
         match="xxx | zzz | entry/text()[normalize-space(.)]"
         use="generate-id(following-sibling::*[name() = 'para' or
                                               name() = 'jibii'])" />

<xsl:template match="entry">
  <entry>
    <xsl:apply-templates select="para | jibii" />
    <xsl:variable name="ending-illegal-nodes"
                  select="key('illegal-nodes', '')" />
    <xsl:if test="$ending-illegal-nodes">
      <para><xsl:copy-of select="$ending-illegal-nodes" /></para>
    </xsl:if>
  </entry>
</xsl:template>

<xsl:template match="*">
  <xsl:variable name="preceding-illegal-nodes"
                select="key('illegal-nodes', generate-id())" />
  <xsl:if test="$preceding-illegal-nodes">
    <para><xsl:copy-of select="$preceding-illegal-nodes" /></para>
  </xsl:if>
  <xsl:copy-of select="."/>
</xsl:template>

</xsl:stylesheet>
----
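
For reference, running this over a well-formed version of your example
should give a result along these lines.  It is reindented here for
readability; the actual whitespace and any XML declaration depend on how
the processor serialises the output:

<entry>
  <para>sometext</para>
  <jibii>sometext</jibii>
  <para>
    sometext without tags - illegal
    <zzz>illegal element in result</zzz>
    <xxx>another illegal elem</xxx>
  </para>
  <jibii>this elem is good</jibii>
  <para>
    <xxx>another illegal</xxx>
  </para>
</entry>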

This works in SAXON.  It doesn't work in Xalan, which ignores the illegal
nodes that occur at the end of the 'entry': either Xalan doesn't produce an
empty string when generate-id() is called on an empty node set, or it
doesn't like having key values that are empty strings.
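
If you have to run on a processor that behaves like that, one way round it
is to pick up the trailing nodes without using the key at all.  The
following is only a sketch, untested in Xalan: within the 'entry'-matching
template it selects the illegal children that have no legal sibling after
them, so it does not depend on how empty key values are handled:

    <xsl:variable name="ending-illegal-nodes"
                  select="(xxx | zzz | text()[normalize-space(.)])
                            [not(following-sibling::para or
                                 following-sibling::jibii)]" />

The rest of the template stays as it is.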

I think that this solution sits fairly well alongside the other solutions
that were proposed.

Cheers,

Jeni



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
