This is the mail archive of the xsl-list@mulberrytech.com mailing list .



How to efficiently remove "a" nodes with no "b" descendants



Hello everybody,

I would like to find an efficient way to remove elements of type "a"
which have no descendants of type "b" from an XML structure using XSLT
1.0. 

<!--
Here is an example input:
-->
<a>
   <b/>
   <c/>
   <a>
      <a>
         <a>
            <c>
               <b/>
            </c>
         </a>
         <a>
            <c/>
            <a>
               <c/>
            </a>
         </a>
         <a>      
            <b/>
            <c/>
         </a>
         <c/>
      </a>
      <c/>
   </a>
</a>

<!--
This is the desired output:
-->
<?xml version="1.0" encoding="ISO-8859-1"?>
<a>
   <b></b>
   <c></c>
   <a>
      <a>
         <a>
            <c>
               <b></b>
            </c>
         </a>
         
         <a>      
            <b></b>
            <c></c>
         </a>
         <c></c>
      </a>
      <c></c>
   </a>
</a>

<!--
The obvious way to do it would be using a template like 

<xsl:template match="a[not(.//b)]"/>

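However, I wouldn't call that efficient: at least in a naive
implementation, the predicate re-scans the whole subtree of every "a"
element, so deeply nested "a" elements get traversed over and over.
This becomes a real-world problem when the input file is large.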

Here is a DOM program which seems to do what I want. I just can't
figure out how to do it using XSLT 1.0:
-->

package fi.vtt.tte.pipex.ext;

import java.io.IOException;
import java.util.Properties;
import java.util.Vector;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

public class Compact2 implements DOMExtension
{
    public Document transform(Document doc, Properties p) throws SAXException, IOException
    {
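        // Collect, in one pass, every "a" element that has no "b" descendant.
        Vector v = new Vector();
        prepareNode(doc.getDocumentElement(), v);
        // Detach each collected element from its parent.
        while (!v.isEmpty()) {
            Node n = (Node)v.firstElement();
            n.getParentNode().removeChild(n);
            v.removeElement(n);
        }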
        return doc;
    }

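    /**
     * Walks the subtree below n once. Returns true if n has at least one
     * "b" descendant; every "a" child without one is added to v.
     * Note that n itself (e.g. the document element) is never queued.
     */
    boolean prepareNode(Node n, Vector v)
    {
        boolean include = false;
        Node child;
        for (child = n.getFirstChild(); child != null; child = child.getNextSibling()) {
           // b: the child is a "b" itself or contains one somewhere below.
           boolean b;
           b = prepareNode(child, v) || "b".equals(child.getNodeName());
           if (!b && "a".equals(child.getNodeName())) {
              v.addElement(child);
           }
           include |= b;
        }
        return include;
    }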
}
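
The class above depends on the DOMExtension interface, which is not shown here. As a rough way to try the same single-pass idea on its own, here is a minimal standalone sketch run against the example input. The names PruneDemo, collect, prune and countA are hypothetical and only for illustration; the sketch assumes the standard JAXP parser that ships with the JDK.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical demo class; mirrors the logic of prepareNode() above.
public class PruneDemo {

    // Returns true if n has a "b" descendant. Every "a" child of any
    // visited node that has no "b" descendant is added to doomed.
    public static boolean collect(Node n, List<Node> doomed) {
        boolean hasB = false;
        for (Node child = n.getFirstChild(); child != null; child = child.getNextSibling()) {
            boolean childHasB = collect(child, doomed) || "b".equals(child.getNodeName());
            if (!childHasB && "a".equals(child.getNodeName())) {
                doomed.add(child);
            }
            hasB |= childHasB;
        }
        return hasB;
    }

    // Parses the XML string, removes the collected "a" elements, and
    // returns the modified document. The document element is never removed.
    public static Document prune(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<Node> doomed = new ArrayList<>();
            collect(doc.getDocumentElement(), doomed);
            for (Node n : doomed) {
                n.getParentNode().removeChild(n);
            }
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts all "a" elements in the document, including the root.
    public static int countA(Document doc) {
        return doc.getElementsByTagName("a").getLength();
    }

    public static void main(String[] args) {
        // The example input from this message, with whitespace stripped.
        String xml = "<a><b/><c/><a><a><a><c><b/></c></a>"
                   + "<a><c/><a><c/></a></a>"
                   + "<a><b/><c/></a><c/></a><c/></a></a>";
        Document doc = prune(xml);
        System.out.println(countA(doc)); // prints 5 (the input has 7 "a" elements)
    }
}
```

Each node is visited once during collection, so the work grows linearly with the size of the document rather than with the depth of "a" nesting.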

<!--
I'm starting to feel that I'm missing some trivial solution or *maybe*
this is not possible at all (just trying to provoke you a bit... ;-).

Well, I might as well post my current attempt... This probably is
quite broken:
-->

<?xml version="1.0"?> 

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="*|@*" priority="-1">
   <xsl:copy>
     <xsl:apply-templates select="@*|*|text()"/>
   </xsl:copy>
</xsl:template>

<xsl:template match="*">
   <xsl:variable name="substr">
      <xsl:apply-templates/>
   </xsl:variable>

   <xsl:variable name="include">
      <xsl:if test="name()='b' or $substr/*/@include='yes'">yes</xsl:if>
   </xsl:variable>

   <xsl:if test="not(name()='a') or $include='yes'">
      <xsl:element name="{name()}">
         <xsl:attribute name="include"><xsl:value-of 
                                    select="$include"/></xsl:attribute>
         <xsl:copy-of select="$substr"/>
      </xsl:element>
   </xsl:if>
</xsl:template>

</xsl:stylesheet>

<!--

Not surprisingly, Xalan says:
"Can't process Can not convert #UNKNOWN to a NodeList!"

Still there? Thanks a lot for your time,

Teppo

--
Teppo Peltonen <mailto:teppo.peltonen@vtt.fi>     phone 09 4566080
VTT Information Technology                        mobile 040 5651878
Tekniikantie 4 B, P.O.Box 1201, Espoo 02044 VTT   telefax 09 4567052


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

