This is the mail archive of the xsl-list@mulberrytech.com mailing list .



RE: Transforming HTML to NITF


Hi Adam,

Here's the solution to the problem:

The input xml doc:
-----------------
<body>
<p> this is some text</p>
<ul>
<li>item 1</li>
</ul>
this is <em>emphasis</em> some more <b>text</b><br/><br/>
<p>This is a new paragraph</p>
</body>

The stylesheet:
--------------
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" omit-xml-declaration="yes"/>

<!-- Drop whitespace-only text directly inside body, so that the
     indentation between blocks is not wrapped in empty paragraphs -->
<xsl:strip-space elements="body"/>

 <xsl:template match="body">
  <xsl:copy>
   <xsl:apply-templates select="node()"/>
  </xsl:copy>
 </xsl:template>

 <!-- Any node that is neither a block element nor inside one -->
 <xsl:template
    match="node()[not(self::p or self::table or self::ul or self::ol or self::body)
           and not(ancestor::*[self::p or self::table or self::ul or self::ol])]">
  <xsl:choose>
   <!-- Only the first node of a run opens a p; the others output nothing here -->
   <xsl:when test="position()=1 or not(preceding-sibling::node()[1]
                    [not(self::p or self::table or self::ul or self::ol)])">
    <p>
     <xsl:copy-of select="."/>

     <!-- The block element that ends this run, if there is one -->
     <xsl:variable name="endOfGroup"
       select="(following-sibling::node()[self::p or self::table or self::ul or self::ol])[1]"/>

     <xsl:variable name="endGroupPosition"
       select="count($endOfGroup/preceding-sibling::node())"/>

     <!-- Copy the rest of the run; when no block element follows,
          the run extends to the last child of body -->
     <xsl:apply-templates mode="following"
       select="following-sibling::node()
               [not($endOfGroup) or
                count(preceding-sibling::node()) &lt; $endGroupPosition]"/>
    </p>
   </xsl:when>
  </xsl:choose>
 </xsl:template>

 <xsl:template mode="following" match="node()">
  <xsl:copy-of select="."/>
 </xsl:template>

 <xsl:template match="node()">
  <xsl:copy-of select="."/>
 </xsl:template>

</xsl:stylesheet>


The result:
----------
<body><p> this is some text</p><ul>
<li>item 1</li>
</ul><p>
this is <em>emphasis</em> some more <b>text</b><br /><br /></p><p>This is a new paragraph</p></body>
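
Adam asked for the wrapped line preferably without the <br>s, and the
stylesheet above keeps them. What follows is only a sketch of one possible
variant, not a tested solution: it walks each run of loose nodes one sibling
at a time and leaves out any <br> that is a direct child of <body>. The mode
name "in-group", the explicit priorities and the xsl:strip-space declaration
are assumptions made for this sketch; the last one is there so that
whitespace-only text between the block elements is not treated as loose
content.

A variant without the <br>s (sketch):
------------------------------------
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" omit-xml-declaration="yes"/>

<!-- Assumption: whitespace-only text directly inside body can be ignored -->
<xsl:strip-space elements="body"/>

 <xsl:template match="body">
  <xsl:copy>
   <xsl:apply-templates select="node()"/>
  </xsl:copy>
 </xsl:template>

 <!-- Block elements allowed as children of the NITF body: copy unchanged -->
 <xsl:template priority="2"
    match="body/*[self::p or self::table or self::ul or self::ol]">
  <xsl:copy-of select="."/>
 </xsl:template>

 <!-- Any other child of body. Only a node whose nearest preceding sibling
      is missing or is a block element starts a new paragraph -->
 <xsl:template priority="1" match="body/node()">
  <xsl:if test="not(preceding-sibling::node()[1]
                    [not(self::p or self::table or self::ul or self::ol)])">
   <p>
    <xsl:apply-templates select="." mode="in-group"/>
   </p>
  </xsl:if>
 </xsl:template>

 <!-- Copy the current node unless it is a br, then move on to the next
      sibling as long as that sibling is not a block element -->
 <xsl:template mode="in-group" match="node()">
  <xsl:if test="not(self::br)">
   <xsl:copy-of select="."/>
  </xsl:if>
  <xsl:apply-templates mode="in-group"
     select="following-sibling::node()[1]
             [not(self::p or self::table or self::ul or self::ol)]"/>
 </xsl:template>

</xsl:stylesheet>

Because each step looks only at the next sibling, a run that reaches the end
of <body> needs no special handling. Two caveats: a run made up only of <br>
elements would still produce an empty <p/>, and a <br> in the middle of a run
is dropped as well, so this fits only if line breaks inside the new paragraph
are not wanted.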

Hope this helped.

Cheers,
Dimitre Novatchev.

Adam Van Den Hoven wrote:

Since the body of NITF (News Industry Text Format, a standard format for
news content) is a lot like HTML (in its simplest form), I'm allowing my
users to create NITF using an HTML parser. I then pass the HTML through HTML
Tidy to make it well-formed XML and then through an XSL stylesheet to make
it NITF.

I have come across a problem that I don't know how to fix, and I need the
community's help.

NITF has a <content.body> tag, which is equivalent to HTML's <body> tag.
However, its children are far more rigidly defined, in that it allows only
elements as children. For my purposes, I'm allowed <p>, <table>, <ul> and <ol>
tags (there are others, but we don't use them yet).

After passing the HTML through HTML Tidy, I might get something like:

<body>
<p> this is some text</p>
<ul>
<li>item 1</li>
</ul>
this is <em>emphasis</em> some more <b>text</b><br/><br/>
<p>This is a new paragraph</p>
</body>
This would occur if I started with:
<body>
<p> this is some text
<ul>
<li>item 1</li>
</ul>
this is <em>emphasis</em> some more <b>text</b></p>
<p>This is a new paragraph</p>
</body>

I need to get the line:
this is <em>emphasis</em> some more <b>text</b><br/><br/>
to end up wrapped in <p> tags (preferably without the <br>s).

For clarity, the children of the body are:
     p
     ul 
|    text()	
|    em	
|    text()	
|    b	
|    br	
|    br	
     p	

I need to work with those tags that have the | beside them as a single
block so that I can wrap the entire thing in a <p> tag. I don't know
the placement, the order, or even the frequency of such situations (there
is no reason why I couldn't have more blocks that need to be grouped
together), so the solution needs to be general.


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

