This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Re: Future XSLT expansion. ( Re: Microsoft XSL and Conformance )

To: xsl-list at mulberrytech dot com
Subject: Re: Future XSLT expansion. ( Re: Microsoft XSL and Conformance )
From: Paul Tchistopolskii <paul at qub dot com>
Date: Fri, 17 Mar 2000 15:13:18 -0800
Organization: The Qub Group
References: <NBBBJPGDLPIHJGEHAKBAKEGGFBAA.martind@netfolder.com>
Reply-To: xsl-list at mulberrytech dot com

 Hi Didier,

> Paul said:
> Last but not least.
> 
> The functionality you are asking for:
> 
> document('http://www.moreover.com/parameters-here')/url
> 
> Is equivalent to:
> 
> $var = document('http://www.moreover.com/parameters-here');
> select $var/url
> 
> This can already be done with XSLT without polluting the semantics
> of the document() function.
> 
> Didier replies:
> What? You are telling me that using the document function is polluting the
> semantics of the document function??? 

Yes. I think the document() function in the form document()/something/here 
( and the document() function itself ) has polluted the 'core' semantics 
of a 'right' document() function, and I will now explain this in detail 
below.
 
> Come on Paul, take a walk, read the specs again or do something. 

Why should I? I think I understand what I'm saying, I also understand 
what you are saying, and I understand what the specs are saying.

I tried to explain why hacks in the core are wrong and why it is better 
to have a reasonable core. And below I will once again explain why 
document() has 'polluted', 'perlish' semantics.

> I simply use the document function as it is
> stated in the specs. The document function takes a URL as a parameter and
> returns a node list. Bottom line. The specs do not restrict the usage of
> the document function to certain URLs. Moreover, the document function is a
> valid construct to be included in an XPath step because it is part of
> XPath!!! Isn't it? Please, take a look again at the XPath specs. An XPath
> expression such as document(..)/elementx/elementy is a valid XPath
> expression. Isn't it? 

Right, right. It is valid. It just works for documents and only documents + 
it is not scalable. See below.

> And saying that
> 
> $var = document('http://www.moreover.com/parameters-here');
> select $var/url
> 
> is better than
> document('http://www.moreover.com/parameters-here')/url
> 
> is a matter of personal taste and programming style. 

No it is not. The first construction is scalable, the second is not. 

> I wouldn't say that the
> former is better than the latter or vice versa. They are simply equivalent.

Yes, they are, if you ignore scalability and some other things.

> Paul, take a good walk, think about it and find a better argument this time.

My arguments are good. I suppose there is some 
language problem. I always suspect a language problem when 
I need to repeat things.

> Moreover, the construct
 
> $var = document('http://www.moreover.com/parameters-here');
> select $var/url
> 
> is not a valid XSLT construct at all Paul. 

Here comes the promised detailed explanation.

Yes, you are right. Instead of the readable pseudocode above I should actually 
write the verbose form:

<xsl:variable name="var" select="document('b.xml')"/>
<xsl:value-of select="$var/url" />
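
Wrapped in a complete stylesheet, the two forms can be compared side by side. This is only a minimal sketch: it assumes that b.xml sits next to the stylesheet and has <url> as its document element, and the root template exists only to give the instructions a place to live.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Form 1: document() used directly as the first step of a path -->
    <xsl:value-of select="document('b.xml')/url"/>

    <!-- Form 2: bind the loaded tree to a variable, then select from it -->
    <xsl:variable name="var" select="document('b.xml')"/>
    <xsl:value-of select="$var/url"/>
  </xsl:template>
</xsl:stylesheet>
```

Both instructions write the same string value. The only difference is that in the variable form the expression that produces the data is separated from the path that filters it, so the producer can be swapped out.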

That means it could easily be:

<xsl:variable name="var" select="extension:database('query')"/>
<xsl:value-of select="$var/url" />

Or:

<xsl:variable name="var" select="extension:tree-fragment('id')"/>
<xsl:value-of select="$var/url" />

*But*

That means that extension:tree-fragment('id') and 
extension:database('query')

should return *node-sets*  ( remember, node-set type-cast is *not* 
in the core of XSLT ).

That means such extensions are, by definition, *engine-specific*.

Once again. 

The document() function is a logical hack, not a reasonable 
extension.

Now, the way it should be with document(URI):
---------------------------------------------------------------------------

document(URI) ( like any other 'grabber-of-data' ) should return text, 
not a node-set, and the node-set typecast should be in the XSLT core.

1. This allows the user to import some XML document *without* converting 
it into a node-set ( unlike the way the current 'polluted' document() works ). 
BTW, could you achieve this functionality with XSLT? Just insert some 
external document into the output? Why should everything always be 
'converted' to node-sets, 'processed' and then turned 
back into text? I guess the only answer is 'because we see it this way'.
Good for you, but limiting.

2. If the user wants to achieve the current document() -> node-set conversion, 
she should use the node-set typecast explicitly.

3. Such a design makes it easy to write engine-independent extensions
( passing text to/from the extension, not node-sets ).
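
The explicit typecast of point 2 can be sketched as follows, assuming a processor that supports the EXSLT exsl:node-set() extension function ( namespace http://exslt.org/common ); other processors ship an equivalent function under their own namespace, which is exactly the portability problem. Note that this function casts a result tree fragment, not raw text, so it only approximates the text-to-node-set typecast proposed here. The variable name 'frag' and the <item> elements are made up for the illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common"
                exclude-result-prefixes="exsl">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- A result tree fragment: XSLT 1.0 does not allow path steps on it -->
    <xsl:variable name="frag">
      <item>first</item>
      <item>second</item>
    </xsl:variable>

    <!-- The explicit typecast: only after it can XPath select into the data -->
    <xsl:for-each select="exsl:node-set($frag)/item">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Without the exsl:node-set() call, the path $frag/item is an error in XSLT 1.0, because a result tree fragment is not a node-set.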

Isn't it obvious that the way taken by XSLT document() is perlish ?
Still not ?
 
> So, please, have more rigor in
> your demonstration at least. If one of my graduate students were as sloppy
> in their demonstration when I was teaching, they probably wouldn't have
> graduated. Please don't mix and match things. A bit of rigor please.

You are right. If I were your student I think I would not graduate. 
The nice thing is that I'm not your student, and probably this gives me a chance
to explain to you why document() is a logical hack and why it would have been better 
not to take the perl-ish way here. Of course, if you prefer 'send/recv + read/write +
sysread/syswrite' in the core, probably you will not understand me. 
 
> Paul said:
> I think :
> 
> document() is simply a logical hack  ( like most other solutions
> from XML world, the hack is nice and handy, of course ).
> 
> What it *really* is:
> 
> a. "get data"
> b. "convert data to nodeset".
> 
> (a) could be text file, identified by  URL, but it could also be any other
> source.
> 
> Actually, an XSLT transformation can already be invoked over a DOM. In this
> case what if I have *2* DOMs ?  Do I understand correctly that having
> multiple input files is 'legal', but having multiple DOMs is 'marginal' ?
> 
> Didier replies:
> Yes XSLT transformation can be invoked over the DOM. So far so good. What do
> you mean by "in this case if we have two DOMs" and how is this related to
> the actual demonstration? 

> Sorry, I do not understand where you are going.
> And yes you are right having multiple input file is "legal". We cannot have
> two DOMs, the DOM is only the interface, Probably you mean here two document
> trees - if that is the case, no, it is not marginal to have more than one
> document tree - but there is only one output tree (at least in accordance to
> the recommendations - you may have more than one output tree with
> proprietary extensions). You should give first the context on how the
> multiple DOMs are created. But using the document() function _may_ result in
> the construction of a DOM tree, this latter to be included in the already
> present DOM tree (the actual XML document being processed).

Right. I should be careful with every word because it could be used 
against me.

My question is: do you see any way to utilize the document()/part
hack when you have multiple document trees / sources, or can the document()/part
hack be utilized only when processing multiple XML files ( something 
identified by a URL )? The *real* semantics of document()/part is:

1. Give me some data from outer space.
2. Filter that data with some criteria.

The 'outer space' is considered to be an XML file and only an XML file, and the data 
must be *a node-set and only a node-set*. 

This is not scalable and very limiting. Not all the data in this world 
are XSLT vendor-specific node-sets. 

Why should an SQL engine bother itself with converting some table 
into a 'one-root-nodeset-alike' structure ( an XSLT vendor-specific node-set 
structure )?

It is XSLT paranoia to see everything in this world as 
XSLT-vendor-specific-node-sets. 

XSLT had better cure its paranoia by itself ( by placing the node-set
typecast into the core; but once that step is made, what is the purpose of 
the current document() semantics? None, other than the 
perlish 'send/recv and read/write' etc. ) 

I hope it is now clear.

> Paul said:
> (b) is not in the XSLT standard.  *that's* a problem.
> 
> Didier replies:
> Oh yea? thanks Paul I didn't know :-)))

You are welcome. Because I have an impression that you don't 
understand what that really means I have explained it once again 
in more detail. I'm sorry if that was disturbing.
 
> Here is an extract from the W3C recommendations:
> This section describes XSLT-specific additions to the core XPath function
> library. Some of these additional functions also make use of information
> specified by top-level elements in the stylesheet; this section also
> describes these elements.
> 
> 12.1 Multiple Source Documents
> Function: node-set document(object, node-set?)
> 
> The document function allows access to XML documents other than the main
> source document.

Yes. That's what I'm saying. 

'documents other than the main source document'. This assumes the world is 
full of XML documents and only XML documents (XML files) are to be processed.

That's just funny. That's what I was trying to explain.

> [....] and so on and so forth...
> 
> It is an XPath function. More precisely, an XPath expression which is XSLT
> specific. According to the XPath and XSLT recommendations, it is valid to
> use an XPath document() function as a step in an XPath expression. Tsss,
> tsss, here is your homework Paul. Think and answer to these questions:

Of course it is valid. It is just ugly. I'm not saying 'send/recv' are illegal in perl.
My point is that perl is a monster. And also my point is that with the document()
hack XSLT chose the perl-ish way.

> a) When an XPath engine processes an XPath expression what is the engine
> doing?
> b) How is the output of processing a particular step fed into the input of
> the next step? Do you know any analogy of this process in the Unix world?
> c) What are the XPath composition rules?

Don't understand your point. Maybe you will explain your idea instead 
of asking abstract questions? 

The core XPath engine type is the vendor-specific 'node-set', and there is no 
typecast to that type in the core. It's just funny that you don't see a problem here.
That *is* a problem. A *real*design*problem*, forcing logical hacks ( like the document()
hack ).

> > a) the notion of web service: you do a URL request and get back an answer
> > in XML. Do you find that a monster?
> 
> Paul said:
> No. When saying  'monster' I mean that instead of elegant and expandable
> core layer + some utility functions built on top of the layer, we are
> receiving 'handy hacks' in the core layer.
> 
> Didier replies:
> So, you find the actual XPath and XSLT recommendations to contain some
> monstrous parts and get "handy hacks" in the core layer. I cannot comment on
> this. This is a matter of taste and taste is indisputable.

So what is the point of your advice to read the specs etc.?

I said: "this 'document()/part' hack is ugly, and here is why".

You wrote a lot of stuff in return and finally you are saying: "I'm 
not discussing that because this is a matter of taste"????

> Paul said:
> <monster_example>

...

> </monster_example>
> 
> Perl is a monster. Tcl / Tk was not. ( Even though I don't like the Tcl language
> itself ).
> 
> Didier replies:
> Good point.

It is *so* strange to read this. I lost my mind.

I'm making *exactly* the same argument about the hacking nature of 
document()/part function in XSLT !

It is like bringing 'send/recv' into the core, saying that "for multiple input files 
we have document(), but for multiple tree fragments we have .... ( nothing yet ;-)"

And all that stuff gets into the core. This is perlish, and the document() hack 
is a plain example of 'send/recv' in the core!
 
> Paul said:
> I just said that XSLT's 'document()' function had better not be polluted in
> the way you are suggesting ;-) But if it is, I'll not cry either. 

Ah! This statement was unclear. Yes, it is already polluted. Too long to explain 
why I used the wording above, but the wording is definitely misleading. Sorry.

> I could live with or without that 'document' hack. I'm not using 'document' with
> XSLT at all. With PXSLServlet I'm bringing everything I need into one and
> only one and always one XML document - and then XSLT does its job == rendering.
> 
> Didier replies:
> Let's reset the clock to the same time here. Why are you saying that I am
> polluting the specs when I am simply using a totally valid XPath expression.

*You* are not polluting the specs. document() is already polluted.

> Furthermore, using it within the boundaries of the XSLT/XPath 1.0
> recommendations. Please Paul, take some time, read again both the XPath and
> XSLT recommendations, buy one or two good books on the subject (I can
> recommend a good one but I would be in a certain conflict of interests
> situation :-)) And think again about what you just said.

I suggest you think again about what I'm saying.

> Didier said:
> > b) the notion of content aggregation. If a posted document is an XML
> > document, a fragment of this document can be aggregated by an other
> > document using an XPath expression (i.e using the document function as
> > an XPath step).
> 
> Paul said:
> XSLT is about transformations of a single document ( XML tree ), but not
> about content management.
> 
> Didier replies:
> Who said that? you? Is aggregating document fragment doing content
> management or simply using content? we may not have the same definition of
> the word management. But please check if you have the same one as the others
> ;-)

Yes, I said that.  And I explained why I think placing more and more garbage
into XSLT core is not good, but placing scalable mechanisms is good.
 
> Paul said:
> document()  function is a logical hack . Content management
> ( aggregation, addressing, storage, updating, versioning etc. )
> is another problem domain. Trying to turn XSLT into silver bullet for
> *anything* is very understandable ( because to me - XSLT is in *much*
> better shape than some other things ), but such an attempt  already
> caused some absolutely useless features, bloating the engine.
> 
> Didier replies:
> Saying that the document function is a logical hack is again a matter of
> taste or grounded in a comparison against better ways to do things. I won't
> comment on that. Saying that content management is an other domain is right
> and I do not contradict that. And finally I won't comment on the last
> opinion about the silver bullet since I am a technologies agnostic and do
> not favor a technique over the other. I would simply say that there is a
> wave of investments and tools and I have to adapt to this wave and these
> tools as good or as bad as they may be. XML is not the most efficient language
> but at least we made tremendous progress by all agreeing on it. Imagine if
> all the people on earth would have simply agreed on a single alphabet. Even
> if the languages are different, we have at least agreed on a same alphabet.
> XSLT may be weird to learn at first, but at least it does a good job to
> transform a set of documents into different rendition languages. This is
> very useful when you have to provide some information or build an
> application for Cell phones and classical browsers.

Probably, transformation of a 'set of XML documents' is a big use case.
I don't think it is the only one.

> Paul said:
> I think it is obvious, that  having 'eval' and 'node-set' in the core
> is *much* closer to original XSLT purpose than  getting fancy
> ( 'non-standard' ) way of navigating multiple XML documents
> ( and their fragments ).
> 
> Didier replies:
> Paul, I am a patient man but please do some homework before talking with
> such assurance. Please...

What do you mean? So it is not obvious what I'm saying? Sorry, I'll try 
not to use the word 'obvious' anymore before explaining in detail what I mean. 
Or maybe you have some particular question about what I'm saying? 
Please ask the question then. 

I'll try to explain.

> Paul said:
> With current XSLT extensibility features, almost everything could
> be done with extensions.
> 
> There is actually no need for the beloved 'for' loop in XSLT,
> because the same functionality could be easily implemented
> with extension:range + extension:node-set
> 
> That means if one needs to grab some part of the
> separate document - this also could be done with
> 
> extension:give-me-part-of-the-document-or-some-data-from-database +
> extension:node-set.
> 
> Didier replies:
> True but not in a portable and standard way, is it?

What is 'portable' and what is 'standard' ?

Oh, yes - the current semantics of document() is 'standard'. 
'send/recv + read/write' in perl core are also 'standard'. 

There are some useless things in XSLT which are 'standard' ,
but at the same time XSLT lacks the core typecast to node-set.

And this all is 'standard' and 'portable'. So what ?

> Didier said:
> > Off course in the case of b) you can say that there are some serious
> > commercial problems.
> 
> Paul said:
> No. I'm talking only about the design. About balancing functionality
> between 'core' and 'layers'. XSLT is good in that balancing, but
> not perfect. I think there are some useless things in there, but 'eval'
> and 'node-set' are missing in the core ;-) For example.
> 
> Actually, that is all understandable. The idea was to make something
> for 'documents'. Unfortunately, the world is not a heap of plain XML files.
> Not at all, actually.
> 
> Didier replies:
> So on one hand you say that XSLT is well balanced and on the other hand you
> say that the document() function is a hack. Humm, I am getting a bit
> confused here Paul.

What in particular is confusing?

I'm saying that XSLT is in principle well balanced, compared to some 
other things, but some parts of XSLT are not. And document()/part is an example 
of a not well-balanced thing. It is better than nothing, of course, but it will 
cause some problems, similar to perl's 'send/recv + read/write' etc., when 
media other than plain XML files come into the game.

What is not clear ?

> The funny thing is that we like XSLT because of XPath part.
> XPath part is the only part of XSLT free of 'XML-mania'.
> XPath is good-old-UNIX-alike-command-line-grep-alike-beast.
> 
> Small beast which requires you to type those */[] things. Verrrrrry
> bad for 'end-user'.
> 
> Didier replies:
> But you are precisely arguing against a particular XPath construct. Again, I
> am confused.

document() is a hack. document() is not a part of XPath - it is
a logical hack. The rest of XPath is reasonable ( well ... not really ..., 
but that's another ( long ) story. )

> Conclusion:
> Please Paul, do me a favor, read again the XPath and XSLT specs before
> replying.

What particular part do you want me to read?

Rgds.Paul.




 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list