This is the mail archive of the xsl-list@mulberrytech.com mailing list.
XPath grammar questions
- From: Sean Russell <ser at germane-software dot com>
- To: xsl-list at lists dot mulberrytech dot com
- Date: Sun, 17 Mar 2002 09:03:35 -0800
- Subject: [xsl] XPath grammar questions
- Reply-to: xsl-list at lists dot mulberrytech dot com
Hello everyone,
Ya cain't do XSL w/o XPath, so I suppose this is as good a forum as any to ask
these questions. I'm subscribed to this list in digest form, so if you reply
only to the list, allow for lag in my responses.
I'm writing an XPath parser (and evaluator) in Ruby, for an XML parser called
REXML. Actually, I've written the XPath parser three times already; this
fourth time, I broke down and just implemented a lexer (more or less)
conforming to the XPath grammar. It works more or less properly, but I have
a couple of places where it breaks down, and if there are any XPath gurus who
can tell me how I'm misunderstanding the XPath spec, I'd appreciate the
feedback.
The first case is in a path submitted by Tobias Reif, that originated, as I
recall, from someone on this list:
*[* and not(*/node()) and not(*[not(@style)]) and not(*/@style != */@style)]
Specifically, it's the 'not(*/node())' that I'm having trouble with. The
XPath spec states that:
not( boolean ) -> boolean
This would imply that '*/node()' evaluates to a boolean. However, it also
states that paths such as:
ancestor::node()
evaluate to a set of matching nodes. Further, I had assumed that the path:
*/node()
by itself would also result in a set of nodes.
I have a group of theories about this, but I'm not quite grokking the intent
of XPath. I don't see how the same path should evaluate to two different
results. In any case, there have been a number of successful implementations
of XPath, so I know I'm missing something.
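One reading that would reconcile the two results, sketched here in Ruby: the
path always yields a node-set, and the function call converts that node-set to
a boolean as if by calling boolean() (XPath 1.0, sections 3.2 and 4.3). The
method names and the use of a plain Array to stand in for a node-set are
assumptions of this sketch, not anything taken from REXML.

```ruby
# Sketch of the XPath 1.0 boolean() conversion rules (spec section 4.3).
# A node-set is modelled as a plain Ruby Array; that is an assumption of
# this example only.
def xpath_boolean(value)
  case value
  when true, false
    value
  when Array                       # node-set: true if and only if non-empty
    !value.empty?
  when Numeric                     # number: false for +0, -0 and NaN
    f = value.to_f
    !(f.zero? || f.nan?)
  when String                      # string: true if and only if length > 0
    !value.empty?
  else
    raise ArgumentError, "not an XPath 1.0 type: #{value.class}"
  end
end

# not() converts its argument to a boolean first, then negates it.
def xpath_not(value)
  !xpath_boolean(value)
end

puts xpath_not([])               # empty node-set
puts xpath_not([:some_node])     # non-empty node-set
```

Under this reading, '*/node()' evaluates to a node-set wherever it appears, and
only the argument conversion inside not() turns it into true or false.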
The second (and at this point, more critical) problem I'm having is with
function names. Take:
[normalize-space(@name)='x']
If you follow the grammar, the evaluation is:
Predicate->Expr->OrExpr->AndExpr->EqualityExpr->RelationalExpr->
AdditiveExpr
at which point it matches the rule:
AdditiveExpr ::= AdditiveExpr '-' MultiplicativeExpr
where you effectively have "normalize" "-" "space(@name)='x'". What my code
does at this point is hang; 'normalize' gets caught in an endless, recursive
evaluation loop. The only way I can think of to solve this at this point is to
check for endless recursion. I don't want to do this because it doesn't
seem like I should have to... the grammar should be unambiguous. Again, I
suspect that my code is at fault, but when I run through the grammar by hand,
I get the same result. Rather, I suspect that I can avoid this particular
recursive loop by changing the order of the rule evaluation, but then I get
worse recursive loops in other paths. There doesn't seem to be an elegant
solution.
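Two things may be relevant here, sketched below in Ruby under stated
assumptions. First, XML names may contain '-', and the XPath 1.0 spec notes
(section 3.5) that foo-bar is a name while foo - bar is a subtraction, so a
tokenizer that always takes the longest match would hand the parser
'normalize-space' as a single token. Second, a left-recursive rule such as
AdditiveExpr cannot be called directly by a recursive-descent parser; the usual
workaround is to rewrite it as a loop. The names tokenize, parse_additive and
parse_operand are hypothetical, the NAME pattern is an ASCII-only
simplification of NCName, and the sketch ignores the rest of the grammar.

```ruby
require "strscan"

# Simplified, ASCII-only NCName: '-' and '.' are allowed after the first char.
NAME    = /[A-Za-z_][A-Za-z0-9_.\-]*/
NUMBER  = /\d+(?:\.\d*)?|\.\d+/
LITERAL = /'[^']*'|"[^"]*"/
PUNCT   = /[-+()\[\]@=*\/,]/

# Longest-match tokenizer for a tiny subset of XPath tokens.
def tokenize(src)
  scanner = StringScanner.new(src)
  tokens = []
  until scanner.eos?
    next if scanner.skip(/\s+/)
    if (text = scanner.scan(NAME))
      tokens << [:name, text]
    elsif (text = scanner.scan(NUMBER))
      tokens << [:number, text]
    elsif (text = scanner.scan(LITERAL))
      tokens << [:literal, text[1..-2]]   # drop the surrounding quotes
    elsif (text = scanner.scan(PUNCT))
      tokens << [:punct, text]
    else
      raise ArgumentError, "unexpected character at offset #{scanner.pos}"
    end
  end
  tokens
end

# AdditiveExpr rewritten without left recursion:
#   AdditiveExpr ::= Operand (('+' | '-') Operand)*
# Operand stands in for MultiplicativeExpr to keep the sketch short.
def parse_additive(tokens)
  left = parse_operand(tokens)
  while tokens.first && tokens.first[0] == :punct && %w[+ -].include?(tokens.first[1])
    op = tokens.shift[1]
    right = parse_operand(tokens)
    left = [op, left, right]              # left-associative tree
  end
  left
end

def parse_operand(tokens)
  kind, text = tokens.shift
  raise ArgumentError, "operand expected" unless %i[name number].include?(kind)
  text
end

p tokenize("normalize-space(@name)='x'")
p parse_additive(tokenize("a - b - c"))
```

The loop consumes one operand before it ever looks for an operator, so it
always makes progress, and it still builds the same left-associative tree the
original rule describes.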
Any help would be appreciated. If this isn't an appropriate topic for this
list, feel free to email me directly at:
ser@germane-software.com
Thanks!
-- SER
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list