This is the mail archive of the xsl-list@mulberrytech.com mailing list.



Re: Regular expression functions (Was: Re: comments on December F&O draft)


Hi Marc,

> you mean: the *[index] is throwing all named subregexes into one array
> and getting the second regardless of its name, right?

Yes.

> getting an actual parenthesis group out of a named subregex would be
> different, no?

I don't think it has to be, if you use elements with some standard
name to represent them...

Say you had:

<regex name="fancy-number">[0-9]+(\.[0-9]+)?([Ee][+-][0-9]+)?</regex>
<matcher name="two-numbers" regexp=":fancy-number:\s:fancy-number:">
...
</matcher>

And you were matching the string:

  "12.5 3.4E-2"

I was imagining that a tree would be built that looked like this
(formatted for clarity; the only whitespace would actually be a
single space between the two fancy-number elements):

  <fancy-number>
    12
    <rxp:match>.5</rxp:match>
  </fancy-number>
  <fancy-number>
    3
    <rxp:match>.4</rxp:match>
    <rxp:match>E-2</rxp:match>
  </fancy-number>

where the rxp prefix is bound to some namespace such as (for XPath, anyway):

  http://www.w3.org/2002/XPath/RegExp

So the values of the nodes selected by the following paths would be:

  /                        =>  ("12.5 3.4E-2")
  /fancy-number            =>  ("12.5", "3.4E-2")
  /fancy-number[1]         =>  ("12.5")
  /fancy-number[1]/node()  =>  ("12", ".5")
  /fancy-number[1]/text()  =>  ("12")
  /fancy-number[1]/*[1]    =>  (".5")
  /fancy-number[1]/*[2]    =>  ()
  /fancy-number[2]         =>  ("3.4E-2")
  /fancy-number[2]/*       =>  (".4", "E-2")
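
To make the mapping from match to tree concrete, here is a rough
sketch in Python (not XPath or XSLT, since no such built-in function
exists). The helper names build_tree, set_text and value are made up
for illustration, and as a simplification the sketch finds each
fancy-number with a plain regex search rather than going through the
two-numbers matcher:

```python
import re
import xml.etree.ElementTree as ET

# Namespace suggested above for unnamed parenthesised groups.
RXP_NS = "http://www.w3.org/2002/XPath/RegExp"
FANCY_NUMBER = re.compile(r"[0-9]+(\.[0-9]+)?([Ee][+-][0-9]+)?")


def set_text(parent, prev, chunk):
    """Attach chunk as parent's leading text or as the tail of prev."""
    if not chunk:
        return
    if prev is None:
        parent.text = (parent.text or "") + chunk
    else:
        prev.tail = (prev.tail or "") + chunk


def build_tree(text):
    """Hypothetical helper: one <fancy-number> per match, under a root."""
    root = ET.Element("root")  # stands in for the document node "/"
    prev_num, pos = None, 0
    for m in FANCY_NUMBER.finditer(text):
        set_text(root, prev_num, text[pos:m.start()])  # text between matches
        num = ET.SubElement(root, "fancy-number")
        prev_grp, cursor = None, m.start()
        for i in range(1, FANCY_NUMBER.groups + 1):
            if m.group(i) is None:  # optional group did not participate
                continue
            set_text(num, prev_grp, text[cursor:m.start(i)])
            prev_grp = ET.SubElement(num, f"{{{RXP_NS}}}match")
            prev_grp.text = m.group(i)
            cursor = m.end(i)
        set_text(num, prev_grp, text[cursor:m.end()])
        prev_num, pos = num, m.end()
    set_text(root, prev_num, text[pos:])
    return root


def value(node):
    """String value of a node: the concatenation of its text descendants."""
    return "".join(node.itertext())


root = build_tree("12.5 3.4E-2")
numbers = root.findall("fancy-number")
print(value(root))                     # corresponds to /
print([value(n) for n in numbers])     # corresponds to /fancy-number
print(numbers[0].text)                 # corresponds to /fancy-number[1]/text()
print([value(c) for c in numbers[1]])  # corresponds to /fancy-number[2]/*
```

The four print calls should reproduce the corresponding rows of the
list above: the whole string, the two numbers, "12", and then ".4" and
"E-2".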


If a named subexpression refers to other named subexpressions, that
just changes the names of the elements created: each referenced
subexpression gets an element with its own name rather than
rxp:match. So if you had:

<regex name="mantissa">[0-9]+(\.[0-9]+)?</regex>
<regex name="exponent">[Ee][+-][0-9]+</regex>
<regex name="fancy-number">:mantissa::exponent:?</regex>
<matcher name="two-numbers" regexp=":fancy-number:\s:fancy-number:">
...
</matcher>

Matching the same string would give you a tree like:

  <fancy-number>
    <mantissa>12<rxp:match>.5</rxp:match></mantissa>
  </fancy-number>
  <fancy-number>
    <mantissa>3<rxp:match>.4</rxp:match></mantissa>
    <exponent>E-2</exponent>
  </fancy-number>
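
The nesting rule can be sketched in Python along the following lines.
This is only an illustration under some assumptions: the :mantissa:
and :exponent: references are expanded by hand into Python named
groups, the helpers fill, match_to_element and fancy_numbers are
hypothetical names, and zero-width groups are not handled. Named
groups become elements with that name; unnamed groups become
rxp:match elements, nested according to where they fall in the match:

```python
import re
import xml.etree.ElementTree as ET

RXP_NS = "http://www.w3.org/2002/XPath/RegExp"
RXP_MATCH = f"{{{RXP_NS}}}match"

# Hand-expanded :mantissa: and :exponent: references (an assumption about
# how a processor might resolve them), written as Python named groups.
MANTISSA = r"(?P<mantissa>[0-9]+(\.[0-9]+)?)"
EXPONENT = r"(?P<exponent>[Ee][+-][0-9]+)"
FANCY_NUMBER = re.compile(MANTISSA + EXPONENT + "?")


def fill(parent, text, start, end, spans):
    """Populate parent with text[start:end], nesting spans by containment."""
    cursor, prev, i = start, None, 0
    while i < len(spans):
        s, e, tag = spans[i]
        chunk = text[cursor:s]
        if chunk and prev is None:
            parent.text = chunk
        elif chunk:
            prev.tail = chunk
        child = ET.SubElement(parent, tag)
        j = i + 1  # later spans ending within [s, e] are descendants of child
        while j < len(spans) and spans[j][1] <= e:
            j += 1
        fill(child, text, s, e, spans[i + 1:j])
        cursor, prev, i = e, child, j
    chunk = text[cursor:end]
    if chunk and prev is None:
        parent.text = chunk
    elif chunk:
        prev.tail = chunk


def match_to_element(name, m):
    """Turn one regex match into an element called name."""
    names = {index: group for group, index in m.re.groupindex.items()}
    spans = [(m.start(i), m.end(i), names.get(i, RXP_MATCH))
             for i in range(1, m.re.groups + 1) if m.group(i) is not None]
    element = ET.Element(name)
    fill(element, m.string, m.start(), m.end(), spans)
    return element


def fancy_numbers(text):
    """One <fancy-number> element per number found in text."""
    return [match_to_element("fancy-number", m)
            for m in FANCY_NUMBER.finditer(text)]
```

Calling fancy_numbers("12.5 3.4E-2") should give two elements with
the same shape as the tree above: a mantissa holding "12" plus an
rxp:match for ".5", then a mantissa holding "3" plus an rxp:match for
".4", followed by an exponent holding "E-2".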

I should note that nothing currently in XPath or XSLT automatically
creates a tree in this way. However, several EXSLT functions do (as a
means of returning 'sequences', in fact!). I suspect that the
introduction of user-defined functions in XSLT will lead to more
functions that do this, but I don't know whether people would feel it
was acceptable for a built-in function.
  
Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

