This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Spelling checking templates (Was: RE: Re: attribute closest match)
- From: Dimitre Novatchev <dnovatchev at yahoo dot com>
- To: xsl-list at lists dot mulberrytech dot com
- Date: Mon, 12 Aug 2002 13:42:19 -0700 (PDT)
- Subject: [xsl] Spelling checking templates (Was: RE: Re: attribute closest match)
- Reply-to: xsl-list at lists dot mulberrytech dot com
--- "Matthew L. Avizinis" <mla at gleim dot com> wrote:
> I'm happy knowing that there are widely varying differences of
> opinion on this matter.
> So, Dimitre, how precise is precise? If I were to define closest
> match to be words that contain all of the letters with one
> transposition, e.g. hte for the, or the spelling is correct except
> for one letter, e.g. mofe for mode, would that, iyo, be precise
> enough?
> Of course a spelling checker might prevent many of these kinds of
> errors in data entry, but it would still be, I believe, an
> interesting exercise to be able to catch these kinds of errors if
> data was entered without a spellchecker abvailable (this would be
> another type of error I would include later because it seems like it
> would be more difficult to check for).
> Any more help, suggestions, (or even code)?
> thanks,
> > >
> > > Matthew L. Avizinis <mailto:mla@gleim.com>
> > > Gleim Publications, Inc.
> > > 4201 NW 95th Blvd.
> > > Gainesville, FL 32606
> > > (352)-375-0772
> > > www.gleim.com <http://www.gleim.com>
> >
> >
> > Can be done in XSLT, but first you have to define precisely
> > "_closest match_".
Hi Matthew,
This is quite straightforward to do using FXSL. Please find below the
code that solves your particular problem and that may be used as part
of a spelling checker implemented in XSLT.
Should I mention that I'm using FXSL here? :o)
Suppose you have the following source XML:
words2.xml:
----------
<elements>
  <element cana="1"/>
  <element cna="2"/>
  <element an="3"/>
  <element con="4"/>
  <element cbb="5"/>
</elements>
The result of applying the transformation presented below will be:
<elements>
  <element can="1" />
  <element can="2" />
  <element can="3" />
  <element can="4" />
  <element />
</elements>
As you can see, the deletion, replacement or addition of a single
character, as well as the transposition of two adjacent characters, is
corrected. Two replacements (as in "cbb") are not handled.
You may play with any other combinations of attribute names.
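To make the single-edit idea concrete, here is a minimal sketch in plain XSLT 1.0, with no FXSL and no extension functions, that lists only the deletion and transposition variants of 'can'. It is not the stylesheet that produced the result above; the template name "variants" and the "kind" attribute are hypothetical, chosen only for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- The input document is ignored; the word is hard-coded. -->
  <xsl:template match="/">
    <closewords>
      <xsl:call-template name="variants">
        <xsl:with-param name="pWord" select="'can'"/>
      </xsl:call-template>
    </closewords>
  </xsl:template>

  <!-- Hypothetical helper: visits positions 1..length by plain recursion. -->
  <xsl:template name="variants">
    <xsl:param name="pWord"/>
    <xsl:param name="pPos" select="1"/>
    <xsl:if test="$pPos &lt;= string-length($pWord)">
      <!-- Deletion: drop the character at $pPos. -->
      <word kind="deletion">
        <xsl:value-of select="concat(substring($pWord, 1, $pPos - 1),
                                     substring($pWord, $pPos + 1))"/>
      </word>
      <!-- Transposition: swap the characters at $pPos and $pPos + 1.
           Skipped at the last position, where there is no next character. -->
      <xsl:if test="$pPos &lt; string-length($pWord)">
        <word kind="transposition">
          <xsl:value-of select="concat(substring($pWord, 1, $pPos - 1),
                                       substring($pWord, $pPos + 1, 1),
                                       substring($pWord, $pPos, 1),
                                       substring($pWord, $pPos + 2))"/>
        </word>
      </xsl:if>
      <!-- Move on to the next position. -->
      <xsl:call-template name="variants">
        <xsl:with-param name="pWord" select="$pWord"/>
        <xsl:with-param name="pPos" select="$pPos + 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Run against any XML input, this should list an, acn, cn, cna and ca; "an" and "cna" are two of the attribute names in words2.xml. Replacement and insertion would need a second, inner loop over the alphabet at each position.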
Here's the transformation:
spelling.xsl:
------------
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:vendor="urn:schemas-microsoft-com:xslt"
  xmlns:delLetter="f:delLetter"
  xmlns:addLetter="f:addLetter"
  xmlns:addLetterSingle="f:addLetterSingle"
  xmlns:repLetter="f:repLetter"
  xmlns:repLetterSingle="f:repLetterSingle"
  xmlns:transPair="f:transPair"
  exclude-result-prefixes="vendor delLetter addLetter
    repLetter transPair repLetterSingle addLetterSingle"
>

  <xsl:import href="str-foldl.xsl"/>

  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Template references: each empty element is passed to str-foldl
       as pFunc and selects the template that matches it -->
  <delLetter:delLetter/>
  <addLetter:addLetter/>
  <repLetter:repLetter/>
  <transPair:transPair/>
  <repLetterSingle:repLetterSingle/>
  <addLetterSingle:addLetterSingle/>

  <!-- Alphabet used when replacing or inserting a letter -->
  <xsl:variable name="validChars"
    select="'abcdefghijklmnopqrstuvwxyz'"/>
  <!-- Build the words one edit away from 'can', then rename every
       attribute whose name is among them to 'can' -->
  <xsl:template match="/">
    <xsl:variable name="vrtfCloseWords">
      <xsl:call-template name="closeWords">
        <xsl:with-param name="pWord" select="'can'"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:variable name="vCloseWords"
      select="vendor:node-set($vrtfCloseWords)/*"/>

    <elements>
      <xsl:for-each select="/elements/element">
        <xsl:copy>
          <xsl:for-each select="@*[name()=$vCloseWords]">
            <xsl:attribute name="can">
              <xsl:value-of select="."/>
            </xsl:attribute>
          </xsl:for-each>
        </xsl:copy>
      </xsl:for-each>
    </elements>
  </xsl:template>
  <!-- All words reachable from pWord by one deletion, replacement,
       insertion or transposition -->
  <xsl:template name="closeWords">
    <xsl:param name="pWord"/>

    <xsl:call-template name="delLetterWords">
      <xsl:with-param name="pWord" select="$pWord"/>
    </xsl:call-template>

    <xsl:call-template name="repLetterWords">
      <xsl:with-param name="pWord" select="$pWord"/>
    </xsl:call-template>

    <xsl:call-template name="addLetterWords">
      <xsl:with-param name="pWord" select="$pWord"/>
    </xsl:call-template>

    <xsl:call-template name="transPairWords">
      <xsl:with-param name="pWord" select="$pWord"/>
    </xsl:call-template>
  </xsl:template>
  <!-- Words obtained from pWord by swapping two adjacent characters -->
  <xsl:template name="transPairWords">
    <xsl:param name="pWord"/>

    <xsl:variable name="vftransPair"
      select="document('')/*/transPair:*[1]"/>

    <xsl:variable name="vrtf-accum">
      <accum>
        <position>1</position>
        <word><xsl:value-of select="$pWord"/></word>
        <closewords></closewords>
      </accum>
    </xsl:variable>

    <xsl:variable name="vaccum"
      select="vendor:node-set($vrtf-accum)/*"/>

    <xsl:variable name="vrtfResults">
      <xsl:call-template name="str-foldl">
        <xsl:with-param name="pFunc" select="$vftransPair"/>
        <xsl:with-param name="pA0" select="$vaccum"/>
        <xsl:with-param name="pStr" select="$pWord"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:copy-of select="vendor:node-set($vrtfResults)/closewords/*"/>
  </xsl:template>
  <!-- Fold step: swap the current character ($arg2) with the one after
       it. At the last position there is nothing to swap with, so the
       unchanged word is emitted. -->
  <xsl:template match="transPair:*">
    <xsl:param name="arg1" select="/.."/> <!-- A0 -->
    <xsl:param name="arg2"/>

    <xsl:variable name="vPos" select="$arg1/position"/>
    <xsl:variable name="vWord" select="$arg1/word"/>
    <xsl:variable name="vCloseWords" select="$arg1/closewords"/>

    <xsl:variable name="vNewWord"
      select="concat(substring($vWord, 1, $vPos - 1),
                     substring($vWord, $vPos + 1, 1),
                     $arg2,
                     substring($vWord, $vPos + 2)
                    )"/>

    <position><xsl:value-of select="$vPos + 1"/></position>
    <word><xsl:value-of select="$vWord"/></word>
    <closewords>
      <xsl:copy-of select="$vCloseWords/*"/>
      <word><xsl:value-of select="$vNewWord"/></word>
    </closewords>
  </xsl:template>
  <!-- Words obtained from pWord by deleting one character -->
  <xsl:template name="delLetterWords">
    <xsl:param name="pWord"/>

    <xsl:variable name="vfDelLetter"
      select="document('')/*/delLetter:*[1]"/>

    <xsl:variable name="vrtf-accum">
      <accum>
        <position>1</position>
        <word><xsl:value-of select="$pWord"/></word>
        <closewords></closewords>
      </accum>
    </xsl:variable>

    <xsl:variable name="vaccum"
      select="vendor:node-set($vrtf-accum)/*"/>

    <xsl:variable name="vrtfResults">
      <xsl:call-template name="str-foldl">
        <xsl:with-param name="pFunc" select="$vfDelLetter"/>
        <xsl:with-param name="pA0" select="$vaccum"/>
        <xsl:with-param name="pStr" select="$pWord"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:copy-of select="vendor:node-set($vrtfResults)/closewords/*"/>
  </xsl:template>
  <!-- Words obtained from pWord by replacing one character -->
  <xsl:template name="repLetterWords">
    <xsl:param name="pWord"/>

    <xsl:variable name="vfRepLetter"
      select="document('')/*/repLetter:*[1]"/>

    <xsl:variable name="vrtf-accum">
      <accum>
        <position>1</position>
        <word><xsl:value-of select="$pWord"/></word>
        <closewords></closewords>
      </accum>
    </xsl:variable>

    <xsl:variable name="vaccum"
      select="vendor:node-set($vrtf-accum)/*"/>

    <xsl:variable name="vrtfResults">
      <xsl:call-template name="str-foldl">
        <xsl:with-param name="pFunc" select="$vfRepLetter"/>
        <xsl:with-param name="pA0" select="$vaccum"/>
        <xsl:with-param name="pStr" select="$pWord"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:copy-of select="vendor:node-set($vrtfResults)/closewords/*"/>
  </xsl:template>
  <!-- Words obtained from pWord by inserting one character.
       pStr is padded with one extra character so that the fold also
       visits the position after the last letter. The stored word is
       left unpadded, so that no candidate ends in a stray space. -->
  <xsl:template name="addLetterWords">
    <xsl:param name="pWord"/>

    <xsl:variable name="vfaddLetter"
      select="document('')/*/addLetter:*[1]"/>

    <xsl:variable name="vrtf-accum">
      <accum>
        <position>1</position>
        <word><xsl:value-of select="$pWord"/></word>
        <closewords></closewords>
      </accum>
    </xsl:variable>

    <xsl:variable name="vaccum"
      select="vendor:node-set($vrtf-accum)/*"/>

    <xsl:variable name="vrtfResults">
      <xsl:call-template name="str-foldl">
        <xsl:with-param name="pFunc" select="$vfaddLetter"/>
        <xsl:with-param name="pA0" select="$vaccum"/>
        <xsl:with-param name="pStr" select="concat($pWord, ' ')"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:copy-of select="vendor:node-set($vrtfResults)/closewords/*"/>
  </xsl:template>
  <!-- Fold step: emit the word with the character at the current
       position removed, then advance the position -->
  <xsl:template match="delLetter:*">
    <xsl:param name="arg1" select="/.."/> <!-- A0 -->
    <xsl:param name="arg2"/>

    <xsl:variable name="vPos" select="$arg1/position"/>
    <xsl:variable name="vWord" select="$arg1/word"/>
    <xsl:variable name="vCloseWords" select="$arg1/closewords"/>

    <xsl:variable name="vNewWord"
      select="concat(substring($vWord, 1, $vPos - 1),
                     substring($vWord, $vPos + 1)
                    )"/>

    <position><xsl:value-of select="$vPos + 1"/></position>
    <word><xsl:value-of select="$vWord"/></word>
    <closewords>
      <xsl:copy-of select="$vCloseWords/*"/>
      <word><xsl:value-of select="$vNewWord"/></word>
    </closewords>
  </xsl:template>
  <!-- Outer fold step: at the current position, run an inner fold over
       $validChars that inserts each letter in turn, then advance the
       position -->
  <xsl:template match="addLetter:*">
    <xsl:param name="arg1" select="/.."/> <!-- A0 -->
    <xsl:param name="arg2"/>

    <xsl:variable name="vPos" select="$arg1/position"/>

    <xsl:variable name="vfaddLetter"
      select="document('')/*/addLetterSingle:*[1]"/>

    <xsl:variable name="vrtfResults">
      <xsl:call-template name="str-foldl">
        <xsl:with-param name="pFunc" select="$vfaddLetter"/>
        <xsl:with-param name="pA0" select="$arg1"/>
        <xsl:with-param name="pStr" select="$validChars"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:variable name="vResults"
      select="vendor:node-set($vrtfResults)/*"/>

    <position><xsl:value-of select="$vPos + 1"/></position>
    <xsl:copy-of select="$vResults[not(self::position)]"/>
  </xsl:template>
  <!-- Inner fold step: insert the letter $arg2 before the character at
       the current position; the position itself is left unchanged -->
  <xsl:template match="addLetterSingle:*">
    <xsl:param name="arg1" select="/.."/> <!-- A0 -->
    <xsl:param name="arg2"/>

    <xsl:variable name="vPos" select="$arg1/position"/>
    <xsl:variable name="vWord" select="$arg1/word"/>
    <xsl:variable name="vCloseWords" select="$arg1/closewords"/>

    <xsl:variable name="vNewWord"
      select="concat(substring($vWord, 1, $vPos - 1),
                     $arg2,
                     substring($vWord, $vPos)
                    )"/>

    <position><xsl:value-of select="$vPos"/></position>
    <word><xsl:value-of select="normalize-space($vWord)"/></word>
    <closewords>
      <xsl:copy-of select="$vCloseWords/*"/>
      <word><xsl:value-of select="$vNewWord"/></word>
    </closewords>
  </xsl:template>
  <!-- Outer fold step: at the current position, run an inner fold that
       tries every valid letter except the current one ($arg2), then
       advance the position -->
  <xsl:template match="repLetter:*">
    <xsl:param name="arg1" select="/.."/> <!-- A0 -->
    <xsl:param name="arg2"/>

    <xsl:variable name="vPos" select="$arg1/position"/>

    <xsl:variable name="vfrepLetter"
      select="document('')/*/repLetterSingle:*[1]"/>

    <xsl:variable name="vrtfResults">
      <xsl:call-template name="str-foldl">
        <xsl:with-param name="pFunc" select="$vfrepLetter"/>
        <xsl:with-param name="pA0" select="$arg1"/>
        <xsl:with-param name="pStr"
          select="translate($validChars, $arg2, '')"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:variable name="vResults"
      select="vendor:node-set($vrtfResults)/*"/>

    <position><xsl:value-of select="$vPos + 1"/></position>
    <xsl:copy-of select="$vResults[not(self::position)]"/>
  </xsl:template>
  <!-- Inner fold step: replace the character at the current position
       with the letter $arg2; the position itself is left unchanged -->
  <xsl:template match="repLetterSingle:*">
    <xsl:param name="arg1" select="/.."/> <!-- A0 -->
    <xsl:param name="arg2"/>

    <xsl:variable name="vPos" select="$arg1/position"/>
    <xsl:variable name="vWord" select="$arg1/word"/>
    <xsl:variable name="vCloseWords" select="$arg1/closewords"/>

    <xsl:variable name="vNewWord"
      select="concat(substring($vWord, 1, $vPos - 1),
                     $arg2,
                     substring($vWord, $vPos + 1)
                    )"/>

    <position><xsl:value-of select="$vPos"/></position>
    <word><xsl:value-of select="$vWord"/></word>
    <closewords>
      <xsl:copy-of select="$vCloseWords/*"/>
      <word><xsl:value-of select="$vNewWord"/></word>
    </closewords>
  </xsl:template>
</xsl:stylesheet>
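The empty top-level elements such as <delLetter:delLetter/> serve as template references: a node in its own namespace is passed around as pFunc, and applying templates to that node invokes whichever template matches it. Here is a minimal sketch of that dispatch idea on its own, without FXSL. The names "double", "square" and "applyTo7" and their namespace URIs are made up for illustration and are not part of FXSL.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:double="f:double"
    xmlns:square="f:square"
    exclude-result-prefixes="double square">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Marker elements: each one stands for the template that matches it. -->
  <double:double/>
  <square:square/>

  <xsl:template match="/">
    <results>
      <!-- document('') re-reads this stylesheet to fetch the marker nodes. -->
      <xsl:call-template name="applyTo7">
        <xsl:with-param name="pFunc" select="document('')/*/double:*[1]"/>
      </xsl:call-template>
      <xsl:call-template name="applyTo7">
        <xsl:with-param name="pFunc" select="document('')/*/square:*[1]"/>
      </xsl:call-template>
    </results>
  </xsl:template>

  <!-- Hypothetical higher-order template: it does not know which function
       it received; it simply applies templates to the marker node. -->
  <xsl:template name="applyTo7">
    <xsl:param name="pFunc" select="/.."/>
    <result>
      <xsl:apply-templates select="$pFunc">
        <xsl:with-param name="arg1" select="7"/>
      </xsl:apply-templates>
    </result>
  </xsl:template>

  <xsl:template match="double:*">
    <xsl:param name="arg1"/>
    <xsl:value-of select="2 * $arg1"/>
  </xsl:template>

  <xsl:template match="square:*">
    <xsl:param name="arg1"/>
    <xsl:value-of select="$arg1 * $arg1"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to any XML input, this should produce a results element holding 14 and 49. Because document('') reloads the stylesheet, the sketch assumes the stylesheet is read from a resolvable location and not built in memory.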
Hope this helped.
=====
Cheers,
Dimitre Novatchev.
http://fxsl.sourceforge.net/ -- the home of FXSL
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list