This is the mail archive of the glibc-bugs-regex@sourceware.org mailing list for the glibc project.



[Bug regex/10290] New: using REG_ICASE can break ranges


A bracket range such as [C-a] compiles fine when regcomp() is called with only
the REG_EXTENDED flag, but when the REG_ICASE flag is added as well, regcomp()
fails with REG_ERANGE, which regerror() reports as "Invalid range end".

Testing other ranges with REG_ICASE reveals:
    [A-Z^-z] is invalid: Invalid range end (11)
    [A-Z^_`a-z] is ok
    [C-a] is invalid: Invalid range end (11)
    [C-f] is ok
    [_-a] is invalid: Invalid range end (11)
    [<-a] is ok
    [z-{] is ok
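
For anyone who wants to repeat the test, here is a minimal sketch of a reproducer. It assumes the default "C" locale. The helper names try_compile and report_ranges and the REPRO_MAIN macro are made up for this example and are not part of any library. The output depends on the glibc version in use, so none is shown here.

```c
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Compile `pattern` with `cflags` and return regcomp()'s error code
   (0 on success). If errbuf is non-NULL, the regerror() message for
   that code is copied into it. Hypothetical helper for this sketch. */
int try_compile(const char *pattern, int cflags, char *errbuf, size_t errlen)
{
    regex_t re;
    int rc = regcomp(&re, pattern, cflags);

    if (errbuf != NULL && errlen > 0)
        regerror(rc, &re, errbuf, errlen);
    if (rc == 0)
        regfree(&re);   /* only a successfully compiled regex is freed */
    return rc;
}

/* Try each range from the report with REG_EXTENDED | REG_ICASE and
   print one line per pattern in the same style as the list above. */
void report_ranges(void)
{
    const char *patterns[] = {
        "[A-Z^-z]", "[A-Z^_`a-z]", "[C-a]", "[C-f]",
        "[_-a]", "[<-a]", "[z-{]"
    };
    size_t n = sizeof patterns / sizeof patterns[0];

    for (size_t i = 0; i < n; i++) {
        char msg[128];
        int rc = try_compile(patterns[i], REG_EXTENDED | REG_ICASE,
                             msg, sizeof msg);
        if (rc == 0)
            printf("    %s is ok\n", patterns[i]);
        else
            printf("    %s is invalid: %s (%d)\n", patterns[i], msg, rc);
    }
}

#ifdef REPRO_MAIN   /* build with -DREPRO_MAIN to get a standalone program */
int main(void)
{
    report_ranges();
    return 0;
}
#endif
```

Building with something like "cc -DREPRO_MAIN repro.c" and running the result should print one line per pattern. Dropping REG_ICASE from the flags in report_ranges() gives the comparison case.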

It appears that regcomp() capitalizes the range endpoints when the REG_ICASE
flag is used, so [C-a] becomes [C-A], and since A comes before C, the range is
invalid. Likewise, in locales that match ASCII, ^ comes before z but after Z, so
[A-Z^-z] becomes invalid, and _ comes after A but before a, so [_-a] becomes
invalid.
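
The guess above can be checked against the test results with a small model. This is only a sketch of the hypothesis (uppercase both endpoints, then compare their character codes in an ASCII-compatible "C" locale). It is not glibc's actual code, and the names range_ok_after_upper, print_predictions and MODEL_MAIN are invented for illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of the suspected behaviour: fold both endpoints
   to uppercase, then require start <= end. Returns nonzero if the
   range would still be valid after folding. */
int range_ok_after_upper(unsigned char start, unsigned char end)
{
    return toupper(start) <= toupper(end);
}

/* Print the model's prediction for each range from the report. */
void print_predictions(void)
{
    const unsigned char ranges[][2] = {
        {'^', 'z'}, {'C', 'a'}, {'C', 'f'},
        {'_', 'a'}, {'<', 'a'}, {'z', '{'}
    };
    size_t n = sizeof ranges / sizeof ranges[0];

    for (size_t i = 0; i < n; i++) {
        unsigned char s = ranges[i][0], e = ranges[i][1];
        printf("    [%c-%c] folds to [%c-%c]: %s\n",
               s, e, toupper(s), toupper(e),
               range_ok_after_upper(s, e) ? "ok" : "invalid");
    }
}

#ifdef MODEL_MAIN   /* build with -DMODEL_MAIN to get a standalone program */
int main(void)
{
    print_predictions();
    return 0;
}
#endif
```

Under this model ^-z, C-a and _-a come out invalid while C-f, <-a and z-{ stay valid, which agrees with every result listed in the report. That fits the capitalization theory but does not prove it.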

If this is not considered a bug, then at the very least, the regex(3) man page
should note the side-effects of using REG_ICASE.

-- 
           Summary: using REG_ICASE can break ranges
           Product: glibc
           Version: 2.9
            Status: NEW
          Severity: normal
          Priority: P2
         Component: regex
        AssignedTo: drepper at redhat dot com
        ReportedBy: jbastian at redhat dot com
                CC: glibc-bugs-regex at sources dot redhat dot com,
                    glibc-bugs at sources dot redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=10290


