This is the mail archive of the glibc-bugs-regex@sourceware.org mailing list for the glibc project.



[Bug regex/1278] regex undefined behavior with shifting past word length


------- Additional Comments From eggert at gnu dot org  2005-09-02 23:17 -------
Andreas is right.  For example, "unsigned long int x = ~0u;" will not
have an all-1s value on most 64-bit hosts.
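
As a minimal sketch of the difference (the helper names below are
made up for illustration and are not from the regex sources):

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helpers for illustration only.  */

/* ~0u has type unsigned int, so its value is UINT_MAX.  Converting
   that to a wider unsigned type leaves the extra high bits zero.  */
static unsigned long int
from_tilde_0u (void)
{
  unsigned long int x = ~0u;
  return x;
}

/* -1 is converted modulo ULONG_MAX + 1, which always gives ULONG_MAX,
   that is, all 1s in the destination type.  */
static unsigned long int
from_minus_1 (void)
{
  unsigned long int x = -1;
  return x;
}

static void
check_unsigned_all_ones (void)
{
  assert (from_minus_1 () == ULONG_MAX);
  assert (from_tilde_0u () == UINT_MAX);
}
```

On a host where unsigned long int is wider than unsigned int, the two
results differ; where the two types have the same width they happen to
agree.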

In this particular hunk, ~0u would also work since the destination
type is unsigned short int.  So if you'd really rather use ~0u I
guess that would be OK.  However, as a style matter, it is confusing
to use ~0u in some unsigned contexts, while using -1 in other unsigned
contexts.  Since -1 always works, it's more consistent to use it in
all unsigned contexts.

For example, suppose someone later changes eps_reachable_subexps_map
from unsigned short int to unsigned long int, for performance reasons.
If the code used ~0u here, it would have to be changed to ~ (unsigned
long int) 0, and it's quite possible that people would forget to make
that change.  Whereas if we simply change it to -1 now, it will work
regardless of later changes like this.
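
A rough sketch of that scenario, using two made-up typedefs to stand
for the type before and after such a change (neither name appears in
the regex code):

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical typedefs: the current narrow type and a possible
   later, wider one.  */
typedef unsigned short int narrow_map_t;
typedef unsigned long int wide_map_t;

/* The same initializer, -1, gives all 1s for either type.  */
static narrow_map_t
narrow_all_ones (void)
{
  narrow_map_t m = -1;
  return m;
}

static wide_map_t
wide_all_ones (void)
{
  wide_map_t m = -1;
  return m;
}

/* A ~0u left over from the narrow version sets only as many bits
   as unsigned int has.  */
static wide_map_t
wide_stale_tilde_0u (void)
{
  wide_map_t m = ~0u;
  return m;
}

/* The spelling that would be needed after the change if the code
   stayed with the ~ style.  */
static wide_map_t
wide_with_cast (void)
{
  wide_map_t m = ~ (unsigned long int) 0;
  return m;
}

static void
check_maps (void)
{
  assert (narrow_all_ones () == USHRT_MAX);
  assert (wide_all_ones () == ULONG_MAX);
  assert (wide_with_cast () == ULONG_MAX);
  assert (wide_stale_tilde_0u () == UINT_MAX);
}
```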

I should mention that the situation is different in signed contexts.
In general one must use ~ (SIGNED_TYPE) 0 in that case to get an
all-1s pattern.  But signed bit-twiddling is trickier (since one must
in general worry about ~0 == 0 and overflow issues), and I'd rather
that the regex code stuck with unsigned bit-twiddling.
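
For comparison, a small sketch of the signed form, assuming a two's
complement host (the helper name is hypothetical):

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not part of the regex code.  */
static long int
signed_all_ones (void)
{
  /* Cast first, then complement, so every bit of the long int is set.  */
  return ~ (long int) 0;
}

static void
check_signed_all_ones (void)
{
  /* Assumes two's complement, where the all-1s pattern is the value -1.  */
  assert (signed_all_ones () == -1);
  /* Converting that -1 to the unsigned type gives ULONG_MAX.  */
  assert ((unsigned long int) signed_all_ones () == ULONG_MAX);
}
```

On a ones' complement host the same expression would instead be a
negative zero, which is the ~0 == 0 case mentioned above.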


-- 


http://sources.redhat.com/bugzilla/show_bug.cgi?id=1278


