This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: [PATCH] Testcase for another regex bug


Hi,

Subject: [PATCH] Testcase for another regex bug 
From: Jakub Jelinek <jakub@redhat.com>

> > Jack Howarth wrote:
> > > Ulrich,
> > >    While the new patch fixes bug-regex11.c, it doesn't seem to fix
> > > the make check failure in sed 3.02 when built with..
> > 
> > Then come up with a new test case.
> 
> Here it is. tests[2] fails to match.

I'm sorry for being late...
I prepared a patch to fix the problem.

The problem was caused by some defects in the evaluation of back
references, and I needed to restructure that evaluation.
(I'm sorry for such a big patch.)

However, the patch might not be sufficient, since dc.sed of sed-3.02
seems to fall into an infinite loop.
On the other hand, that might not be a regex bug, because the infinite
loop seems to be in the dc.sed script rather than in regex, and the
behavior of regex has changed a bit from the old regex.
  (e.g. Executing the following command returns the result "NG".
   $ echo aa | sed 's/\(a\+\)\{3\}/NG/'
   However, "\(a\+\)\{3\}" shouldn't match "aa", should it?
   We fixed some of these dubious behaviors, and that might keep
   dc.sed from behaving as desired.)
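
The expectation in the parenthesis above can be checked directly against
the POSIX regex interface. What follows is only a minimal sketch, not
part of the patch: ere_matches is a hypothetical helper, and it uses the
extended-syntax pattern "(a+){3}" as an assumed equivalent of the basic
pattern "\(a\+\)\{3\}", since "\+" in basic syntax is a GNU extension.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of glibc or sed): return 1 if TEXT
   contains a match for the POSIX extended regex PATTERN, 0 if it does
   not, and -1 if PATTERN fails to compile.  */
static int
ere_matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  /* REG_NOSUB: only the yes/no answer is needed, not the registers.  */
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;

  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0 ? 1 : 0;
}
```

Three repetitions of "a+" need at least three characters, so
ere_matches ("(a+){3}", "aa") should return 0, while the same pattern
against "aaa" should return 1.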

If you see other problems, please let me know.


2002-09-27  Isamu Hasegawa  <isamu@yamato.ibm.com>

	* posix/regcomp.c (reg_free): Free the debug area.
	(re_compile_internal): Allocate debug area for the input string.
	(create_initial_state): Check whether the back references in the
	initial states really match the null string in the initial state.
	(parse_reg_exp): Mark that the dfa can have plural matchings.
	(parse_expression): Likewise.
	(parse_bracket_exp): Likewise.
	* posix/regex_internal.c (re_node_set_intersect): Remove unused
	function.
	(re_node_set_contains): Change to return the index of node.
	* posix/regex_internal.h (re_backref_cache_entry): Change the members.
	(re_match_context_t): Likewise.
	(struct re_dfa_t): Likewise.
	(re_sift_context_t): New structure.
	* posix/regexec.c (match_ctx_clear_flag): New function.
	(sift_ctx_init): Likewise.
	(update_cur_sifted_state): Likewise.
	(add_epsilon_src_nodes): Likewise.
	(sub_epsilon_src_nodes): Likewise.
	(check_subexp_limits): Likewise.
	(search_subexp): Likewise.
	(sift_states_bkref): Likewise.
	(merge_state_array): Likewise.
	(sift_states_iter_bkref): Remove unused function.
	(add_epsilon_backreference): Remove unused function.
	(re_search_internal): Adapt to the new members and interfaces.
	(check_matching): Check whether the back references in the initial
	states really match the null string in the initial state.
	(proceed_next_node): Change the evaluation of back references,
	since we have real registers here.
	(set_regs): Adapt to the new interface of proceed_next_node.
	(sift_states_backward): Add invocation of update_cur_sifted_state
	instead of add_epsilon_backreference.
	Add a sentinel to the outermost while loop.
	Move the handling of back references to the sift_states_bkref
	function, since we can't handle some back references (e.g. a back
	reference which matches the null string) here.
	(transit_state_mb): Handle the new member max_mb_elem_len.
	(transit_state_bkref_loop): Move the evaluation of back references to
	the search_subexp function, since we can't evaluate some back
	references (e.g. a back reference which can have plural matchings)
	here.
	(match_ctx_init): Adapt to the new member.
	(match_ctx_add_entry): Adapt to the new members.
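
The "back reference which matches the null string" case named in the
entries above can be seen from the public interface. This is a sketch
for illustration only and does not reflect the internals touched by the
patch: bre_group1 is a hypothetical helper, and the pattern
"\(a*\)b\1" is an assumed example in which group 1 may be empty.

```c
#include <regex.h>

/* Hypothetical helper (not part of glibc): match the POSIX basic regex
   PATTERN against TEXT.  On a match, store the start and end offsets of
   subexpression 1 in *SO and *EO and return 1.  Return 0 when there is
   no match and -1 when PATTERN fails to compile.  */
static int
bre_group1 (const char *pattern, const char *text, int *so, int *eo)
{
  regex_t re;
  regmatch_t regs[2];	/* regs[0]: whole match, regs[1]: group 1.  */
  int rc;

  if (regcomp (&re, pattern, 0) != 0)
    return -1;

  rc = regexec (&re, text, 2, regs, 0);
  regfree (&re);
  if (rc != 0)
    return 0;

  *so = (int) regs[1].rm_so;
  *eo = (int) regs[1].rm_eo;
  return 1;
}
```

With "\(a*\)b\1", the text "b" should match with group 1 empty (both
offsets 0), and the text "aabaa" should match with group 1 covering
offsets 0 to 2.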

Thanks,
-- 
Isamu Hasegawa
IBM Japan, Ltd.

Attachment: patch.020927.gz
Description: Binary data

