This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] regex performance improvement


Hi,

Attached patch is a part of regex performance improvement patch.
(It is against the current cvs version with the Jakub's patch in
http://sources.redhat.com/ml/libc-hacker/2002-11/msg00052.html)

Originally, I intended to submit my patch after the patch will be complete.
However, it is inefficient to adapt my development version to the cvs
each time.  Then I'd like to submit a patch bit by bit when it is ready.

This patch partly fixes the performance problem.
For example, in my environment "dc.sed test" in sed-3.02 become about
5 times faster:
 dc.sed(without this patch): about 1min40sec
 dc.sed(with this patch)   : about 20sec

Tough it might contain some errorneous, and I might re-submit refined
version.  However, I'd like to send it as a preview at present.

2002-11-21  Isamu Hasegawa  <isamu@yamato.ibm.com>

	* posix/regex_internal.h (state_array_t): New structure.
	(re_sub_match_last_t): Likewise.
	(re_sub_match_top_t): Likewise.
	(re_backref_cache_entry): Add new members.
	(re_match_context_t): Likewise.
	* posix/regexec.c (re_search_internal): Invoke prune_impossible_nodes
	to check the matching is really correct.
	Move the routin pruning the impossible nodes from here, ...
	(prune_impossible_nodes): To this function.
	(check_matching): Invoke check_subexp_matching_top, and replace
	redundant checking to transit_state_bkref invocation.
	(update_cur_sifted_state): Remove search_subexp invocation.
	(search_subexp): Remove this function.
	(check_dst_limits_calc_pos): Use search_cur_bkref_entry for
	optimization.
	(sift_states_bkref): Likewise.
	(transit_state): Invoke check_subexp_matching_top.
	(check_subexp_matching_top): New function.
	(transit_state_bkref): Remove unused array.
	(transit_state_bkref_loop): Use get_subexp instead of
	sift_states_backward.  Use search_cur_bkref_entry for optimization.
	(get_subexp): New function.
	(get_subexp_sub): Likewise.
	(find_subexp_node): Likewise.
	(check_arrival): Likewise.
	(expand_eclosures): Likewise.
	(expand_eclosures_sub): Likewise.
	(expand_bkref_cache): Likewise.
	(match_ctx_init): Initialize new members.
	(match_ctx_free): Release new members.
	(match_ctx_add_entry): Check duplication.
	(search_cur_bkref_entry): New function.
	(match_ctx_add_subtop): Likewise.
	(match_ctx_add_sublast): Likewise.

Thanks,
-- 
Isamu Hasegawa
IBM Japan, Ltd.

Attachment: patch.021121.gz
Description: Binary data


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]