This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: [regex] Speed up somewhat the cases in bug-regex11.c
- From: Paolo Bonzini <paolo dot bonzini at lu dot unisi dot ch>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Ulrich Drepper <drepper at redhat dot com>, libc-alpha at sources dot redhat dot com
- Date: Tue, 30 Nov 2004 09:51:18 +0100
- Subject: Re: [regex] Speed up somewhat the cases in bug-regex11.c
- References: <BDAE8458.3E1%paolo.bonzini@lu.unisi.ch> <20041129222655.GA15147@sunsite.mff.cuni.cz>
Could you please have a look? I'm not sure what should happen when
bkref_idx is -1 (i.e. when search_cur_bkref_entry fails to find a bkref
index) and you are more familiar with this part of the code.
You just don't recurse: the old code used a while loop instead of
do/while, so it exited before dereferencing ent.
I checked whether specializing the loop for bkref_idx == 1 will give any
improvement, and it does not. An alternative possibility which I did
not try would be to add a dummy bkref_ent but I doubt it is worth.
I thought that this was impossible because the other places crashed
loudly when search_cur_bkref_entry failed. While I am slowly
understanding more of the algorithm, I am still far from being familiar
with it. :-(
I include a version of the patch without the indentation changes, and
the full one. Tested on the glibc and sed testsuites.
Paolo
2004-11-30 Paolo Bonzini <bonzini@gnu.org>
* posix/regexec.c (check_dst_limits_calc_pos_1): Check
for bkref_idx == -1, and don't recurse in that case.
--- regexec.c.old 2004-11-30 09:37:58.000000000 +0100
+++ regexec.c 2004-11-30 09:38:29.000000000 +0100
@@ -1891,49 +1891,49 @@ check_dst_limits_calc_pos_1 (mctx, bound
switch (dfa->nodes[node].type)
{
case OP_BACK_REF:
- {
- struct re_backref_cache_entry *ent = mctx->bkref_ents + bkref_idx;
- do
- {
- int dst, cpos;
+ if (bkref_idx != -1)
+ {
+ struct re_backref_cache_entry *ent = mctx->bkref_ents + bkref_idx;
+ do
+ {
+ int dst, cpos;
- if (ent->node != node)
- continue;
+ if (ent->node != node)
+ continue;
- if (subexp_idx <= 8 * sizeof (ent->eps_reachable_subexps_map)
- && (ent->eps_reachable_subexps_map
- & (1 << (subexp_idx - 1))) == 0)
- continue;
+ if (subexp_idx <= 8 * sizeof (ent->eps_reachable_subexps_map)
+ && (ent->eps_reachable_subexps_map
+ & (1 << (subexp_idx - 1))) == 0)
+ continue;
- /* Recurse trying to reach the OP_OPEN_SUBEXP and
- OP_CLOSE_SUBEXP cases below. But, if the
- destination node is the same node as the source
- node, don't recurse because it would cause an
- infinite loop: a regex that exhibits this behavior
- is ()\1*\1* */
- dst = dfa->edests[node].elems[0];
- if (dst == from_node)
- {
- if (boundaries & 1)
- return -1;
- else /* if (boundaries & 2) */
- return 0;
- }
+ /* Recurse trying to reach the OP_OPEN_SUBEXP and
+ OP_CLOSE_SUBEXP cases below. But, if the
+ destination node is the same node as the source
+ node, don't recurse because it would cause an
+ infinite loop: a regex that exhibits this behavior
+ is ()\1*\1* */
+ dst = dfa->edests[node].elems[0];
+ if (dst == from_node)
+ {
+ if (boundaries & 1)
+ return -1;
+ else /* if (boundaries & 2) */
+ return 0;
+ }
- cpos = check_dst_limits_calc_pos_1 (mctx, boundaries,
- subexp_idx, dst, bkref_idx);
-
- if (cpos == -1 /* && (boundaries & 1) */)
- return -1;
-
- if (cpos == 0 && (boundaries & 2))
- return 0;
+ cpos =
+ check_dst_limits_calc_pos_1 (mctx, boundaries, subexp_idx,
+ dst, bkref_idx);
+ if (cpos == -1 /* && (boundaries & 1) */)
+ return -1;
+ if (cpos == 0 && (boundaries & 2))
+ return 0;
- ent->eps_reachable_subexps_map &= ~(1 << (subexp_idx - 1));
- }
- while (ent++->more);
- break;
- }
+ ent->eps_reachable_subexps_map &= ~(1 << (subexp_idx - 1));
+ }
+ while (ent++->more);
+ }
+ break;
case OP_OPEN_SUBEXP:
if ((boundaries & 1) && subexp_idx == dfa->nodes[node].opr.idx)
--- regexec.c.old 2004-11-30 09:37:58.000000000 +0100
+++ regexec.c 2004-11-30 09:38:29.000000000 +0100
@@ -1891,6 +1891,7 @@ check_dst_limits_calc_pos_1 (mctx, bound
switch (dfa->nodes[node].type)
{
case OP_BACK_REF:
+ if (bkref_idx != -1)
{
struct re_backref_cache_entry *ent = mctx->bkref_ents + bkref_idx;
do
@@ -1932,8 +1933,8 @@ check_dst_limits_calc_pos_1 (mctx, bound
ent->eps_reachable_subexps_map &= ~(1 << (subexp_idx - 1));
}
while (ent++->more);
- break;
}
+ break;
case OP_OPEN_SUBEXP:
if ((boundaries & 1) && subexp_idx == dfa->nodes[node].opr.idx)