[PATCH] D28384: Update update_test_checks to work properly with phi nodes and other fun things.

Fri Jan 6 10:38:26 PST 2017

bryant added inline comments.

================
Comment at: utils/update_test_checks.py:176
+      if match.group(1) in vars_seen:
+        line = line.replace('%' + match.group(1), get_value_use(match.group(1)), 1)
+        continue
----------------
bryant wrote:
> bryant wrote:
> > You're limiting the number of replaced uses to 1. Can a line have more than one use?
> Just to clarify, what I meant is that you can do a lot better than manually replacing one at a time:
> 
> ```
> def genericize_check_lines(lines):
>     def ssaify(match):
>         v = match.group(1)
>         if v in var_seen:
>             rv = get_value_use(v)
>         else:
>             vars_seen.add('%' + v)
>             rv = get_value_definition(v)
>         return rv
>     for line in lines:
>         scrubbed_line = ...
>         new_ine = IRVALUE_RE.sub(ssaify, scrubbed_line)
>     ...
> ```
> 
> I suspect that underneath, `_sre` is building the replacement string incrementally, so it won't be quadratic.
> Yes, but you want to replace one at a time anyway.
> 
> 
> because it could be a use of "bb", and the line could be phi [a, bb], [b, bb17], [c, bb].
> 
> If you replace  all uses of bb, you'll mistakenly replace bb17.
>
> We can do better, but not in this way (see my other email, this will generate wrong answers) , and it's IMHO not worth it.
> 
> You can avoid doing the replacement directly by using the match start and end, but if you do, we either have to mess with the loop quite a bit.
> Even if we stop using finditer (because we are replacing the string it is using as it goes, and it does not keep up to date) and use match directly, we have to offset where it starts looking for matches by the amount of text we've added by converting to var defs/uses, etc, or risk false matches.
> 
> 
> None of this is worth it compared to keeping it simple like I did, imho.

```
13:34:06 ~> cat faster.py 

import re

xs = "PHI [a, bb], [b, bb17], [c, bb] " * 901
varz = {"bb": "nop", "bb17": "movzx"}
myre = re.compile(r"[a-z0-9]+")

vrz = {str(i) : str(i) + "a" for i in xrange(4000)}
vrz["sd"] = "g"

def pythonic():
    rv = xs
    def f(match):
        return varz.get(match.group(0), match.group(0))
    return myre.sub(f, rv)

def manual():
    rv = xs
    for match in myre.finditer(xs):
        rv = rv.replace(match.group(0), varz.get(match.group(0),
                        match.group(0)), 1)
    return rv

print pythonic()[:50]
print pythonic() == manual()
from timeit import timeit
print timeit("pythonic()", "from __main__ import *",  number=100)
print timeit("manual()", "from __main__ import *",  number=100)

13:35:55 ~> python faster.py 

PHI [a, nop], [b, movzx], [c, nop] PHI [a, nop], [
True
<timeit-src>:2: SyntaxWarning: import * only allowed at module level
1.07926893234
<timeit-src>:2: SyntaxWarning: import * only allowed at module level
12.0660321712
```

https://docs.python.org/2/library/re.html#re.sub

https://reviews.llvm.org/D28384