[PATCH] D94447: [PredicateInfo] Generalize processing of conditions

Wed Jan 20 07:37:53 PST 2021

fhahn added a comment.

In D94447#2505439 <https://reviews.llvm.org/D94447#2505439>, @nikic wrote:

> In D94447#2505374 <https://reviews.llvm.org/D94447#2505374>, @fhahn wrote:
>
>> I believe compile-time is at least partly a motivation for some limitation here, but not in a way that is exposed by CTMark. I think here we mostly want to guard against the worst case (deeply nested `AND`/`OR`). Those kind of cases are not really covered by the tests in `CTMark` (or other parts in the `test-suite`), but are surprisingly common in practice (e.g. various code-generators that feed into Clang/LLVM). I think it would be desirable to have a limit on the number of items in the worklist to guard against that.
>
> You make a good point here. I think it's worthwhile to distinguish two cases though: The first is where you have a simple chain of and's. I believe this is harmless. Even if you have 32 and's in a row, you would introduce 32 new variables, and that's linear in the input size. This is not a pathological case. The second case is where you have code like this:
>
>   %b = and i1 %a, %a
>   %c = and i1 %b, %b
>   %d = and i1 %c, %c
>   ...
>
> That is, where and operands get reused in a tree. In this case, my previous code introduced an exponential number of variables for `%a`, which is certainly not good. I have updated the implementation to add a `Visited` set, which will cut down the number of renames to linear. I've also added some test cases for deep chains and trees (as well as general tests for nesting).
>
> Does this address this concern sufficiently, or do you want the linear case to be limited as well?

I think there is still a potentially problematic case: multiple branches could use the same deep chain/tree of `and`/`or`s (or different entries of the same tree), in which case the new approach would still have to traverse a lot of instructions in the worst case I think.

Granted, this is unlikely to happen often in practice, but the original goal of PredicateInfo is to be as fast as possible IIRC, so ensuring we only traverse and create a constant number of instructions per branch/assume would still be desirable I think.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94447/new/

https://reviews.llvm.org/D94447