[PATCH] D140851: [InstCombine]: Add cases for assume (X & Y != {0|Y})

Noah Goldstein via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 11 01:42:11 PST 2023


goldstein.w.n added a comment.

In D140851#4042727 <https://reviews.llvm.org/D140851#4042727>, @nikic wrote:

> LGTM
>
> In D140851#4040862 <https://reviews.llvm.org/D140851#4040862>, @goldstein.w.n wrote:
>
>> Also, if you have a moment about an unrelated thing. I'm working on a patch
>> for `SimplifyLibCalls` to see if there is a previous dominating `slen = strlen(s)` call
>> that will allow transformation for `str*(s, ...)` -> `mem*(s, ..., slen)` even if the string is non-constant.
>> To do this, it needs to check that between the `strlen` call and lib-call the memory
>> `*s` can't have been clobbered. I think the "right" way to do this is `memoryssa` but
>> it seems ridiculous (and overly consuming of compile time) to add `memoryssa` to all
>> of `instcombine` just for this one relatively unimportant pass. I saw we have `AAResults`
>> already computed so started with that, but it only works on a single basic-block (and since
>> we don't have `postdom` tree there doesn't seem to be any cheap way to collect all possible
>> affected basic-blocks between the `strlen` call and lib-call. My thought for how to handle
>> this was either:
>>
>> 1. Bite the bullet and use `memoryssa` throughout `instcombine`.
>
> Extremely unlikely. Both because of compile-time impact, and because this would add implementation complexity to all other transforms, that now need to keep MSSA up to date.
>
>> 2. Only handle `strlen` calls in the same basic-block as the lib-call and use `AAResults`
>
> This is possible -- probably easiest if it covers your motivating cases.
>
>> 3. Add a file to "canonicalize" lib calls before EarlyCSE by making ALL `str*(s,...)` functions `mem*(s, ..., strlen(s))`, let `EarlyCSE` remove all duplicate `strlen` calls, then make the `SimplifyLibCall` in `instcombine` replace `mem*(s, ..., strlen(s))` with the `str*` equivilent.
>
> Handling this in EarlyCSE/GVN does seem most principled. I'm not sure it's necessary to actually perform the conversion from str to mem+strlen in advance -- shouldn't we be able to look up whether we have an available strlen call, and perform the transform from str to mem if we do? Of course, it would be a bit of an awkward special case...

Thank you for the advise :)

The canonicalization wouldn't be necessary, it would be just be a simple way to avoid having to check the strlen case for each function (although
then again we would need to be able to reverse it later which may have the same level of complexity). Just searching for a dominating + valid
strlen call and using that isn't too much of a special case as is, most of the functions already check for a constant-time strlen so its really just
a matter of adding an extra branch in existing infrastructure.

Think I'll give EarlyCSE/GVN a shot (I was leaning towards EarlyCSE), if it doesn't work out i'll just do in the AAResults method.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140851/new/

https://reviews.llvm.org/D140851



More information about the llvm-commits mailing list