[PATCH] D138766: [InstCombine] If loading from small alloca, load whole alloca and perform variable extraction

Wed Dec 14 16:43:02 PST 2022

lebedev.ri added a comment.

In D138766#3996200 <https://reviews.llvm.org/D138766#3996200>, @nlopes wrote:

> In D138766#3996152 <https://reviews.llvm.org/D138766#3996152>, @lebedev.ri wrote:
>
>> Thank you for looking into it!
>>
>> In D138766#3996126 <https://reviews.llvm.org/D138766#3996126>, @nlopes wrote:
>>
>>> In D138766#3995966 <https://reviews.llvm.org/D138766#3995966>, @lebedev.ri wrote:
>>>
>>>> In D138766#3995896 <https://reviews.llvm.org/D138766#3995896>, @nlopes wrote:
>>>>
>>>>> FWIW, I've discovered today that GVN does a similar optimization (but without the freeze..).
>>>>> See here (scroll to the bottom): https://web.ist.utl.pt/nuno.lopes/alive2/index.php?hash=aed14c64378404c9&test=Transforms%2FPhaseOrdering%2FX86%2Fvec-load-combine.ll
>>>>
>>>> That seems to be with constant indexes, though?
>>>
>>> True.
>>> So it could use a simple extractelement rather than bit masking.
>>
>> Define "could"? Define "simple"?
>> I've looked at alternative lowerings (`shufflevector` or a chain of `extractelement`s),
>> and they all result in worse codegen. We cannot use a single `extractelement`,
>> because the byte offset may not be a multiple of the element size.
>> The shift is the optimal lowering here; any alternative lowering
>> would need to be canonicalized into it, at which point why bother?
>
> I meant for the GVN case, with constant indexes.

D'oh! Sorry!

> Your case is annoying as neither shufflevector nor extractelement allows for easy vector extraction w/ dynamic indexes.
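
For illustration only, the shift-based extraction discussed above can be modelled in plain C++. This is a minimal sketch and not the patch's code: the helper name `extract_u16`, the 8-byte buffer size, and the little-endian byte order are all assumptions made for the example. It treats the small buffer as one wide integer, shifts right by the variable byte offset, and truncates, which works even when the offset is not a multiple of the element size.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): models an 8-byte alloca as a
// single 64-bit integer and extracts a 16-bit value at a variable byte offset.
// Assumes little-endian layout; byte_offset must be below 8 so the shift
// amount stays smaller than the integer width.
std::uint16_t extract_u16(const unsigned char (&buf)[8], unsigned byte_offset) {
    assert(byte_offset < 8 && "offset must lie within the buffer");

    // "Load the whole alloca": assemble the bytes into one wide integer,
    // with buf[0] as the least significant byte.
    std::uint64_t whole = 0;
    for (int i = 7; i >= 0; --i)
        whole = (whole << 8) | buf[i];

    // "Variable extraction": shift right by the bit offset, then truncate.
    return static_cast<std::uint16_t>(whole >> (8 * byte_offset));
}
```

With bytes `0x11 ... 0x88`, an offset of 1 selects bytes 1 and 2, a position that falls between two 16-bit elements and so could not be reached by a single `extractelement` on a vector of 16-bit elements.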

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138766/new/

https://reviews.llvm.org/D138766