[PATCH] D138766: [InstCombine] If loading from small alloca, load whole alloca and perform variable extraction
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 14 13:17:30 PST 2022
lebedev.ri added a comment.
Thank you for looking into it!
In D138766#3996126 <https://reviews.llvm.org/D138766#3996126>, @nlopes wrote:
> In D138766#3995966 <https://reviews.llvm.org/D138766#3995966>, @lebedev.ri wrote:
>
>> In D138766#3995896 <https://reviews.llvm.org/D138766#3995896>, @nlopes wrote:
>>
>>> FWIW, I've discovered today that GVN does a similar optimization (but without the freeze..).
>>> See here (scroll to the bottom): https://web.ist.utl.pt/nuno.lopes/alive2/index.php?hash=aed14c64378404c9&test=Transforms%2FPhaseOrdering%2FX86%2Fvec-load-combine.ll
>>
>> That seems to be with constant indexes, though?
>
> True.
> So it could use a simple extractelement rather than bit masking.
Define "could"? Define "simple"?
I've looked at alternative lowerings (`shufflevector` or a chain of `extractelement`s),
and they all result in worse codegen. We cannot use a single `extractelement`,
because the byte offset may not be a multiple of the element size.
The shift is the optimal lowering here; any alternative lowering
would need to be canonicalized into it, at which point why bother?
(Yes, I will look into doing this in SROA.)
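To illustrate the offset argument, here is a minimal sketch in plain C rather than LLVM IR. It is not the patch's code: the function names, the 8-byte buffer, the 16-bit element size, and the little-endian layout are all assumptions made for the example. It only shows that a shift of the whole value handles any in-bounds byte offset, while a single element pick agrees with it only when the offset is a multiple of the element size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for "load the whole alloca": assemble all 8 bytes
   into one integer, assuming a little-endian layout. */
uint64_t load_whole_le(const unsigned char buf[8]) {
    uint64_t whole = 0;
    for (int i = 0; i < 8; ++i)
        whole |= (uint64_t)buf[i] << (8 * i);
    return whole;
}

/* Shift-based extraction of a 16-bit value at a variable byte offset.
   Assumes byte_off is in 0..6 so the 2-byte read stays in bounds. */
uint16_t extract_by_shift(const unsigned char buf[8], unsigned byte_off) {
    return (uint16_t)(load_whole_le(buf) >> (8 * byte_off));
}

/* Element-based extraction: view the buffer as 4 x 16-bit elements and
   pick element byte_off / 2. This matches the shift only for even offsets. */
uint16_t extract_by_element(const unsigned char buf[8], unsigned byte_off) {
    unsigned idx = byte_off / 2;
    return (uint16_t)(buf[2 * idx] | (buf[2 * idx + 1] << 8));
}

void self_check(void) {
    const unsigned char buf[8] = {0x00, 0x11, 0x22, 0x33,
                                  0x44, 0x55, 0x66, 0x77};
    /* Aligned offset: both approaches agree. */
    assert(extract_by_shift(buf, 2) == 0x3322);
    assert(extract_by_element(buf, 2) == 0x3322);
    /* Unaligned offset: only the shift reads bytes 1 and 2. */
    assert(extract_by_shift(buf, 1) == 0x2211);
    assert(extract_by_element(buf, 1) == 0x1100);
}
```

Covering the odd offset with element extraction would take two adjacent elements plus a recombining shift, which is the kind of alternative lowering that ends up back at the shift anyway.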
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D138766/new/
https://reviews.llvm.org/D138766