[llvm-dev] InstCombine question on combineLoadToOperationType
Mehdi Amini via llvm-dev
llvm-dev at lists.llvm.org
Thu Nov 17 14:10:23 PST 2016
> On Nov 16, 2016, at 11:23 AM, Friedman, Eli via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On 11/15/2016 4:22 PM, Pete Couperus via llvm-dev wrote:
>> Hello,
>>
>> Context: We have a backend where v32i1 is a Legal type, but the storage for v32i1 is not 32-bits/uses a different instruction sequence.
>> We ran into an issue because combineLoadToOperationType changed v32i1 loads into i32 loads, so a sequence like:
>> define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
>>   %a = load <32 x i1>, <32 x i1>* %A
>>   store <32 x i1> %a, <32 x i1>* %B
>>   ret void
>> }
>>
>> Is transformed to:
>> define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
>>   %1 = bitcast <32 x i1>* %A to i32*
>>   %a1 = load i32, i32* %1, align 4
>>   %2 = bitcast <32 x i1>* %B to i32*
>>   store i32 %a1, i32* %2, align 4
>>   ret void
>> }
>>
>> This looks to be intentional.
>> Is there a way to specify in the data-layout that v32i1 storage is not 32-bits?
>
> No, not at the moment. You could propose something, but you'd probably have a hard time convincing anyone it's necessary; nobody has cared about this for a very long time.
>
>> Absent that, is there any other reliable way to retain the original vector loads/store without just disabling this part of InstCombine?
>
> No, and you'll run into other problems (e.g. alias analysis) if the data layout lies about the size of a load or store.
>
>> Or is it the backend’s responsibility to try and work with this?
>
> Where are these loads coming from? x86 without AVX512 doesn't have any convenient way to generate code for a <32 x i1> store, but it doesn't matter because frontends don't generate <N x i1> loads and stores.
>
> If you have a frontend which is generating loads and stores like this, you could probably change it to use some other sequence (like a platform-specific intrinsic, or some sequence involving sext/trunc).
Why not just generate the code with the proper storage? If <32 x i1> is used where the storage is <32 x i8> (for example), it seems like a bad idea to lie to the IR and hide it with a platform-specific intrinsic, right? I fear this would cause other problems down the line in the optimizer.
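
A rough sketch of what I mean, assuming (purely for illustration) that the in-memory form is <32 x i8> with one byte per lane. The thread doesn't say what the real layout is on Pete's target, and whether zext or sext is the right widening depends on how that target represents "true":

```llvm
; Hypothetical variant of @bits where memory holds <32 x i8> (assumed layout).
define void @bits(<32 x i8>* %A, <32 x i8>* %B) {
  ; Load using the type that matches the actual storage.
  %a.mem = load <32 x i8>, <32 x i8>* %A
  ; Narrow to the mask type only after the load.
  %a = trunc <32 x i8> %a.mem to <32 x i1>
  ; ... any <32 x i1> computation would go here ...
  ; Widen back to the storage type before the store (zext is an assumption;
  ; use sext if the target stores "true" as all-ones).
  %b.mem = zext <32 x i1> %a to <32 x i8>
  store <32 x i8> %b.mem, <32 x i8>* %B
  ret void
}
```

Written this way, the loads and stores have the size the data layout already reports for them, and <32 x i1> only shows up as a value type.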
--
Mehdi