[llvm] r262397 - DAGCombiner: Turn truncate of a bitcasted vector to an extract
Mikael Holmén via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 25 07:49:59 PDT 2016
Hi Matt,
If I run llc like this
llc bug.ll -o bug.s
on this input
target datalayout =
"E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v128:128:128-n32:64"
target triple = "powerpc64-unknown-linux-gnu"
%rec9 = type { [4 x i8] }
@g_cm_s = global %rec9 { [4 x i8] [i8 111, i8 112, i8 113, i8 114] }
declare void @__fail()
define i8 @foo() {
%1 = load <4 x i8>, <4 x i8> * bitcast (%rec9 * @g_cm_s to <4 x i8> *)
%2 = bitcast <4 x i8> %1 to i32
%3 = trunc i32 %2 to i8
%4 = icmp eq i8 %3, 114
br i1 %4, label %bb2, label %bb1
bb1: ; preds = %0
call void @__fail()
br label %bb2
bb2: ; preds = %0, %bb1
ret i8 0
bb3: ; No predecessors!
ret i8 0
}
then with an llc that does not have your change I get
# BB#0:
addis 3, 2, g_cm_s@toc@ha
addi 3, 3, g_cm_s@toc@l
lbz 3, 3(3)
cmplwi 3, 114
beq 0, .LBB0_2
Notice the
lbz 3, 3(3)
Then if I compile it with your change I get
# BB#0:
addis 3, 2, g_cm_s@toc@ha
addi 3, 3, g_cm_s@toc@l
lbz 3, 0(3)
cmplwi 3, 114
beq 0, .LBB0_2
Notice the
lbz 3, 0(3)
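So you get different results on PPC with and without your change: the
load at offset 0 reads 111 (element 0) instead of 114 (element 3), so
the compare against 114 no longer succeeds. This indicates a bug to me.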
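To spell out what I expect the IR above to compute, here is a minimal C++ sketch. It is only a model, not LLVM code: the helper names bitcastToI32 and truncToI8 are hypothetical, and it assumes that a bitcast from <4 x i8> to i32 behaves like storing the vector to memory and reloading it as an integer in the target's byte order.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model (assumption): bitcast <4 x i8> to i32 is treated as
// "store the vector, then reload it as an integer in target byte order".
constexpr std::uint32_t bitcastToI32(const std::array<std::uint8_t, 4> &Elts,
                                     bool IsBigEndian) {
  std::uint32_t V = 0;
  for (unsigned I = 0; I != 4; ++I) {
    // Big endian: the byte at the lowest address is the most significant.
    // Little endian: the byte at the lowest address is the least significant.
    unsigned Shift = IsBigEndian ? 8 * (3 - I) : 8 * I;
    V |= static_cast<std::uint32_t>(Elts[I]) << Shift;
  }
  return V;
}

// trunc i32 to i8 keeps the least significant 8 bits.
constexpr std::uint8_t truncToI8(std::uint32_t V) {
  return static_cast<std::uint8_t>(V & 0xFF);
}

// Same contents as @g_cm_s in the test case.
constexpr std::array<std::uint8_t, 4> GCmS = {111, 112, 113, 114};

// Big endian: the truncate yields the last element, so the compare with 114 holds.
static_assert(truncToI8(bitcastToI32(GCmS, true)) == 114, "BE picks element 3");
// Little endian: the truncate yields element 0.
static_assert(truncToI8(bitcastToI32(GCmS, false)) == 111, "LE picks element 0");
```

Under this model, always extracting element 0 is only right for little-endian targets, which matches the lbz 3, 3(3) versus lbz 3, 0(3) difference in the two outputs above.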
Then finally if I compile with your change plus my own change I get
# BB#0:
addis 3, 2, g_cm_s@toc@ha
addi 3, 3, g_cm_s@toc@l
lbz 3, 3(3)
cmplwi 3, 114
beq 0, .LBB0_2
which is the same as the original code.
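The index selection in my local change, and the byte offset it should lead to, can be sketched as standalone C++. The names truncElementIndex and elementByteOffset are hypothetical helpers, not DAGCombiner functions, and the sketch assumes that vector element i lives at byte offset i * (element size) in memory, and that the truncated type has the same width as one vector element.

```cpp
#include <cassert>

// Hypothetical helper mirroring the ElmtIdx computation in the patch:
// on big-endian targets the low bits of the bitcasted integer come from
// the last vector element, on little-endian targets from element 0.
constexpr unsigned truncElementIndex(unsigned NumElts, bool IsBigEndian) {
  return IsBigEndian ? NumElts - 1 : 0;
}

// Assumption: element EltIdx is stored at byte offset EltIdx * (EltBits / 8).
constexpr unsigned elementByteOffset(unsigned EltIdx, unsigned EltBits) {
  return EltIdx * (EltBits / 8);
}

// <4 x i8> on big endian: element 3 at byte offset 3, as in "lbz 3, 3(3)".
static_assert(elementByteOffset(truncElementIndex(4, true), 8) == 3,
              "BE <4 x i8> loads from offset 3");
// Always using index 0 gives byte offset 0, as in "lbz 3, 0(3)".
static_assert(elementByteOffset(0, 8) == 0, "index 0 loads from offset 0");
// <4 x i16> case from the earlier mail: element 3 at byte offset 6.
static_assert(elementByteOffset(truncElementIndex(4, true), 16) == 6,
              "BE <4 x i16> loads from offset 6");
```

This is just arithmetic on the layout I am assuming; it says nothing about targets where vector element order and memory order differ.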
Regards,
Mikael
On 04/25/2016 05:06 AM, Matt Arsenault wrote:
>
>> On Mar 4, 2016, at 05:38, Mikael Holmén <mikael.holmen at ericsson.com> wrote:
>>
>> Hi,
>>
>> On 03/04/2016 02:33 AM, Matt Arsenault wrote:
>>>
>>>> On Mar 3, 2016, at 00:27, Mikael Holmén via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>>>>
>>>> Hi Matt,
>>>>
>>>> What about Big Endian targets? Shouldn't we extract the highest vector element instead of element 0 then?
>>>>
>>>> Regards,
>>>> Mikael
>>>
>>> I don’t know how vector types work on big endian targets
>>
>> Me neither. :D
>>
>> But one case I've seen for my big-endian out-of-tree target is that we have:
>>
>> @g_cm_s = addrspace(21) global %rec802 { [4 x i16] [i16 111, i16 112, i16 113, i16 114] }
>>
>> and then:
>>
>> %1 = load <4 x i16>, <4 x i16> addrspace(21)* bitcast (%rec802 addrspace(21)* @g_cm_s to <4 x i16> addrspace(21)*)
>> %2 = bitcast <4 x i16> %1 to i64
>> %3 = trunc i64 %2 to i16
>> %_tmp7 = icmp eq i16 %3, 114
>>
>> Without the new optimization in visitTRUNCATE this code works well for me. The result of the trunc is 114 as expected, but with your change we get 111.
>>
>> So I've changed
>>
>> +
>> + // We need to consider endianness when deciding which vector
>> + // element to extract.
>> + unsigned ElmtIdx =
>> + DAG.getDataLayout().isBigEndian()
>> + ? SrcVT.getVectorNumElements() - 1
>> + : 0;
>> return DAG.getNode(ISD::EXTRACT_VECTOR_ELT, SL, VT,
>> - VecSrc, DAG.getConstant(0, SL, IdxVT));
>> + VecSrc, DAG.getConstant(ElmtIdx, SL, IdxVT));
>>
>> locally to get my test to pass.
>>
>> I've no idea if there are any big-endian in-tree targets that have vectors where this can be an issue.
>>
>> /Mikael
>>
>>>
>>> -Matt
>>>
>>
>
>
> This might be the case on PPC? Can you try to write a test for that?
>
> -Matt
>