[llvm-dev] [LLVM IR] Possible compiler bug: <N x i1> vector instructions ordering causing different results
Ginger Bill via llvm-dev
llvm-dev at lists.llvm.org
Mon Aug 15 01:54:58 PDT 2016
I have been using LLVM as a backend for my compiler (I am not using the
LLVM libraries; for numerous reasons I generate the necessary IR with my
own code). At the moment, I am implementing vector operations.
Comparisons of vectors emit vectors of booleans, |<N x i1>|, and these are
causing me problems.
To access vector elements, I have been using |extractelement| and
|insertelement|. However, I am getting some weird behaviour when I execute
these instructions in a different order. The two code examples below
contain the same instructions and should be logically equivalent.
Version 1 outputs |BAA| while Version 2 outputs |BAB|. Version 2 is the
logically correct one, but I cannot figure out why Version 1 produces the
wrong output when it has the exact same instructions, just in a
different order.
I suspect this may be a compiler bug rather than an error in my code. It
may be due to the way vectors are structured in memory. I could not find
any documentation on the size of a vector, but it seems that vector
elements get packed: the size of |<N x iM>| appears to be (N*M+7)/8 bytes.
This would also explain why the FAQ advises against using |getelementptr|
on vectors, as the elements may not be byte aligned.
As a workaround, is there a way to make sure a boolean vector isn't
packed, so that each element takes up a byte? Alternatively, can I convert
a vector of booleans to a vector of i8, or add extra instructions to
prevent this from happening?
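To make the second option concrete, here is a minimal sketch of widening the boolean vector to a vector of i8 before it goes to memory and narrowing it again after the load. It assumes that |zext| and |trunc| on whole vectors are acceptable for this purpose, and it uses the same typed-pointer IR syntax as the two versions below. The value names (%slot, %wide, %mask and so on) are made up for illustration, and this is untested as a fix for the behaviour described above.

```llvm
; Sketch only: keep the boolean vector in memory as <8 x i8> (one byte per
; element) and convert to and from <8 x i1> at the store/load boundary.
declare i32 @putchar(i32)

define void @main() {
entry:
    %slot = alloca <8 x i8>, align 8                ; hypothetical byte-per-element storage
    %b0 = insertelement <8 x i1> zeroinitializer, i1 true, i64 0
    %b2 = insertelement <8 x i1> %b0, i1 true, i64 2
    %wide = zext <8 x i1> %b2 to <8 x i8>           ; widen before storing
    store <8 x i8> %wide, <8 x i8>* %slot, align 8
    %loaded = load <8 x i8>, <8 x i8>* %slot, align 8
    %mask = trunc <8 x i8> %loaded to <8 x i1>      ; narrow back after loading
    %e2 = extractelement <8 x i1> %mask, i64 2
    %e2.i32 = zext i1 %e2 to i32
    %ch = add i32 %e2.i32, 65                       ; + 'A'
    %r = call i32 @putchar(i32 %ch)
    %nl = call i32 @putchar(i32 10)                 ; \n
    ret void
}
```

The idea is that only <8 x i8> values ever reach memory, so the in-memory layout of <8 x i1> no longer matters. The cost is an extra conversion on every load and store.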
Regards,
Bill
--------
Version 1
; Version 1 - Generated by my naïve SSA generator
; Outputs: BAA (incorrect)
declare i32 @putchar(i32)

define void @main() {
entry:
    %0 = alloca <8 x i1>, align 8 ; v
    store <8 x i1> zeroinitializer, <8 x i1>* %0
    %1 = alloca <8 x i1>, align 8
    store <8 x i1> zeroinitializer, <8 x i1>* %1
    %2 = load <8 x i1>, <8 x i1>* %1, align 8
    %3 = insertelement <8 x i1> %2, i1 true, i64 0
    %4 = insertelement <8 x i1> %3, i1 false, i64 1
    %5 = insertelement <8 x i1> %4, i1 true, i64 2
    %6 = insertelement <8 x i1> %5, i1 false, i64 3
    %7 = insertelement <8 x i1> %6, i1 true, i64 4
    %8 = insertelement <8 x i1> %7, i1 false, i64 5
    %9 = insertelement <8 x i1> %8, i1 true, i64 6
    %10 = insertelement <8 x i1> %9, i1 false, i64 7
    store <8 x i1> %10, <8 x i1>* %0
    %11 = load <8 x i1>, <8 x i1>* %0, align 8
    %12 = extractelement <8 x i1> %11, i64 0
    %13 = zext i1 %12 to i32
    %14 = add i32 %13, 65 ; + 'A'
    %15 = call i32 @putchar(i32 %14)
    %16 = load <8 x i1>, <8 x i1>* %0, align 8
    %17 = extractelement <8 x i1> %16, i64 1
    %18 = zext i1 %17 to i32
    %19 = add i32 %18, 65 ; + 'A'
    %20 = call i32 @putchar(i32 %19)
    %21 = load <8 x i1>, <8 x i1>* %0, align 8
    %22 = extractelement <8 x i1> %21, i64 2
    %23 = zext i1 %22 to i32
    %24 = add i32 %23, 65 ; + 'A'
    %25 = call i32 @putchar(i32 %24)
    %26 = call i32 @putchar(i32 10) ; \n
    ret void
}
------------------------------------------------------------------------
Version 2
; Version 2 - Manually modified version of Version 1
; Outputs: BAB (correct)
declare i32 @putchar(i32)

define void @main() {
entry:
    %0 = alloca <8 x i1>, align 8 ; v
    store <8 x i1> zeroinitializer, <8 x i1>* %0
    %1 = alloca <8 x i1>, align 8
    store <8 x i1> zeroinitializer, <8 x i1>* %1
    %2 = load <8 x i1>, <8 x i1>* %1, align 8
    %3 = insertelement <8 x i1> %2, i1 true, i64 0
    %4 = insertelement <8 x i1> %3, i1 false, i64 1
    %5 = insertelement <8 x i1> %4, i1 true, i64 2
    %6 = insertelement <8 x i1> %5, i1 false, i64 3
    %7 = insertelement <8 x i1> %6, i1 true, i64 4
    %8 = insertelement <8 x i1> %7, i1 false, i64 5
    %9 = insertelement <8 x i1> %8, i1 true, i64 6
    %10 = insertelement <8 x i1> %9, i1 false, i64 7
    store <8 x i1> %10, <8 x i1>* %0
    %11 = load <8 x i1>, <8 x i1>* %0, align 8
    %12 = load <8 x i1>, <8 x i1>* %0, align 8
    %13 = load <8 x i1>, <8 x i1>* %0, align 8
    %14 = extractelement <8 x i1> %11, i64 0
    %15 = extractelement <8 x i1> %12, i64 1
    %16 = extractelement <8 x i1> %13, i64 2
    %17 = zext i1 %14 to i32
    %18 = zext i1 %15 to i32
    %19 = zext i1 %16 to i32
    %20 = add i32 %17, 65 ; + 'A'
    %21 = add i32 %18, 65 ; + 'A'
    %22 = add i32 %19, 65 ; + 'A'
    %23 = call i32 @putchar(i32 %20)
    %24 = call i32 @putchar(i32 %21)
    %25 = call i32 @putchar(i32 %22)
    %26 = call i32 @putchar(i32 10) ; \n
    ret void
}