[llvm-dev] [LLVM IR] Possible compiler bug: <N x i1> vector instructions ordering causing different results

Mon Aug 15 01:54:58 PDT 2016

I have been using LLVM as a backend for my compiler (I'm not using the 
LLVM libraries but my own to generate the necessary IR for numerous 
reasons). At the moment, I am implementing vector operations. 
Comparisons of vectors emit vectors of booleans|<N x i1>|and these are 
causing me problems.

To access vector elements, I have been 
using|extractelement|and|insertelement|however, I am getting some weird 
behaviour when I execute these instructions in a different orders. The 
code examples below have the same instructions and should be logically 
the same. Version 1 outputs|BAA|while Version 2 outputs|BAB|. Version 2 
is the logically correct version but I cannot figure out why version 1 
outputs the wrong version but has the exact same instructions, just in a 
different order.

I'm suspecting this may be a compiler bug rather than a code error. It 
may be due to the way vectors structured. I could not find any 
documentation on what the size of a vector is but it seems that vector 
elements get packed. |<N x iM>| size == (N*M+7)/8 bytes. This also 
suggests why the FAQ suggests not to use |getelementptr|on vectors as 
the elements may not be byte aligned.

As a workaround, is there a way to make sure a boolean vector isn't 
packed where each element takes up a byte, or convert a vector of 
booleans to either a vector of i8, or extra instructions to prevent this 
from happening?

Regards,

Bill

--------

    Version 1

|; Version 1 - Generated by my naïve SSA generator ; Outputs: BAA 
(incorrect) declare i32 @putchar(i32) define void @main() { entry: %0 = 
alloca <8 x i1>, align 8 ; v store <8 x i1> zeroinitializer, <8 x i1>* 
%0 %1 = alloca <8 x i1>, align 8 store <8 x i1> zeroinitializer, <8 x 
i1>* %1 %2 = load <8 x i1>, <8 x i1>* %1, align 8 %3 = insertelement <8 
x i1> %2, i1 true, i64 0 %4 = insertelement <8 x i1> %3, i1 false, i64 1 
%5 = insertelement <8 x i1> %4, i1 true, i64 2 %6 = insertelement <8 x 
i1> %5, i1 false, i64 3 %7 = insertelement <8 x i1> %6, i1 true, i64 4 
%8 = insertelement <8 x i1> %7, i1 false, i64 5 %9 = insertelement <8 x 
i1> %8, i1 true, i64 6 %10 = insertelement <8 x i1> %9, i1 false, i64 7 
store <8 x i1> %10, <8 x i1>* %0 %11 = load <8 x i1>, <8 x i1>* %0, 
align 8 %12 = extractelement <8 x i1> %11, i64 0 %13 = zext i1 %12 to 
i32 %14 = add i32 %13, 65 ; + 'A' %15 = call i32 @putchar(i32 %14) %16 = 
load <8 x i1>, <8 x i1>* %0, align 8 %17 = extractelement <8 x i1> %16, 
i64 1 %18 = zext i1 %17 to i32 %19 = add i32 %18, 65 ; + 'A' %20 = call 
i32 @putchar(i32 %19) %21 = load <8 x i1>, <8 x i1>* %0, align 8 %22 = 
extractelement <8 x i1> %21, i64 2 %23 = zext i1 %22 to i32 %24 = add 
i32 %23, 65 ; + 'A' %25 = call i32 @putchar(i32 %24) %26 = call i32 
@putchar(i32 10) ; \n ret void } |

------------------------------------------------------------------------

    Version 2

|; Version 2 - Manually modified version of Version 1 ; Outputs: BAB 
(correct) declare i32 @putchar(i32) define void @main() { entry: %0 = 
alloca <8 x i1>, align 8 ; v store <8 x i1> zeroinitializer, <8 x i1>* 
%0 %1 = alloca <8 x i1>, align 8 store <8 x i1> zeroinitializer, <8 x 
i1>* %1 %2 = load <8 x i1>, <8 x i1>* %1, align 8 %3 = insertelement <8 
x i1> %2, i1 true, i64 0 %4 = insertelement <8 x i1> %3, i1 false, i64 1 
%5 = insertelement <8 x i1> %4, i1 true, i64 2 %6 = insertelement <8 x 
i1> %5, i1 false, i64 3 %7 = insertelement <8 x i1> %6, i1 true, i64 4 
%8 = insertelement <8 x i1> %7, i1 false, i64 5 %9 = insertelement <8 x 
i1> %8, i1 true, i64 6 %10 = insertelement <8 x i1> %9, i1 false, i64 7 
store <8 x i1> %10, <8 x i1>* %0 %11 = load <8 x i1>, <8 x i1>* %0, 
align 8 %12 = load <8 x i1>, <8 x i1>* %0, align 8 %13 = load <8 x i1>, 
<8 x i1>* %0, align 8 %14 = extractelement <8 x i1> %11, i64 0 %15 = 
extractelement <8 x i1> %12, i64 1 %16 = extractelement <8 x i1> %13, 
i64 2 %17 = zext i1 %14 to i32 %18 = zext i1 %15 to i32 %19 = zext i1 
%16 to i32 %20 = add i32 %17, 65 ; + 'A' %21 = add i32 %18, 65 ; + 'A' 
%22 = add i32 %19, 65 ; + 'A' %23 = call i32 @putchar(i32 %20) %24 = 
call i32 @putchar(i32 %21) %25 = call i32 @putchar(i32 %22) %26 = call 
i32 @putchar(i32 10) ; \n ret void } |

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160815/c5b9d97e/attachment-0001.html>