[llvm-dev] [LLVM IR] Possible compiler bug: <N x i1> vector instructions ordering causing different results

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Mon Aug 15 13:49:31 PDT 2016


Hi Bill, 

I highly recommend that you use only vectors whose element size is a whole number of bytes. There are known issues with how we handle the more general cases; see: 

https://llvm.org/bugs/show_bug.cgi?id=1784 
https://llvm.org/bugs/show_bug.cgi?id=22603 
https://llvm.org/bugs/show_bug.cgi?id=27600 

In short, different parts of the compiler disagree on whether <8 x i1> is one or eight bytes long, and some parts do nonsensical things altogether. There is a limited subset of cases where the current infrastructure works well (mostly handling vectors of i1 for vectorized comparisons), but if you stray too far you'll run into problems. That said, we would like to fix these things, so if you find problems, please do file bug reports about them. 
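
As a rough illustration of the advice above (a minimal sketch, not a tested fix for the reported case): keep <8 x i1> values in registers only, and widen them to <8 x i8> with zext before they are stored, so the in-memory type has byte-sized elements. The value names (%v, %mask, %wide, and so on) and the constant comparison standing in for a real vector compare are made up for this example. The typed-pointer syntax matches the IR in this thread; newer LLVM releases spell the pointer type as ptr.

```llvm
declare i32 @putchar(i32)

define i32 @main() {
entry:
  ; The in-memory vector uses i8 elements, so each element is a whole byte.
  %v = alloca <8 x i8>, align 8

  ; A vector comparison produces <8 x i1>; it stays in registers only.
  ; (Constant operands stand in for a real comparison here.)
  %mask = icmp ne <8 x i32> <i32 1, i32 0, i32 1, i32 0, i32 1, i32 0, i32 1, i32 0>, zeroinitializer

  ; Widen to <8 x i8> before the value touches memory.
  %wide = zext <8 x i1> %mask to <8 x i8>
  store <8 x i8> %wide, <8 x i8>* %v, align 8

  ; Load once and extract elements 0, 1 and 2 as i8 values.
  %l = load <8 x i8>, <8 x i8>* %v, align 8
  %e0 = extractelement <8 x i8> %l, i64 0
  %e1 = extractelement <8 x i8> %l, i64 1
  %e2 = extractelement <8 x i8> %l, i64 2

  ; Convert each element to 'A' (0) or 'B' (1) and print it.
  %c0 = zext i8 %e0 to i32
  %c1 = zext i8 %e1 to i32
  %c2 = zext i8 %e2 to i32
  %a0 = add i32 %c0, 65 ; + 'A'
  %a1 = add i32 %c1, 65 ; + 'A'
  %a2 = add i32 %c2, 65 ; + 'A'
  %r0 = call i32 @putchar(i32 %a0)
  %r1 = call i32 @putchar(i32 %a1)
  %r2 = call i32 @putchar(i32 %a2)
  %r3 = call i32 @putchar(i32 10) ; \n
  ret i32 0
}
```

If a mask is needed again after loading, truncating the loaded value back with trunc <8 x i8> ... to <8 x i1> (or comparing it against zeroinitializer with icmp ne) keeps the i1 vector out of memory.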

-Hal 

----- Original Message -----

> From: "Ginger Bill via llvm-dev" <llvm-dev at lists.llvm.org>
> To: llvm-dev at lists.llvm.org
> Sent: Monday, August 15, 2016 3:54:58 AM
> Subject: [llvm-dev] [LLVM IR] Possible compiler bug: <N x i1> vector
> instructions ordering causing different results

> I have been using LLVM as a backend for my compiler (I'm not using
> the LLVM libraries but my own to generate the necessary IR for
> numerous reasons). At the moment, I am implementing vector
> operations. Comparisons of vectors emit vectors of booleans <N x i1>
> and these are causing me problems.
> To access vector elements, I have been using extractelement and
> insertelement; however, I am getting some weird behaviour when I
> execute these instructions in a different order. The code examples
> below have the same instructions and should be logically the same.
> Version 1 outputs BAA while Version 2 outputs BAB. Version 2 is the
> logically correct version, but I cannot figure out why Version 1
> outputs the wrong result when it has exactly the same instructions,
> just in a different order.

> I suspect this may be a compiler bug rather than a code error.
> It may be due to the way vectors are structured. I could not find any
> documentation on what the size of a vector is, but it seems that
> vector elements get packed: <N x iM> size == (N*M+7)/8 bytes. This
> may also explain why the FAQ advises against using getelementptr on
> vectors, as the elements may not be byte aligned.

> As a workaround, is there a way to make sure a boolean vector isn't
> packed, so that each element takes up a byte? Alternatively, can I
> convert a vector of booleans to a vector of i8, or add extra
> instructions to prevent this from happening?

> Regards,
> Bill
> --------
> Version 1
> ; Version 1 - Generated by my naïve SSA generator
> ; Outputs: BAA (incorrect)
> declare i32 @putchar(i32)

> define void @main() {
> entry:
> %0 = alloca <8 x i1>, align 8 ; v
> store <8 x i1> zeroinitializer, <8 x i1>* %0
> %1 = alloca <8 x i1>, align 8
> store <8 x i1> zeroinitializer, <8 x i1>* %1
> %2 = load <8 x i1>, <8 x i1>* %1, align 8
> %3 = insertelement <8 x i1> %2, i1 true, i64 0
> %4 = insertelement <8 x i1> %3, i1 false, i64 1
> %5 = insertelement <8 x i1> %4, i1 true, i64 2
> %6 = insertelement <8 x i1> %5, i1 false, i64 3
> %7 = insertelement <8 x i1> %6, i1 true, i64 4
> %8 = insertelement <8 x i1> %7, i1 false, i64 5
> %9 = insertelement <8 x i1> %8, i1 true, i64 6
> %10 = insertelement <8 x i1> %9, i1 false, i64 7
> store <8 x i1> %10, <8 x i1>* %0

> %11 = load <8 x i1>, <8 x i1>* %0, align 8
> %12 = extractelement <8 x i1> %11, i64 0
> %13 = zext i1 %12 to i32
> %14 = add i32 %13, 65 ; + 'A'
> %15 = call i32 @putchar(i32 %14)

> %16 = load <8 x i1>, <8 x i1>* %0, align 8
> %17 = extractelement <8 x i1> %16, i64 1
> %18 = zext i1 %17 to i32
> %19 = add i32 %18, 65 ; + 'A'
> %20 = call i32 @putchar(i32 %19)

> %21 = load <8 x i1>, <8 x i1>* %0, align 8
> %22 = extractelement <8 x i1> %21, i64 2
> %23 = zext i1 %22 to i32
> %24 = add i32 %23, 65 ; + 'A'
> %25 = call i32 @putchar(i32 %24)

> %26 = call i32 @putchar(i32 10) ; \n

> ret void
> }

> Version 2
> ; Version 2 - Manually modified version of Version 1
> ; Outputs: BAB (correct)
> declare i32 @putchar(i32)

> define void @main() {
> entry:
> %0 = alloca <8 x i1>, align 8 ; v
> store <8 x i1> zeroinitializer, <8 x i1>* %0
> %1 = alloca <8 x i1>, align 8
> store <8 x i1> zeroinitializer, <8 x i1>* %1
> %2 = load <8 x i1>, <8 x i1>* %1, align 8
> %3 = insertelement <8 x i1> %2, i1 true, i64 0
> %4 = insertelement <8 x i1> %3, i1 false, i64 1
> %5 = insertelement <8 x i1> %4, i1 true, i64 2
> %6 = insertelement <8 x i1> %5, i1 false, i64 3
> %7 = insertelement <8 x i1> %6, i1 true, i64 4
> %8 = insertelement <8 x i1> %7, i1 false, i64 5
> %9 = insertelement <8 x i1> %8, i1 true, i64 6
> %10 = insertelement <8 x i1> %9, i1 false, i64 7
> store <8 x i1> %10, <8 x i1>* %0

> %11 = load <8 x i1>, <8 x i1>* %0, align 8
> %12 = load <8 x i1>, <8 x i1>* %0, align 8
> %13 = load <8 x i1>, <8 x i1>* %0, align 8

> %14 = extractelement <8 x i1> %11, i64 0
> %15 = extractelement <8 x i1> %12, i64 1
> %16 = extractelement <8 x i1> %13, i64 2

> %17 = zext i1 %14 to i32
> %18 = zext i1 %15 to i32
> %19 = zext i1 %16 to i32

> %20 = add i32 %17, 65 ; + 'A'
> %21 = add i32 %18, 65 ; + 'A'
> %22 = add i32 %19, 65 ; + 'A'

> %23 = call i32 @putchar(i32 %20)
> %24 = call i32 @putchar(i32 %21)
> %25 = call i32 @putchar(i32 %22)

> %26 = call i32 @putchar(i32 10) ; \n

> ret void
> }


-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 