[LLVMbugs] [Bug 10073] New: "Cannot select" for bitcasts of AVX data types

Fri Jun 3 02:23:17 PDT 2011

http://llvm.org/bugs/show_bug.cgi?id=10073

           Summary: "Cannot select" for bitcasts of AVX data types
           Product: libraries
           Version: trunk
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: X86
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: karrenberg at cs.uni-saarland.de
                CC: llvmbugs at cs.uiuc.edu

The AVX backend fails on mask code, e.g. the result of VCMPPS combined with
mask operations and the corresponding bitcasts.

Masks represented as <8 x i32> should be modifiable with xor/and/or, which
should be lowered to VXORPS/VANDPS/VORPS.
It could also make sense to allow these operations on <8 x float>, matching the
C intrinsics of immintrin.h (_mm256_cmp_ps etc. produce __m256 instead of
__m256i, and _mm256_xor_ps takes __m256 operands as well) and LLVM's own
intrinsics (llvm.x86.avx.cmp.ps.256 produces <8 x float>,
llvm.x86.avx.blendv.ps.256 takes an <8 x float> operand as condition).
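
As a rough C-level counterpart of the masked blend discussed in this report,
the following sketch keeps the mask in __m256 values throughout. The helper
name masked_blend, the pointer-based interface and the GCC/Clang target
attribute are assumptions made for illustration; they are not part of the
original test cases.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: out[i] = ((a[i] < b[i]) & m[i]) ? b[i] : a[i]
   for eight lanes. The target attribute (GCC/Clang) enables AVX for this
   function only, so the file can be built without -mavx; running it
   still requires an AVX-capable CPU. */
__attribute__((target("avx")))
void masked_blend(const float *a, const float *b, const uint32_t *m, float *out)
{
    assert(a && b && m && out);

    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);

    /* Reinterpret the integer mask as floats; this is a pure type cast. */
    __m256 vm = _mm256_castsi256_ps(_mm256_loadu_si256((const __m256i *)m));

    __m256 cmp  = _mm256_cmp_ps(va, vb, _CMP_LT_OS); /* predicate 1 (less-than) */
    __m256 cond = _mm256_and_ps(cmp, vm);            /* bitwise and on __m256 */
    __m256 res  = _mm256_blendv_ps(va, vb, cond);    /* picks b where sign bit of cond is set */

    _mm256_storeu_ps(out, res);
}
```

Passing pointers rather than __m256 values keeps callers free of AVX types, so
only the helper itself needs the target attribute.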

Currently, code generation for most of these operations fails with a "Cannot
select" message for a cast operation, which suggests that the problem lies in
the type conversions rather than in the bit operations themselves.

Consider these examples:

define <8 x float> @test1(<8 x float> %a, <8 x float> %b, <8 x i32> %m) nounwind readnone {
entry:
   %cmp = tail call <8 x float> @llvm.x86.avx.cmp.ps.256(<8 x float> %a, <8 x float> %b, i8 1) nounwind readnone
   %res = tail call <8 x float> @llvm.x86.avx.blendv.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %cmp) nounwind readnone
   ret <8 x float> %res
}

This works fine and "llc -filetype=asm -mattr=avx" produces the expected
assembly (VCMPLTPS + VBLENDVPS).

On the other hand, this does not work:

define <8 x float> @test2(<8 x float> %a, <8 x float> %b, <8 x i32> %m) nounwind readnone {
entry:
   %cmp = tail call <8 x float> @llvm.x86.avx.cmp.ps.256(<8 x float> %a, <8 x float> %b, i8 1) nounwind readnone
   %cast = bitcast <8 x float> %cmp to <8 x i32>
   %mask = and <8 x i32> %cast, %m
   %blend_cond = bitcast <8 x i32> %mask to <8 x float>
   %res = tail call <8 x float> @llvm.x86.avx.blendv.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %blend_cond) nounwind readnone
   ret <8 x float> %res
}

This should produce VCMPLTPS, VANDPS, and VBLENDVPS. Instead, llc (2.9 final as
well as latest trunk) bails out with:

LLVM ERROR: Cannot select: 0x2510540: v8f32 = bitcast 0x2532270 [ID=16]
   0x2532270: v4i64 = and 0x2532070, 0x2532170 [ID=15]
     0x2532070: v4i64 = bitcast 0x2510740 [ID=14]
       0x2510740: v8f32 = llvm.x86.avx.cmp.ps.256 0x2510640, 0x2511340, 0x2510f40, 0x2511140 [ORD=3] [ID=12]
...

The same holds for or and xor.
However, one specific example works:

define <8 x float> @test3(<8 x float> %a, <8 x float> %b, <8 x i32> %m) nounwind readnone {
entry:
   %cond = xor <8 x i32> %m, %m
   %res = tail call <8 x float> @llvm.x86.avx.blendv.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %cond) nounwind readnone
   ret <8 x float> %res
}

This produces the expected assembly (VXORPS + VBLENDVPS), but the same fails
for and/or. In the xor case no cast is required, which suggests that the
bitcast is the actual problem, not the instruction selection of the xor.

Apparently, LLVM is generally unable to select bitcasts between <8 x i32> and
<8 x float> (and likewise between <4 x i64> and <4 x double>), which should
always be legal on AVX and be treated as no-ops.
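
For comparison, the C intrinsics expose the same reinterpretation through
_mm256_castsi256_ps and _mm256_castps_si256, which Intel documents as
generating no instructions. The sketch below round-trips eight 32-bit lanes
through both casts and leaves the bit patterns untouched. The helper name
roundtrip_cast and the GCC/Clang target attribute are assumptions for
illustration only.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: reinterpret eight 32-bit integer lanes as floats
   and back again, then store the result. Only loads, stores and type
   casts are involved, so every bit pattern (including NaN encodings)
   is expected to survive unchanged. Requires an AVX-capable CPU. */
__attribute__((target("avx")))
void roundtrip_cast(const uint32_t *in, uint32_t *out)
{
    assert(in && out);

    __m256i vi = _mm256_loadu_si256((const __m256i *)in);
    __m256  vf = _mm256_castsi256_ps(vi);  /* like bitcast <8 x i32> to <8 x float> */
    __m256i vr = _mm256_castps_si256(vf);  /* like bitcast <8 x float> to <8 x i32> */

    _mm256_storeu_si256((__m256i *)out, vr);
}
```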
