[Libclc-dev] [PATCH] Fix vload3/vstore3 to emit only one IR load

Matt Arsenault via Libclc-dev libclc-dev at lists.llvm.org
Fri Sep 25 13:43:41 PDT 2015


I found out from here how to finally do this correctly:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150921/301818.html

You can combine the ext_vector_type and aligned attributes to get a load
of the right vector type with the correct alignment. With this, the IR
for vload3 looks like:

     define <3 x i32> @81(i32 %offset, i32* nocapture readonly %x) #0 {
     entry:
       %mul = mul i32 %offset, 3
       %arrayidx = getelementptr inbounds i32, i32* %x, i32 %mul
       %castToVec4 = bitcast i32* %arrayidx to <4 x i32>*
       %loadVec4 = load <4 x i32>, <4 x i32>* %castToVec4, align 4
       %extractVec = shufflevector <4 x i32> %loadVec4, <4 x i32> undef, <3 x i32> <i32 0, i32 1, i32 2>
       ret <3 x i32> %extractVec
     }

The load of <4 x i32> instead of <3 x i32> is somewhat surprising to me,
but this is much better than the previous mess: a load of the first 2
components, a separate load of the 3rd, and a sequence of
extractelement/insertelement instructions to recombine them.

Old:

   define <3 x i32> @81(i32 %offset, i32* nocapture readonly %x) #0 {
   entry:
     %mul = mul i32 %offset, 3
     %arrayidx = getelementptr inbounds i32, i32* %x, i32 %mul
     %0 = bitcast i32* %arrayidx to <2 x i32>*
     %1 = load <2 x i32>, <2 x i32>* %0, align 4, !tbaa !1
     %2 = extractelement <2 x i32> %1, i32 0
     %3 = insertelement <3 x i32> undef, i32 %2, i32 0
     %4 = extractelement <2 x i32> %1, i32 1
     %5 = insertelement <3 x i32> %3, i32 %4, i32 1
     %add = add i32 %mul, 2
     %arrayidx3 = getelementptr inbounds i32, i32* %x, i32 %add
     %6 = load i32, i32* %arrayidx3, align 4, !tbaa !6
     %7 = insertelement <3 x i32> %5, i32 %6, i32 2
     ret <3 x i32> %7
   }
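
For reference, the ext_vector_type plus aligned combination described
above might look roughly like this at the source level. This is only a
sketch in plain C, not the code from the attached patch: the names int3,
int3_elem_aligned and vload3_sketch are made up for illustration, and the
non-Clang branch is a plain element-wise fallback so the snippet still
builds with other compilers.

```c
#include <stddef.h>

#if defined(__clang__)
/* Clang extension: a 3-element int vector (sizeof is 16, like int4). */
typedef int int3 __attribute__((ext_vector_type(3)));

/* Same vector type, but with its alignment lowered to that of one
 * element, so a load through it only assumes 4-byte alignment. */
typedef int3 int3_elem_aligned __attribute__((aligned(sizeof(int))));

/* Hypothetical helper: one vector load instead of a 2-element load,
 * a scalar load, and a recombine sequence. */
static int3 vload3_sketch(size_t offset, const int *x)
{
    return *(const int3_elem_aligned *)(x + 3 * offset);
}
#else
/* Fallback for compilers without ext_vector_type: element-wise copy. */
typedef struct { int x, y, z; } int3;

static int3 vload3_sketch(size_t offset, const int *x)
{
    const int *p = x + 3 * offset;
    int3 v = { p[0], p[1], p[2] };
    return v;
}
#endif
```

Calling vload3_sketch(1, buf) returns buf[3], buf[4] and buf[5] in the
.x, .y and .z components. Since the IR above loads four elements rather
than three, the Clang branch should only be used where one element past
the last triple is still readable.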
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Fix-vload3-vstore3-to-emit-only-one-IR-load.patch
Type: text/x-diff
Size: 5674 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/libclc-dev/attachments/20150925/365352ac/attachment.patch>