[PATCH] [IR] Make {extract, insert}element accept an index of any integer type.

Chris Lattner clattner at apple.com
Sat Apr 26 19:55:22 PDT 2014


On Apr 25, 2014, at 7:17 PM, Michael Spencer <bigcheesegs at gmail.com> wrote:

> Given the following C code, LLVM currently generates suboptimal code for
> x86-64:
> 
> __m128 bss4( const __m128 *ptr, size_t i, size_t j )
> {
>    float f = ptr[i][j];
>    return (__m128) { f, f, f, f };
> }
> 
> =================================================
> 
> define <4 x float> @_Z4bss4PKDv4_fmm(<4 x float>* nocapture readonly %ptr, i64 %i, i64 %j) #0 {
>  %a1 = getelementptr inbounds <4 x float>* %ptr, i64 %i
>  %a2 = load <4 x float>* %a1, align 16, !tbaa !1
>  %a3 = trunc i64 %j to i32
>  %a4 = extractelement <4 x float> %a2, i32 %a3
>  %a5 = insertelement <4 x float> undef, float %a4, i32 0
>  %a6 = insertelement <4 x float> %a5, float %a4, i32 1
>  %a7 = insertelement <4 x float> %a6, float %a4, i32 2
>  %a8 = insertelement <4 x float> %a7, float %a4, i32 3
>  ret <4 x float> %a8
> }
> 
> =================================================
> 
>        shlq    $4, %rsi
>        addq    %rdi, %rsi
>        movslq  %edx, %rax
>        vbroadcastss    (%rsi,%rax,4), %xmm0
>        retq
> 
> =================================================
> 
> The movslq is unneeded, but is present because of the trunc to i32 and then
> sext back to i64 that the backend adds for vbroadcastss.
> 
> We can't remove it because it changes the meaning.

How does it change meaning?  Only the low 2 bits of the index are meaningful; if any of the other bits are set, you get an undefined result.

-Chris
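
To make the exchange above concrete, here is a minimal C sketch of the index arithmetic being discussed. It is not taken from the patch or from LLVM; the helper names (trunc_then_sext, index_in_range, check_examples) are hypothetical and exist only for illustration. It models the trunc to i32 followed by the sign extension that movslq performs, and checks that the pair is the identity for the four valid lane indices of a <4 x float>, while an i64 index with high bits set maps to a different value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models `trunc i64 %j to i32` followed by the
 * sign extension back to 64 bits that movslq performs. */
static int64_t trunc_then_sext(uint64_t j) {
    uint32_t low = (uint32_t)(j & 0xFFFFFFFFu);   /* trunc: keep the low 32 bits */
    if (low & 0x80000000u) {
        /* Bit 31 set: the sign-extended value is low - 2^32. */
        return (int64_t)low - 4294967296LL;
    }
    return (int64_t)low;
}

/* Hypothetical helper: a <4 x float> has lanes 0..3, so only these
 * index values select an element. */
static int index_in_range(uint64_t j) {
    return j < 4;
}

/* Self-checks for the cases discussed in the thread. */
static void check_examples(void) {
    /* For every valid lane index the trunc/sext pair changes nothing. */
    for (uint64_t j = 0; j < 4; ++j) {
        assert(index_in_range(j));
        assert(trunc_then_sext(j) == (int64_t)j);
    }

    /* An out-of-range i64 index whose low 32 bits look like a valid lane:
     * truncation yields 1, while the untruncated value is 2^32 + 1. */
    uint64_t big = 0x100000001ULL;
    assert(!index_in_range(big));
    assert(trunc_then_sext(big) == 1);

    /* Low 32 bits all ones: sign extension turns this into -1. */
    assert(trunc_then_sext(0xFFFFFFFFULL) == -1);
}
```

Calling check_examples() from a test harness exercises all three cases. For indices 0 through 3 the two forms agree; they differ only for indices that are already out of range for the vector, which is the case the question above is about.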


