[PATCH] [IR] Make {extract, insert}element accept an index of any integer type.

Chandler Carruth chandlerc at google.com
Sat Apr 26 20:14:10 PDT 2014


Setting aside any correctness or other concerns, it seems wasteful for a
frontend to have to change the width of the integer at all, and it should
be easy for LLVM not to care about the width of the index.


On Sat, Apr 26, 2014 at 8:09 PM, Michael Spencer <bigcheesegs at gmail.com> wrote:

> On Sat, Apr 26, 2014 at 7:55 PM, Chris Lattner <clattner at apple.com> wrote:
> >
> > On Apr 25, 2014, at 7:17 PM, Michael Spencer <bigcheesegs at gmail.com> wrote:
> >
> >> Given the following C code, LLVM currently generates suboptimal code for
> >> x86-64:
> >>
> >> __m128 bss4( const __m128 *ptr, size_t i, size_t j )
> >> {
> >>    float f = ptr[i][j];
> >>    return (__m128) { f, f, f, f };
> >> }
> >>
> >> =================================================
> >>
> >> define <4 x float> @_Z4bss4PKDv4_fmm(<4 x float>* nocapture readonly %ptr, i64 %i, i64 %j) #0 {
> >>  %a1 = getelementptr inbounds <4 x float>* %ptr, i64 %i
> >>  %a2 = load <4 x float>* %a1, align 16, !tbaa !1
> >>  %a3 = trunc i64 %j to i32
> >>  %a4 = extractelement <4 x float> %a2, i32 %a3
> >>  %a5 = insertelement <4 x float> undef, float %a4, i32 0
> >>  %a6 = insertelement <4 x float> %a5, float %a4, i32 1
> >>  %a7 = insertelement <4 x float> %a6, float %a4, i32 2
> >>  %a8 = insertelement <4 x float> %a7, float %a4, i32 3
> >>  ret <4 x float> %a8
> >> }
> >>
> >> =================================================
> >>
> >>        shlq    $4, %rsi
> >>        addq    %rdi, %rsi
> >>        movslq  %edx, %rax
> >>        vbroadcastss    (%rsi,%rax,4), %xmm0
> >>        retq
> >>
> >> =================================================
> >>
> >> The movslq is unneeded, but is present because of the trunc to i32 and
> >> then the sext back to i64 that the backend adds for vbroadcastss.
> >>
> >> We can't remove the trunc because doing so changes the meaning.
> >
> > How does it change meaning?  Only the low 2 bits of the index are
> > meaningful; if any of the other ones are set, you get an undefined result.
> >
> > -Chris
>
> If %j has any of bits 32 through 63 set, the trunc clears them. At the
> IR/SDag level we don't know why the trunc is there (the intent may be
> to mask out the top bits), so we can't remove it.
>
> - Michael Spencer
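
A minimal C sketch of the point above, not LLVM code: it models the
`trunc i64 %j to i32` followed by the sign extension that movslq performs.
The helper names trunc_then_sext and trunc_changes_index are hypothetical
and exist only for illustration; two's-complement sign extension is assumed.

```c
#include <stdint.h>

/* Models `trunc i64 %j to i32` followed by a sign extension back to 64 bits. */
static int64_t trunc_then_sext(uint64_t j) {
    uint32_t low = (uint32_t)j;          /* trunc: keep only bits 0..31 */
    int64_t r = (int64_t)low;            /* zero-extended value, 0..2^32-1 */
    if (low & UINT32_C(0x80000000)) {    /* bit 31 set: value is negative as i32 */
        r -= (int64_t)1 << 32;           /* apply two's-complement sign extension */
    }
    return r;
}

/* Returns 1 when dropping the trunc would change the index that is used. */
static int trunc_changes_index(uint64_t j) {
    return trunc_then_sext(j) != (int64_t)j;
}
```

For j = 2 the round trip is a no-op. For j = 0x100000002 the trunc clears
bit 32 and the index becomes 2, so removing the trunc would change which
address is computed. Without knowing the frontend's intent, the two forms
are not interchangeable.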