[LLVMdev] Load value and broadcast in LLVM
Zhi Chen
zhichen1986 at gmail.com
Mon May 4 12:55:43 PDT 2015
Hi Sanjay,
It now produces the vmovddup and vbroadcastsd instructions when I
add the -mattr=avx option. Thanks.
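
For anyone who wants to try other vector widths, here is a minimal sketch that prints the same splat IR for an arbitrary element count, so it can be saved to a .ll file and run through llc with -mattr=avx as in Sanjay's example. makeSplatIR is a made-up helper name, not an LLVM API. The output uses the typed-pointer syntax (double*) from this thread, which newer LLVM releases may no longer accept, and it writes the all-zero shuffle mask as zeroinitializer.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not part of LLVM): returns textual IR for a function
// that loads one double and splats it across a <numElts x double> vector.
std::string makeSplatIR(unsigned numElts) {
    assert(numElts > 0 && "vector must have at least one element");
    const std::string vecTy = "<" + std::to_string(numElts) + " x double>";

    std::ostringstream ir;
    ir << "define " << vecTy << " @v" << numElts << "f64(double* %d) {\n";
    ir << "  %ld = load double, double* %d\n";
    // Put the loaded scalar into lane 0 of an otherwise undefined vector.
    ir << "  %v = insertelement " << vecTy << " undef, double %ld, i32 0\n";
    // An all-zero mask selects lane 0 of %v for every result lane.
    ir << "  %sh = shufflevector " << vecTy << " %v, " << vecTy << " undef, <"
       << numElts << " x i32> zeroinitializer\n";
    ir << "  ret " << vecTy << " %sh\n";
    ir << "}\n";
    return ir.str();
}
```

Writing makeSplatIR(2) and makeSplatIR(4) to a file should give functions equivalent to the v2f64 and v4f64 examples quoted below.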
Best,
Zhi
On Mon, May 4, 2015 at 12:44 PM, Sanjay Patel <spatel at rotateright.com>
wrote:
> Zhi -
>
> If your IR is not ending up as the expected splat instructions (simple AVX
> examples below), please file a bug.
>
> $ cat broadcast.ll
> define <2 x double> @v2f64(double* %d) {
> %ld = load double, double* %d
> %v = insertelement <2 x double> undef, double %ld, i32 0
> %sh = shufflevector <2 x double> %v, <2 x double> undef, <2 x i32> <i32 0, i32 0>
> ret <2 x double> %sh
> }
>
> define <4 x double> @v4f64(double* %d) {
> %ld = load double, double* %d
> %v = insertelement <4 x double> undef, double %ld, i32 0
> %sh = shufflevector <4 x double> %v, <4 x double> undef, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
> ret <4 x double> %sh
> }
>
> $ ./llc broadcast.ll -o - -mattr=avx
> _v2f64: ## @v2f64
> vmovddup (%rdi), %xmm0 ## xmm0 = mem[0,0]
> retq
>
> _v4f64: ## @v4f64
> vbroadcastsd (%rdi), %ymm0
> retq
>
>
>
>
> On Mon, May 4, 2015 at 12:12 PM, Shahid, Asghar-ahmad <
> Asghar-ahmad.Shahid at amd.com> wrote:
>
>> Hi Zhi,
>>
>>
>>
>> At the IR level, yes, there is an overhead of two more instructions. However,
>> as Michel has pointed out, the backend may fold them into a single
>> instruction wherever such an instruction is available.
>>
>>
>>
>> Regards,
>>
>> Shahid
>>
>>
>>
>> *From:* zhi chen [mailto:zchenhn at gmail.com]
>> *Sent:* Monday, May 04, 2015 10:32 PM
>> *To:* Shahid, Asghar-ahmad
>> *Cc:* LLVM Dev
>> *Subject:* Re: [LLVMdev] Load value and broadcast in LLVM
>>
>>
>>
>> Hi Shahid,
>>
>>
>>
>> Thank you so much for your response. Your suggested approach is what I am
>> using right now. However, the overhead seems a little high because we are
>> introducing two more instructions. I was wondering if there is a cheaper
>> way to do it.
>>
>>
>>
>> Best,
>>
>> Zhi
>>
>>
>>
>> On Mon, May 4, 2015 at 2:12 AM, Shahid, Asghar-ahmad <
>> Asghar-ahmad.Shahid at amd.com> wrote:
>>
>> Hi Zhi,
>>
>>
>>
>> If I understand your question correctly, yes, you can do it by using the
>> IRBuilder’s CreateVectorSplat() API.
>>
>>
>>
>> /// \brief Return a vector value that contains \arg V broadcasted to \p
>> /// NumElts elements.
>> Value *CreateVectorSplat(unsigned NumElts, Value *V, const Twine &Name = "")
>>
>>
>>
>> For your case, here the Value V will be your loaded value %0 and NumElts
>> will be 2.
>>
>>
>>
>> So after %0 = load double* %x, align 4, !tbaa !0
>>
>> you will get a sequence of LLVM-IR
>>
>>
>>
>> %1 = insertelement <2 x double> undef, double %0, …
>>
>> %2 = shufflevector <2 x double> %1, …
>>
>>
>>
>> %2 will be your desired value.
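
To see why the insertelement + shufflevector pair amounts to a broadcast, here is a rough C++ model of the two instructions. This is not LLVM code: insertElement, shuffleVector and splat are names made up for illustration, and a zero-filled array stands in for undef.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Model of insertelement: copy of vec with lane idx replaced by val.
template <std::size_t N>
std::array<double, N> insertElement(std::array<double, N> vec, double val,
                                    std::size_t idx) {
    assert(idx < N);
    vec[idx] = val;
    return vec;
}

// Model of shufflevector: result lane i is lane mask[i] of the
// concatenation of a and b (indices 0..N-1 pick from a, N..2N-1 from b).
template <std::size_t N>
std::array<double, N> shuffleVector(const std::array<double, N>& a,
                                    const std::array<double, N>& b,
                                    const std::array<std::size_t, N>& mask) {
    std::array<double, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        assert(mask[i] < 2 * N);
        out[i] = mask[i] < N ? a[mask[i]] : b[mask[i] - N];
    }
    return out;
}

// The splat idiom: insert the scalar into lane 0, then shuffle with an
// all-zero mask so every result lane reads lane 0.
template <std::size_t N>
std::array<double, N> splat(double val) {
    std::array<double, N> undef{};      // placeholder for undef; never selected
    std::array<double, N> v = insertElement(undef, val, 0);
    std::array<std::size_t, N> mask{};  // all zeros
    return shuffleVector(v, undef, mask);
}
```

With this model, splat<2>(x) returns an array whose two lanes both hold x, which mirrors what %2 holds in the sequence above.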
>>
>>
>>
>> Regards,
>>
>> Shahid
>>
>>
>>
>> *From:* llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] *On
>> Behalf Of *zhi chen
>> *Sent:* Monday, May 04, 2015 1:29 PM
>> *To:* LLVM Dev
>> *Subject:* [LLVMdev] Load value and broadcast in LLVM
>>
>>
>>
>> Is it possible to load a value into a vector register and broadcast it in
>> LLVM?
>>
>>
>>
>> For example, for the following address %x
>>
>>
>>
>> %x = getelementptr inbounds %struct._Ray* %ray, i32 0, i32 0, i32 0
>>
>>
>>
>> instead of loading the value at %x into a scalar register %0:
>>
>> %0 = load double* %x, align 4, !tbaa !0
>>
>>
>>
>> I want to load it into a <2 x double> vector register %1 and make both of
>> the two elements in %1 be the value at %x.
>>
>>
>>
>> I guess one way to do this is to make getelementptr return a <2 x i32>*
>> address, where the two addresses in <2 x i32> are the same. But I don't know
>> if it is possible to do this in LLVM.
>>
>>
>>
>> Any help would be appreciated.
>>
>>
>>
>> Best,
>>
>> Zhi
>>
>>
>>
>>
>>
>>
>>
>
--
PhD Student
Department of Computer Science
University of California, Irvine