ARM Cast Cost Table
Arnold Schwaighofer
aschwaighofer at apple.com
Tue Jan 29 11:20:51 PST 2013
+1 for smaller tests. To verify that the cost model returns the right cost you can write something like:
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:64:128-a0:0:64-n32-S64"
target triple = "armv7--linux-gnueabihf"
%T0 = type <4 x i16>
%T1 = type <4 x i32>
define void @func0(%T0* %loadaddr, %T1* %storeaddr) {
  %v0 = load %T0* %loadaddr
  %r = sext %T0 %v0 to %T1
  store %T1 %r, %T1* %storeaddr
  ret void
}
%T2 = type <4 x i16>
%T3 = type <4 x i32>
define void @func1(%T2* %loadaddr, %T2* %loadaddr2, %T3* %storeaddr) {
  %v0 = load %T2* %loadaddr
  %v1 = load %T2* %loadaddr2
  %r = sext %T2 %v0 to %T3
  %r2 = sext %T2 %v1 to %T3
  %r3 = mul %T3 %r, %r2
  store %T3 %r3, %T3* %storeaddr
  ret void
}
Now, I chose those two examples for a reason. If we run
  > llc -mcpu=cortex-a9 < x.ll
we get:
func0:                                  @ @func0
@ BB#0:
        vldr    d16, [r0]
        vmovl.s16       q8, d16
        vst1.64 {d16, d17}, [r1]
        bx      lr
func1:                                  @ @func1
@ BB#0:
        vldr    d16, [r1]
        vldr    d17, [r0]
        vmull.s16       q8, d17, d16
        vst1.64 {d16, d17}, [r2]
        bx      lr
In func0 we pay for the sign extension (the vmovl.s16), while in func1 it is folded into the multiply (vmull.s16). The question now is which cost we should return: the optimistic or the pessimistic one? I lean towards the optimistic one (as you do), but I think we should agree on which one to use in such cases.
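To make the trade-off concrete, here is a rough sketch in Python. It is a toy model, not LLVM code: it assumes every IR instruction costs one unit except sext, and it uses the number of NEON instructions in the llc output above (ignoring bx lr) as the reference. The names FUNC0, FUNC1, EMITTED, estimate and errors are made up for illustration.

```python
# Toy model, not LLVM code: every opcode costs one unit except sext,
# whose context-free cost is the policy under discussion.
FUNC0 = ["load", "sext", "store"]
FUNC1 = ["load", "load", "sext", "sext", "mul", "store"]

# Instructions llc emitted for each function above (bx lr excluded).
EMITTED = {"func0": 3, "func1": 4}


def estimate(opcodes, sext_cost):
    """Sum per-instruction costs without looking at how sext is used."""
    return sum(sext_cost if op == "sext" else 1 for op in opcodes)


def errors(sext_cost):
    """Estimate minus emitted instruction count, per function."""
    return {
        "func0": estimate(FUNC0, sext_cost) - EMITTED["func0"],
        "func1": estimate(FUNC1, sext_cost) - EMITTED["func1"],
    }


pessimistic = errors(sext_cost=1)  # always charge for the extension
optimistic = errors(sext_cost=0)   # always assume it folds away

print("pessimistic:", pessimistic)
print("optimistic:", optimistic)
```

Under these assumptions, charging 1 for sext is exact for func0 but two units too high for func1, while charging 0 is exact for func1 but one unit too low for func0. Neither context-free choice is right for both functions.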
Thanks,
Arnold
On Jan 29, 2013, at 12:46 PM, Nadav Rotem <nrotem at apple.com> wrote:
> Hi Renato,
>
> Thanks for working on this. I have some comments.
>
> The changes to ARMTargetMachine.h are unrelated to the cost model. Let's commit them in a separate patch.
>
> The code in ARMTTI::getCastInstrCost looks good.
>
> + ISD, DstTy.getSimpleVT(), SrcTy.getSimpleVT());
>
> Did we pass the 80-column limit? I am not sure.
>
> In your test cases you execute both LLC and OPT. If you are checking that LLC generates the right pattern, then this test should be in test/CodeGen/ARM/. Can you make the tests smaller? You can write a two-line function in LLVM IR that takes the arguments and performs the operation on them.
>
> Thanks,
> Nadav
>
>
> On Jan 29, 2013, at 10:32 AM, Renato Golin <renato.golin at linaro.org> wrote:
>
>> Hi Nadav,
>>
>> This is an entry level change, just to make sure everything is in the right place and the infrastructure is ready. The code change is trivial.
>>
>> http://llvm-reviews.chandlerc.com/D345
>>
>> I spent a bit more time on the tests, to make sure the costs are picked up correctly and that the instruction I expect to perform the cast is clearly stated.
>>
>> Both tests refer to the same source code, but one is vectorized and the other is not. With time, we can fill them with cost checks for other instructions; I just didn't do it because I wasn't sure they were correct (some I know aren't).
>>
>> cheers,
>> --renato
>> <cast-cost.patch>
>