[PATCH][AArch64] Prefer ldp x, x to ldr q

Fri Aug 1 09:52:29 PDT 2014

Hi Tim,

"all heuristics are wrong, but some are useful"? ;)

I just extended it to work with all 128-bit loads, and that caused some bad behaviour.

What we're saying is (assume one scalar load has a cost of 1):
<2 x i32> costs 1
<4 x i32> costs 4

That gives <4 x i32> a punitive cost, which is incorrect.

The only thing we're attempting to say is "there is no advantage to <2 x i64> over two i64 loads". The VF of 2 is important, as only a VF of 2 can be lowered with an LDP instruction. On AArch32 we could have packed values together and loaded <4 x i32> with a VLDRD, but of course that doesn't work on AArch64.

So I think it only applies to <2 x i64> or <2 x double>. And yes, this whole thing is making me feel very dirty inside - if there's a better way, I don't know of it :(
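To make the difference concrete, here is a rough sketch of the two heuristics as I understand them. It is not the code from the patch: VecTy, loadCostAll128 and loadCostVF2Only are hypothetical names made up for illustration, and costs are in units of one scalar load, as above.

```cpp
// Hypothetical stand-in for a vector type: lane count and lane width in bits.
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;
};

// "Generic" variant: every 128-bit vector load is costed like NumElts
// scalar loads; anything else costs one load.
constexpr unsigned loadCostAll128(VecTy T) {
  return (T.NumElts * T.EltBits == 128) ? T.NumElts : 1;
}

// Restricted variant: only a VF of 2 with 64-bit lanes (<2 x i64> or
// <2 x double>) is costed like two scalar loads, since that is the only
// case an LDP of two X registers can cover.
constexpr unsigned loadCostVF2Only(VecTy T) {
  return (T.NumElts == 2 && T.EltBits == 64) ? 2 : 1;
}

// The numbers quoted above for the generic variant.
static_assert(loadCostAll128(VecTy{2, 32}) == 1, "<2 x i32> costs 1");
static_assert(loadCostAll128(VecTy{4, 32}) == 4, "<4 x i32> costs 4");

// The restricted variant no longer penalises <4 x i32>.
static_assert(loadCostVF2Only(VecTy{4, 32}) == 1, "<4 x i32> costs 1");
static_assert(loadCostVF2Only(VecTy{2, 64}) == 2, "<2 x i64> costs 2");
```

Both variants agree on <2 x i64>; they differ only on the other 128-bit types, which is exactly where the generic version misbehaves.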

Cheers,

James 

-----Original Message-----
From: Tim Northover [mailto:t.p.northover at gmail.com] 
Sent: 29 July 2014 13:31
To: James Molloy
Cc: Chad Rosier; Tim Northover; llvm-commits
Subject: Re: [PATCH][AArch64] Prefer ldp x, x to ldr q

> While this is a slight fudge, I don't see it as a hack personally.

A "heuristic" then (scare-quotes optional)? That covers a multitude of sins.

Personally, I'd still call it a hack, but the entire function is a hack once you start thinking about ldp, so nothing else is possible without reworking it.

So I reckon the approach is OK for now, once it's made more generic (i.e. it should probably return the same thing for all 128-bit ops).

Cheers.

Tim.