[PATCH] D14901: [X86][SSE] Improve i16 splatting shuffles

Quentin Colombet via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 6 11:37:43 PST 2016


Hi Simon,

> On Dec 21, 2015, at 1:07 PM, Simon Pilgrim <llvm-dev at redking.me.uk> wrote:
> 
> RKSimon added a comment.
> 
> Sorry Quentin - I missed your follow up email to the list - copied here:
> 
>>> Tested on Jaguar CPU:
>>>
>>> Throughput:
>>> Old 3op shuffle: 4cy
>>> New 2op shuffle: 2cy
>>> pshufb_rr        3cy
>>> pshufb_rm        3cy
>>
>> I am confused.
>> When the code sequence is shorter, I was expecting this, but this number is not for the problem we were discussing, i.e., when the shufb is replaced by 2 shuf(w|hd, whatever), right?
>> If it is, I am missing something because it should be 2 uops in both cases.
> 
> 
> I'm confused too - I'm not certain what outstanding problem with my patch you think I should be addressing.
> 
> What it does is improve vXi16 shuffles so that more patterns can be performed in 2 uops instead of 3 uops. A side effect is that a later combine stage in PerformShuffleCombine (combineX86ShufflesRecursively) no longer merges these into a single PSHUFB, as its threshold for combining is 3 uops. The timing tests I did demonstrated that this threshold is probably about right - although I accept that more recent targets can perform PSHUFB faster.
> 
> What am I missing?

I was wondering whether we should change the threshold for the combine stage, so that we avoid the longer code sequence on targets that can perform PSHUFB faster. Indeed, according to Agner’s instruction tables, it seems to me we are regressing the throughput on recent Intel architectures.
I haven’t made any measurements though.
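
To make the question concrete, here is a minimal sketch of the trade-off in Python. It is not LLVM code: the function names and the `min_ops_to_merge` parameter are hypothetical, and the only measured inputs are the Jaguar throughput numbers quoted above. It simply compares the cycle cost of keeping a shuffle chain against the cost of one PSHUFB.

```python
# Jaguar throughput figures (cycles) as quoted earlier in this thread.
JAGUAR_CYCLES = {"shuffle_3op": 4, "shuffle_2op": 2, "pshufb": 3}


def should_merge_into_pshufb(num_shuffle_ops, min_ops_to_merge=3):
    """Hypothetical model of the combine threshold discussed in the thread.

    A chain is merged into one PSHUFB only when it has at least
    `min_ops_to_merge` shuffle ops (3, per the thread).
    """
    return num_shuffle_ops >= min_ops_to_merge


def cheaper_sequence(chain_cycles, pshufb_cycles):
    """Return which form costs fewer cycles: the shuffle chain or one PSHUFB."""
    return "pshufb" if pshufb_cycles < chain_cycles else "chain"
```

With the Jaguar numbers, a threshold of 3 agrees with the cost comparison: the 3-op chain (4cy) loses to PSHUFB (3cy), while the 2-op chain (2cy) beats it. On a target where PSHUFB costs less than the 2-op chain, the same comparison would favour merging, which is the case I am asking about.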

If you don’t have any recent Intel architectures available, could you share some synthetic benchmarks for the cases where the combine no longer kicks in?
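
As a starting point for such test cases, the sketch below models the two forms of an i16 splat in Python and checks that they produce the same result. It assumes one plausible two-instruction sequence (PSHUFLW or PSHUFHW followed by PSHUFD); that choice is an assumption and not necessarily what the patch emits. The helper names are made up for illustration, and the code models instruction semantics only, not timing.

```python
def pshuflw(words, imm):
    """PSHUFLW model: permute the low four 16-bit words, copy the high four."""
    low = [words[(imm >> (2 * i)) & 3] for i in range(4)]
    return low + list(words[4:])


def pshufhw(words, imm):
    """PSHUFHW model: permute the high four 16-bit words, copy the low four."""
    high = [words[4 + ((imm >> (2 * i)) & 3)] for i in range(4)]
    return list(words[:4]) + high


def pshufd(words, imm):
    """PSHUFD model: permute four dwords, each held as a pair of words."""
    out = []
    for i in range(4):
        src = (imm >> (2 * i)) & 3
        out += words[2 * src:2 * src + 2]
    return out


def pshufb(data, mask):
    """PSHUFB model on 16 bytes: a set top bit in the mask byte zeroes the lane."""
    return [0 if m & 0x80 else data[m & 0x0F] for m in mask]


def words_to_bytes(words):
    """Split each 16-bit word into little-endian bytes."""
    out = []
    for w in words:
        out += [w & 0xFF, (w >> 8) & 0xFF]
    return out


def bytes_to_words(data):
    """Join little-endian byte pairs back into 16-bit words."""
    return [data[2 * i] | (data[2 * i + 1] << 8) for i in range(8)]


def splat_two_ops(words, k):
    """Splat word k with two shuffles (assumed sequence, see lead-in)."""
    if k < 4:
        # Fill the low half with word k, then broadcast dword 0.
        return pshufd(pshuflw(words, k * 0x55), 0x00)
    # Fill the high half with word k, then broadcast dword 2.
    return pshufd(pshufhw(words, (k - 4) * 0x55), 0xAA)


def splat_pshufb(words, k):
    """Splat word k with a single byte shuffle."""
    mask = [2 * k, 2 * k + 1] * 8
    return bytes_to_words(pshufb(words_to_bytes(words), mask))
```

Both functions take a list of eight 16-bit values and a word index from 0 to 7, so the eight splat indices give eight small cases where the two lowerings can be compared side by side.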

Thanks,
-Quentin


> 
> 
> Repository:
>  rL LLVM
> 
> http://reviews.llvm.org/D14901
> 
> 
> 


