[PATCH][X86] Explicitly set FeatureSlowSHLD for 'bdver3'. Also make explicit that bdver* cpus enable FeatureAVX and FeatureSSE4A.

Quentin Colombet qcolombet at apple.com
Tue Nov 4 11:29:45 PST 2014


Hi Andrea,

LGTM.

Thanks,
-Quentin

On Nov 4, 2014, at 10:29 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:

> Hi Craig, Quentin (and all),
> 
> This patch improves the tablegen descriptions for some AMD cpus like
> 'Piledriver', 'Steamroller' and 'Excavator'.
> 
> In particular, this patch adds 'FeatureSlowSHLD' to 'bdver3'.
> According to the official AMD optimization guide for amdfam15: "Using
> alternative code in place of SHLD achieves lower overall latency and
> requires fewer execution resources. The 32-bit and 64-bit forms of
> ADD, ADC, SHR, and LEA (except 16-bit form) are DirectPath
> instructions, while SHLD is a VectorPath instruction."
> 
> This patch also explicitly adds AVX and SSE4A to all the AMD bdver*
> CPUs. This part of the patch is a non-functional change, since
> features XOP and FMA4 already imply AVX and SSE4A.
> However, mainly for clarity, I wanted to make explicit the fact that
> certain targets have those features, so that the reader doesn't have
> to look at the feature list and see that "XOP implies FMA4
> which implies AVX". There already seems to be a precedent for this
> approach (see for example btver2, where both FeatureF16C and FeatureAVX
> are explicitly specified).
> 
> Please let me know what you think.
> 
> Thanks,
> Andrea
> <patch-amd-cpus.diff>
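
For readers following the "non-functional change" argument in the quoted
message: the claim is that listing AVX and SSE4A explicitly leaves the
effective feature set unchanged, because they are already reachable through
implication. The sketch below is only an illustration of that reasoning, not
the actual tablegen description. The IMPLIES table and the expand_features
helper are hypothetical names, and attaching SSE4A to FMA4 is an assumption
based on the email's statement that XOP and FMA4 imply AVX and SSE4A.

```python
# Hypothetical implication table modeled on the email's description
# ("XOP implies FMA4 which implies AVX"). Not copied from LLVM sources.
IMPLIES = {
    "XOP": ["FMA4"],
    "FMA4": ["AVX", "SSE4A"],  # assumption: SSE4A is implied at this level
}


def expand_features(features):
    """Return the given features plus everything they transitively imply."""
    seen = set()
    stack = list(features)
    while stack:
        feature = stack.pop()
        if feature in seen:
            continue
        seen.add(feature)
        # Follow implication edges; features with no entry imply nothing.
        stack.extend(IMPLIES.get(feature, []))
    return seen


# Feature list that relies on implication only.
implicit = expand_features(["XOP"])
# Feature list that spells out the implied features, as the patch does.
explicit = expand_features(["XOP", "FMA4", "AVX", "SSE4A"])

# Both spellings expand to the same effective set under this table.
assert implicit == explicit
```

Under these assumptions the two lists expand to the same set, so the explicit
spelling only helps the reader and does not change what is enabled.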