[PATCH][X86] Explicitly set FeatureSlowSHLD for 'bdver3'. Also make explicit that bdver* cpus enable FeatureAVX and FeatureSSE4A.
Quentin Colombet
qcolombet at apple.com
Tue Nov 4 11:29:45 PST 2014
Hi Andrea,
LGTM.
Thanks,
-Quentin
On Nov 4, 2014, at 10:29 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
> Hi Craig, Quentin (and all),
>
> This patch improves the tablegen descriptions for some AMD cpus like
> 'Piledriver', 'Steamroller' and 'Excavator'.
>
> In particular, this patch adds 'FeatureSlowSHLD' to 'bdver3'.
> According to the official AMD optimization guide for amdfam15: "Using
> alternative code in place of SHLD achieves lower overall latency and
> requires fewer execution resources. The 32-bit and 64-bit forms of
> ADD, ADC, SHR, and LEA (except 16-bit form) are DirectPath
> instructions, while SHLD is a VectorPath instruction."
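[Illustration] The "alternative code in place of SHLD" mentioned in the quoted guide can be sketched with two plain shifts and an OR. This is only a minimal sketch of the semantics, not the sequence the X86 backend actually emits; the helper name `shld32` is hypothetical and does not appear in the patch.

```cpp
#include <cstdint>

// Hypothetical helper: computes what a 32-bit SHLD computes, using only
// simple shifts and an OR.
// 'hi' plays the role of the destination operand, 'lo' the source operand
// whose top bits are shifted in from the right.
std::uint32_t shld32(std::uint32_t hi, std::uint32_t lo, unsigned count) {
  count &= 31;               // the hardware masks the count to 5 bits
  if (count == 0)
    return hi;               // also avoids an undefined 32-bit shift below
  return (hi << count) | (lo >> (32 - count));
}
```

For example, `shld32(0x12345678, 0x9ABCDEF0, 8)` yields `0x3456789A`: the destination moves left by one byte and the top byte of the source fills the gap.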
>
> This patch also explicitly adds AVX and SSE4A to all the AMD bdver*
> CPUs. This part of the patch is a non-functional change since
> features XOP and FMA4 already imply AVX and SSE4A.
> However (mainly for clarity), I wanted to make more explicit
> the fact that certain targets have those features, so that the reader
> doesn't have to look at the feature list and see that "XOP implies FMA4
> which implies AVX". There already seems to be precedent for this
> approach (see for example btver2 where both FeatureF16C and FeatureAVX
> are explicitly specified).
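[Illustration] Why listing AVX and SSE4A explicitly is a non-functional change can be sketched as a transitive closure over implied features. The table below is an assumption that encodes only the implications named in the message (XOP implies FMA4; FMA4 implies AVX and SSE4A). It is not the real TableGen feature list, and `kImplies` and `expandFeatures` are hypothetical names.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Assumed implication edges, taken from the wording of the message only.
static const std::map<std::string, std::vector<std::string>> kImplies = {
    {"xop", {"fma4"}},
    {"fma4", {"avx", "sse4a"}},
};

// Returns every feature enabled by 'listed', following implications
// transitively with a simple worklist.
std::set<std::string> expandFeatures(const std::vector<std::string> &listed) {
  std::set<std::string> enabled;
  std::vector<std::string> worklist(listed.begin(), listed.end());
  while (!worklist.empty()) {
    std::string feature = worklist.back();
    worklist.pop_back();
    if (!enabled.insert(feature).second)
      continue;  // already visited
    auto it = kImplies.find(feature);
    if (it != kImplies.end())
      for (const std::string &implied : it->second)
        worklist.push_back(implied);
  }
  return enabled;
}
```

Under these assumptions, `expandFeatures({"xop"})` and `expandFeatures({"xop", "avx", "sse4a"})` return the same set, so spelling out the implied features only changes how the CPU definition reads.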
>
> Please let me know what you think.
>
> Thanks,
> Andrea
> <patch-amd-cpus.diff>