[PATCH][X86] Explicitly set FeatureSlowSHLD for 'bdver3'. Also make explicit that bdver* cpus enable FeatureAVX and FeatureSSE4A.

Andrea Di Biagio andrea.dibiagio at gmail.com
Tue Nov 4 13:30:02 PST 2014


Thanks Quentin!
Committed revision 221296.

On Tue, Nov 4, 2014 at 7:29 PM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi Andrea,
>
> LGTM.
>
> Thanks,
> -Quentin
>
> On Nov 4, 2014, at 10:29 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>
>> Hi Craig, Quentin (and all),
>>
>> This patch improves the tablegen descriptions for some AMD cpus like
>> 'Piledriver', 'Steamroller' and 'Excavator'.
>>
>> In particular, this patch adds 'FeatureSlowSHLD' to 'bdver3'.
>> According to the official AMD optimization guide for amdfam15: "Using
>> alternative code in place of SHLD achieves lower overall latency and
>> requires fewer execution resources. The 32-bit and 64-bit forms of
>> ADD, ADC, SHR, and LEA (except 16-bit form) are DirectPath
>> instructions, while SHLD is a VectorPath instruction."
>>
>> This patch also explicitly adds AVX and SSE4A to all the AMD bdver*
>> cpus. This part of the patch is a non-functional change since
>> features XOP and FMA4 already imply AVX and SSE4A.
>> However (mainly for clarity), I wanted to make more explicit
>> the fact that certain targets have those features, so that the reader
>> doesn't have to look at the feature list and see that "XOP implies FMA4
>> which implies AVX". There already seems to be precedent for this
>> approach (see for example btver2 where both FeatureF16C and FeatureAVX
>> are explicitly specified).
>>
>> Please let me know what you think.
>>
>> Thanks,
>> Andrea
>> <patch-amd-cpus.diff>
>
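
For readers skimming the thread, the two technical points in the quoted description can be sketched in a few lines of Python. This is only an illustrative model, not the tablegen code from the patch: the `IMPLIES` table and the `expand` and `shld32` helpers are hypothetical names. The description states only that XOP implies FMA4 and FMA4 implies AVX; hanging SSE4A off FMA4 is an assumption made for the sketch.

```python
# Hypothetical model of the feature implications described in the email.
# Assumption: SSE4A is implied by FMA4 (the email only says that XOP and
# FMA4 together already imply AVX and SSE4A).
IMPLIES = {
    "XOP": {"FMA4"},
    "FMA4": {"AVX", "SSE4A"},
}


def expand(features):
    """Return the transitive closure of a feature set under IMPLIES."""
    result = set()
    pending = list(features)
    while pending:
        feature = pending.pop()
        if feature not in result:
            result.add(feature)
            pending.extend(IMPLIES.get(feature, ()))
    return result


def shld32(dst, src, count):
    """Model 32-bit SHLD using plain shifts and an OR."""
    count &= 31  # the shift count is masked to 5 bits for 32-bit operands
    if count == 0:
        return dst & 0xFFFFFFFF
    high = (dst << count) & 0xFFFFFFFF          # bits shifted out are dropped
    low = (src & 0xFFFFFFFF) >> (32 - count)    # top bits of src fill the gap
    return high | low


# Listing AVX and SSE4A explicitly does not change the resulting feature set.
implicit = expand({"XOP"})
explicit = expand({"XOP", "FMA4", "AVX", "SSE4A"})
```

In this model `implicit` and `explicit` compare equal, which is the sense in which the explicit listing is a non-functional change. `shld32` shows the shift/OR shape of computation that SHLD performs in a single instruction; which replacement sequence is actually fastest on a given core is a matter for the vendor's optimization guide.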


