[PATCH] D85165: [X86][MC][Target] Initial backend support a tune CPU to support -mtune

Wed Aug 12 11:25:41 PDT 2020

craig.topper added a comment.

In D85165#2213673 <https://reviews.llvm.org/D85165#2213673>, @andreadb wrote:

> In D85165#2193969 <https://reviews.llvm.org/D85165#2193969>, @craig.topper wrote:
>
>> @andreadb @RKSimon or @efriedma do any of you have suggestions for simple scheduler tests for this? I was hoping I could use -print-schedule like we used to but that no longer exists.
>
> I remember the design of the `-print-schedule` functionality was a bit problematic because it had a layering violation (see PR37160).
> The issue was introduced when support for printing scheduling info for inline assembly was added. The first version of print-schedule didn't have that problem though.
>
> Not sure if it helps, but if your goal is to obtain latency and throughput information for every instruction, you can pipe the output of llc into llvm-mca.
>
> You can use MCA markers around the regions of code that you want to have analyzed by mca.
>
> Example:
>
>   define void @vzeroupper(<4 x i64>* %x, <4 x i64>* %y) #0 {
>     call void asm sideeffect "# LLVM-MCA-BEGIN vzeroupper","~{dirflag},~{fpsr},~{flags}"()
>     %a = load <4 x i64>, <4 x i64>* %x
>     %b = load <4 x i64>, <4 x i64>* %y
>     %c = mul <4 x i64> %a, %b
>     store <4 x i64> %c, <4 x i64>* %x
>     call void asm sideeffect "# LLVM-MCA-END", "~{dirflag},~{fpsr},~{flags}"()
>     ret void
>   }
>
> If you now run the following command:
>
>   llc < my-vzeroupper-test.ll | llvm-mca -mcpu=skx -all-views=false -instruction-info
>
> Then you should see something like this:
>
>   [0] Code Region - vzeroupper
>   
>   
>   
>   Instruction Info:
>   [1]: #uOps
>   [2]: Latency
>   [3]: RThroughput
>   [4]: MayLoad
>   [5]: MayStore
>   [6]: HasSideEffects (U)
>   
>   [1]    [2]    [3]    [4]    [5]    [6]    Instructions:
>    1      7     0.50    *                   vmovdqa       (%rdi), %ymm0
>    4      22    1.50    *                   vpmullq       (%rsi), %ymm0, %ymm0
>    2      1     1.00           *            vmovdqa       %ymm0, (%rdi)
>
> Note however that mca doesn't allow you to specify different cpus for different code blocks. If you want to do something like that, then unfortunately you need to split your test into multiple files...
>
> That being said, you should be able to then use FileCheck and check latency/throughput values.

What I'm trying to prove is that the scheduler used for pre/post-RA scheduling uses the scheduler model of the CPU named in the tune-cpu attribute. Invoking a separate tool that takes its CPU from its own command line doesn't help with that goal.

================
Comment at: llvm/lib/MC/MCSubtargetInfo.cpp:183-188
+
+    // If there is a match
+    if (CPUEntry) {
+      // Set the features implied by this CPU feature, if any.
+      SetImpliedBits(Bits, CPUEntry->TuneImplies.getAsBitset(), ProcFeatures);
+    } else if (TuneCPU != CPU) {
----------------
andreadb wrote:
> Maybe it has already been asked before (apologies in case), but what if these features are not really compatible with `CPU`?
> What if, let's say, we have a crazy combination such as -mcpu=btver2 -mtune=skx?
> Not that I expect people to write that sequence of options :-).
We're only taking feature bits like "slowUAMem16" from the tune cpu. Hopefully those bits aren't implemented in ways that are incompatible with flags for instruction legality.
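
As a rough sketch of the intent (a simplified standalone model, not the actual MCSubtargetInfo code): the ISA bits come only from the CPU entry and the tuning bits come only from the tune CPU entry. The feature names, the CPU names "old-cpu"/"new-cpu", and the helpers makeBits, makeExampleTable and computeFeatures are all made up for illustration.

```cpp
#include <bitset>
#include <initializer_list>
#include <map>
#include <string>

// Hypothetical feature indices; these are not real X86 feature names.
enum Feature { IsaBasicVectors, IsaWideVectors, TuneSlowUAMem16, TuneFastShuffle, NumFeatures };
using FeatureBits = std::bitset<NumFeatures>;

// Build a bitset from a list of feature indices.
FeatureBits makeBits(std::initializer_list<Feature> Fs) {
  FeatureBits B;
  for (Feature F : Fs)
    B.set(F);
  return B;
}

// Simplified stand-in for a processor table entry.
struct CPUEntry {
  FeatureBits Implies;     // features that affect instruction legality
  FeatureBits TuneImplies; // tuning-only features
};
using CPUTable = std::map<std::string, CPUEntry>;

// Made-up table with two fictional CPUs.
CPUTable makeExampleTable() {
  return {
      {"old-cpu", CPUEntry{makeBits({IsaBasicVectors}), makeBits({TuneSlowUAMem16})}},
      {"new-cpu", CPUEntry{makeBits({IsaBasicVectors, IsaWideVectors}),
                           makeBits({TuneFastShuffle})}},
  };
}

// ISA bits are taken from CPU, tuning bits from TuneCPU. Unknown names
// contribute nothing in this sketch.
FeatureBits computeFeatures(const CPUTable &Table, const std::string &CPU,
                            const std::string &TuneCPU) {
  FeatureBits Bits;
  auto C = Table.find(CPU);
  if (C != Table.end())
    Bits |= C->second.Implies;
  auto T = Table.find(TuneCPU);
  if (T != Table.end())
    Bits |= T->second.TuneImplies;
  return Bits;
}
```

In this model, computeFeatures(Table, "old-cpu", "new-cpu") never enables IsaWideVectors, so a mismatched pair can only change tuning decisions, provided the tuning bits really are independent of legality.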

================
Comment at: llvm/lib/Target/X86/X86Subtarget.cpp:235-236

+  if (TuneCPU.empty())
+    TuneCPU = "generic";
+
----------------
andreadb wrote:
> Out of curiosity, is there a reason why `TuneCPU` defaults to "generic" and not to the CPU string (from line 233)?
Probably no good reason at the moment. But in the future tune=generic is going to be the tuning feature flags from the x86-64 or pentium4 CPU. So for llc purposes I'm likely going to have to pick something like "i386" as the tune CPU when the string is empty, to avoid changing tests.
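
To make the two options concrete, here is a minimal standalone sketch. Both function names are hypothetical and neither is code from the patch; the first mirrors the hunk above, the second is the alternative asked about in the question.

```cpp
#include <string>

// Mirrors the hunk above: an empty tune CPU falls back to "generic".
std::string resolveTuneCPU(const std::string &TuneCPU) {
  return TuneCPU.empty() ? "generic" : TuneCPU;
}

// Alternative from the question (hypothetical): fall back to the CPU string.
std::string resolveTuneCPUFromCPU(const std::string &CPU,
                                  const std::string &TuneCPU) {
  return TuneCPU.empty() ? CPU : TuneCPU;
}
```

With the first form, an empty tune CPU always selects the same tuning regardless of the CPU string; with the second, tuning follows the CPU unless a tune CPU is given explicitly.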

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85165/new/

https://reviews.llvm.org/D85165