[PATCH] D121410: Have cpu-specific variants set 'tune-cpu' as an optimization hint

Erich Keane via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 10 13:49:55 PST 2022


erichkeane created this revision.
erichkeane added a reviewer: aaron.ballman.
Herald added a subscriber: pengfei.
Herald added a project: All.
erichkeane requested review of this revision.

Due to various implementation constraints, despite the programmer
choosing a 'processor' cpu_dispatch/cpu_specific needs to use the
'feature' list of a processor to identify it.  This results in the
identified processor in source-code not being propogated to the
optimizer, and thus, not able to be tuned for.

This patch changes to use the actual cpu as written for tune-cpu so that
opt can make decisions based on the cpu-as-spelled, which should better
match the behavior expected by the programmer.

Note that the 'valid' list of processors for x86 is in
llvm/include/llvm/Support/X86TargetParser.def.  At the moment, this list
contains only Intel processors, but other vendors may wish to add their
own entries as 'alias'es (or wiht different feature lists!).

If this is not done, there is two potential performance issues with the 
patch, but I believe them to be worth it in light of the improvements to
behavior and performance.

1- In the event that the user spelled "ProcessorB", but we only have the
features available to test for "ProcessorA" (where A is B minus features),
AND there is an optimization opportunity for "B" that negatively affects
"A", the optimizer will likely choose to do so.

2- In the even that the user spelled VendorI's processor, and the feature
list allows it to run on VendorA's processor of similar features, AND there
is an optimization opportunity for VendorIs that negatively affects "A"s,
the optimizer will likely choose to do so.  This can be fixed by adding an
alias to X86TargetParser.def.


https://reviews.llvm.org/D121410

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/test/CodeGen/attr-cpuspecific-avx-abi.c
  clang/test/CodeGen/attr-cpuspecific.c


Index: clang/test/CodeGen/attr-cpuspecific.c
===================================================================
--- clang/test/CodeGen/attr-cpuspecific.c
+++ clang/test/CodeGen/attr-cpuspecific.c
@@ -340,5 +340,8 @@
 void OrderDispatchUsageSpecific(void) {}
 
 // CHECK: attributes #[[S]] = {{.*}}"target-features"="+avx,+cmov,+crc32,+cx8,+f16c,+mmx,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"
+// CHECK-SAME: "tune-cpu"="ivybridge"
 // CHECK: attributes #[[K]] = {{.*}}"target-features"="+adx,+avx,+avx2,+avx512cd,+avx512er,+avx512f,+avx512pf,+bmi,+cmov,+crc32,+cx8,+f16c,+fma,+lzcnt,+mmx,+movbe,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"
+// CHECK-SAME: "tune-cpu"="knl"
 // CHECK: attributes #[[O]] = {{.*}}"target-features"="+cmov,+cx8,+mmx,+movbe,+sse,+sse2,+sse3,+ssse3,+x87"
+// CHECK-SAME: "tune-cpu"="atom"
Index: clang/test/CodeGen/attr-cpuspecific-avx-abi.c
===================================================================
--- clang/test/CodeGen/attr-cpuspecific-avx-abi.c
+++ clang/test/CodeGen/attr-cpuspecific-avx-abi.c
@@ -23,4 +23,6 @@
 // CHECK: define{{.*}} @foo.V() #[[V:[0-9]+]]
 
 // CHECK: attributes #[[A]] = {{.*}}"target-features"="+avx,+crc32,+cx8,+mmx,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"
+// CHECK-SAME: "tune-cpu"="generic"
 // CHECK: attributes #[[V]] = {{.*}}"target-features"="+avx,+avx2,+bmi,+cmov,+crc32,+cx8,+f16c,+fma,+lzcnt,+mmx,+movbe,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"
+// CHECK-SAME: "tune-cpu"="core_4th_gen_avx"
Index: clang/lib/CodeGen/CodeGenModule.cpp
===================================================================
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -2060,6 +2060,12 @@
           getTarget().isValidCPUName(ParsedAttr.Tune))
         TuneCPU = ParsedAttr.Tune;
     }
+
+    if (SD) {
+      // Apply the given CPU name as the 'tune-cpu' so that the optimizer can
+      // favor this processor.
+      TuneCPU = SD->getCPUName(GD.getMultiVersionIndex())->getName();
+    }
   } else {
     // Otherwise just add the existing target cpu and target features to the
     // function.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D121410.414488.patch
Type: text/x-patch
Size: 2170 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20220310/7f09d33d/attachment-0001.bin>


More information about the cfe-commits mailing list