[llvm] [X86] Skip AMX type lowering when AMX is not used (PR #92910)

via llvm-commits llvm-commits at lists.llvm.org
Wed May 22 04:31:12 PDT 2024


================
@@ -1230,6 +1238,14 @@ class X86LowerAMXTypeLegacyPass : public FunctionPass {
   }
 
   bool runOnFunction(Function &F) override {
+    // Performance optimization: most code doesn't use AMX, so return early if
+    // there are no instructions that produce AMX values. This is sufficient, as
+    // AMX arguments and constants are not allowed -- so any producer of an AMX
+    // value must be an instruction.
+    // TODO: find a cheaper way for this, without looking at all instructions.
+    if (!containsAMXCode(F))
----------------
aengelke wrote:

From what I see on AMX-related passes:

- IR LowerAMXTypes, always iterates over IR a few times (thus this PR)
- IR LowerAMXIntrinsics, off by default (technically it's in the pipeline, but it returns early if `-enable-x86-scalar-amx` is not set) => not relevant.
- MIR FastPreTileConfig (O0), iterates over all virtual regs, exits early if no tile reg exists (NB: iterating over registers is much faster than iterating over instructions)
- MIR PreTileConfig (non-O0), iterates over all machine instrs, returns early if no tile reg def or use exists (not benchmarked)
- MIR FastTileConfig (O0), iterates over all machine instrs, returns early if no tile reg def or use exists (quite expensive for doing nothing)
- MIR TileConfig (non-O0), iterates over all machine instrs, returns early if VirtRegMap shape map is empty (not benchmarked, but probably ok)
- MIR LowerTileCopy, iterates over all MBB live ins, returns early if no live in is a tile reg

So there is one IR pass and three (five) MIR passes that would benefit from such information. In MIR, X86MachineFunctionInfo seems like a good place for it, and we could trivially set a "uses AMX" flag during ISel when selecting intrinsics. (X86MFI already has the AMX-related HasVirtualTileReg, so adding something like UsesAMX would seem good.) MFI is (AIUI) not available before ISel and therefore not in the IR pass. I can (and probably will) implement such a change in a separate PR.
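
To illustrate the flag idea, here is a rough standalone sketch. It is a toy model and not LLVM code: `ToyInstr`, `ToyFunctionInfo`, `ToyFunction`, `toySelect` and `toyTileConfigPass` are hypothetical names made up for this example, and `UsesAMX` with its accessors is only an assumed shape for what such a member on X86MachineFunctionInfo might look like.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a machine instruction: it only records whether the
// instruction defines or uses a tile register. NOT an LLVM class.
struct ToyInstr {
  bool TouchesTileReg = false;
};

// Hypothetical per-function info, loosely modelled on the role of
// X86MachineFunctionInfo. The UsesAMX member is an assumption.
class ToyFunctionInfo {
  bool UsesAMX = false;

public:
  bool usesAMX() const { return UsesAMX; }
  void setUsesAMX(bool V) { UsesAMX = V; }
};

struct ToyFunction {
  std::vector<ToyInstr> Instrs;
  ToyFunctionInfo Info;
};

// Stand-in for ISel: every instruction is visited here anyway, so
// recording the single bit adds essentially no cost.
void toySelect(ToyFunction &F) {
  for (const ToyInstr &I : F.Instrs)
    if (I.TouchesTileReg)
      F.Info.setUsesAMX(true);
}

// Stand-in for a later tile-related pass. Returns how many instructions
// it had to look at, to make the cost of the pass visible.
std::size_t toyTileConfigPass(const ToyFunction &F) {
  if (!F.Info.usesAMX())
    return 0; // O(1) early exit, no scan over the function.
  std::size_t Visited = 0;
  for (const ToyInstr &I : F.Instrs) {
    (void)I; // The real work on tile instructions would happen here.
    ++Visited;
  }
  return Visited;
}
```

With this shape, a function without tile instructions costs each later pass one flag check instead of a full walk over its instructions; only functions that actually use AMX pay for the scan.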

Adding a new analysis pass just for the IR pass feels like overkill for 1 bit of information per function, and among the existing immutable passes and the passes preserved by MachineFunctionPass, nothing seems like a good fit.

https://github.com/llvm/llvm-project/pull/92910

