[PATCH] D93594: [X86] Pass to transform amx intrinsics to scalar operation.
Pengfei Wang via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Mar 3 05:52:55 PST 2021
pengfei accepted this revision.
pengfei added a comment.
This revision is now accepted and ready to land.
LGTM with some nitpicks 😊
================
Comment at: llvm/include/llvm/CodeGen/Passes.h:496
+
+ /// The pass transform amx intrinsics to scalar operation if the function has
+ /// optnone attribute or it is O0.
----------------
transforms
================
Comment at: llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp:9
+//
+/// \file Pass to transform amx intrinsics to scalar operation.
+/// This pass is only enabled with -O0. With -O0, the def of shape to amx
----------------
operations
================
Comment at: llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp:10
+/// \file Pass to transform amx intrinsics to scalar operation.
+/// This pass is only enabled with -O0. With -O0, the def of shape to amx
+/// intrinsics is near the amx intrinsics code. We are not able to find a
----------------
We always enable it. Also need to mention optnone.
================
Comment at: llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp:46
+static bool isV256I32Ty(Type *Ty) {
+ if (auto *FVT = dyn_cast<FixedVectorType>(Ty)) {
+ return FVT->getNumElements() == 256 &&
----------------
Curly brackets are not necessary here.
================
Comment at: llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp:82
+ DTU.applyUpdatesPermissive({
+ {DominatorTree::Delete, Preheader, Tmp},
+ {DominatorTree::Insert, Header, Body},
----------------
Do we need to remove the successor? Isn't it still being dominated?
================
Comment at: llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp:96
+
+template <bool isTileLoad>
+static Value *createTileLoadStoreLoops(BasicBlock *Start, BasicBlock *End,
----------------
IsTileLoad
================
Comment at: llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp:117
+ BasicBlock *ColBody =
+ createLoop(RowBody, RowLatch, Col, B.getInt16(ColStep),
+ IntrinName + ".scalarize.cols", B, DTU, ColLoop, LI);
----------------
Use 1 directly?
================
Comment at: llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp:121
+ BasicBlock *ColLoopLatch = ColBody->getSingleSuccessor();
+ BasicBlock *ColumnLoopHeader = ColBody->getSinglePredecessor();
+ BasicBlock *RowLoopHeader = RowBody->getSinglePredecessor();
----------------
Better use the same naming conversion, i.e. `ColLoopHeader`
================
Comment at: llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp:125-126
+ Value *CurrentCol = &*ColumnLoopHeader->begin();
+ FixedVectorType *V256I32Ty = FixedVectorType::get(B.getInt32Ty(), 256);
+ Type *EltTy = V256I32Ty->getElementType();
+
----------------
Better to change the order, e.g.
```
Type *EltTy = B.getInt32Ty();
FixedVectorType *V256I32Ty = FixedVectorType::get(EltTy, 256);
```
================
Comment at: llvm/test/CodeGen/X86/AMX/amx-low-intrinsics-no-amx-bitcast.ll:1
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -mtriple=x86_64 -lower-amx-intrinsics %s -S | FileCheck %s
----------------
I think we should move the files to llvm/test/Transforms/
================
Comment at: llvm/test/CodeGen/X86/AMX/amx-type.ll:2
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt -lower-amx-type %s -S | FileCheck %s
+; RUN: opt --codegen-opt-level=2 -mtriple=x86_64 -lower-amx-type %s -S | FileCheck %s
----------------
Why adding this? Is it O2 by default?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D93594/new/
https://reviews.llvm.org/D93594
More information about the cfe-commits
mailing list