[clang] [AArch64][SME] Remove immediate argument restriction for svldr and svstr (PR #68565)
Sander de Smalen via cfe-commits
cfe-commits at lists.llvm.org
Tue Oct 10 01:23:34 PDT 2023
================
@@ -9665,18 +9665,34 @@ Value *CodeGenFunction::EmitSMEZero(const SVETypeFlags &TypeFlags,
Value *CodeGenFunction::EmitSMELdrStr(const SVETypeFlags &TypeFlags,
SmallVectorImpl<Value *> &Ops,
unsigned IntID) {
- if (Ops.size() == 3) {
- Function *Cntsb = CGM.getIntrinsic(Intrinsic::aarch64_sme_cntsb);
- llvm::Value *CntsbCall = Builder.CreateCall(Cntsb, {}, "svlb");
- llvm::Value *MulVL = Builder.CreateMul(
- CntsbCall,
- Builder.getInt64(cast<llvm::ConstantInt>(Ops[2])->getZExtValue()),
- "mulvl");
-
- Ops[1] = Builder.CreateGEP(Int8Ty, Ops[1], MulVL);
- Ops[0] = EmitTileslice(Ops[0], Ops[2]);
- Ops.erase(&Ops[2]);
- }
+ if (Ops.size() == 2) {
+ // Intrinsics without a vecnum also use this function, so just provide 0
+ Ops.push_back(Ops[1]);
+ Ops[1] = Builder.getInt32(0);
+ } else {
+ int Imm = -1;
+ if (ConstantInt* C = dyn_cast<ConstantInt>(Ops[2]))
+ if (C->getZExtValue() <= 15)
+ Imm = C->getZExtValue();
+
+ if (Imm != -1) {
----------------
sdesmalen-arm wrote:
Rather than checking this in Clang, I think it makes more sense to handle this in SelectionDAG.
The reason is that someone may use this intrinsic inside a loop like this:
```
for(int i=0; i<N; ++i)
svldr_vnum_za(slice, ptr, i);
```
If the compiler then chooses to unroll the loop in LLVM IR, we'd end up with something like this:
```
for(int i=0; i<N; i+=4) {
svldr_vnum_za(slice, ptr, i);
svldr_vnum_za(slice, ptr, i+1);
svldr_vnum_za(slice, ptr, i+2);
svldr_vnum_za(slice, ptr, i+3);
}
```
It would be good if we could add `i` to `slice` and use the immediate forms of the instructions.
If we have an intrinsic like:
```
void @llvm.aarch64.sme.ldr(i32 %tileslice, ptr %base, i64 %vnum)
```
Then for the instruction which takes the following inputs:
```
(ins MatrixIndexGPR32Op12_15:$slice_base,
sme_elm_idx0_15:$slice_idx,
GPR64sp:$ptr,
imm0_15:$ptr_idx),
```
You can do custom matching in ISel by handling three different cases:
* `%vnum` is a constant between 0-15. This will leave `$slice_base` and `$ptr` untouched, and will use the matched immediate (let's call it $imm0_15) for the two immediates taken by the instruction (`slice_idx` and `ptr_idx`).
* `%vnum` is an ADD value of an opaque value and an immediate between 0-15. This will match to the instruction where the opaque value is added to `$slice_base` and `$ptr` with the remaining immediate being used for `$slice_idx` and `$ptr_idx`.
* `%vnum` is an opaque value. This will match to the instruction where `%vnum` is added to both the `$slice_base` and the `$ptr` parameter, with `$slice_idx` and `$ptr_idx` being `0`.
(Note that when adding `vnum` to `$ptr`, it needs to be scaled by `cntd`.)
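
To make the three cases above concrete, here is a minimal standalone sketch of the classification. It is only a toy model, not SelectionDAG code: `VNum`, `Form`, `Match` and `classifyVNum` are hypothetical names invented for illustration, and treating an out-of-range constant or addend as a fully opaque value is an assumption that the comment above does not spell out.

```cpp
#include <cstdint>

// Hypothetical toy model of the %vnum operand; these are not LLVM types.
enum class VNumKind { Constant, AddOpaqueImm, Opaque };

struct VNum {
  VNumKind kind;
  int64_t imm; // the constant, or the immediate operand of the ADD; unused for Opaque
};

// Which of the three matching cases applies.
enum class Form {
  ImmOnly,     // $slice_base and $ptr untouched, immediate in both index fields
  BasePlusImm, // opaque operand added to both bases, immediate in both index fields
  BaseOnly     // whole %vnum added to both bases, both index fields are 0
};

struct Match {
  Form form;
  int64_t imm; // value used for both $slice_idx and $ptr_idx
};

// The instruction's immediates only accept values in the range 0-15.
static bool fitsImm0_15(int64_t v) { return v >= 0 && v <= 15; }

Match classifyVNum(const VNum &v) {
  // Case 1: %vnum is a constant in range.
  if (v.kind == VNumKind::Constant && fitsImm0_15(v.imm))
    return {Form::ImmOnly, v.imm};
  // Case 2: %vnum is (opaque + imm) with imm in range.
  if (v.kind == VNumKind::AddOpaqueImm && fitsImm0_15(v.imm))
    return {Form::BasePlusImm, v.imm};
  // Case 3: anything else is treated as opaque (assumption for out-of-range values).
  return {Form::BaseOnly, 0};
}
```

In the unrolled loop above, `i+1`, `i+2` and `i+3` would each fall into the second case with immediates 1, 2 and 3, while plain `i` falls into the third. Any addition to `$ptr` still needs the scaling mentioned in the note above; the sketch does not model that.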
https://github.com/llvm/llvm-project/pull/68565