[flang-commits] [llvm] [lldb] [mlir] [openmp] [flang] [mlir][Vector] Add patterns for efficient i4 -> i8 conversion emulation (PR #79494)
Benjamin Maxwell via flang-commits
flang-commits at lists.llvm.org
Fri Jan 26 07:31:58 PST 2024
MacDue wrote:
> It gets difficult to get this working for scalable vectors at this level as we would have to introduce SVE or LLVM intrinsics to model the interleave in a scalable way.
There are already LLVM intrinsics for that, so I don't think it'd be hard to extend this to support SVE.
I wrote this little test, which seemed to build fine and generated reasonable-looking code:
```mlir
func.func @test_sve_i4_extend(%inMem: memref<?xi4>) -> vector<[8]xi32> {
  %c0 = arith.constant 0 : index
  %c4 = arith.constant 4 : i8
  %in = vector.load %inMem[%c0] : memref<?xi4>, vector<[8]xi4>
  %shift = vector.splat %c4 : vector<[4]xi8>
  %0 = vector.bitcast %in : vector<[8]xi4> to vector<[4]xi8>
  %1 = arith.shli %0, %shift : vector<[4]xi8>
  %2 = arith.shrsi %1, %shift : vector<[4]xi8>
  %3 = arith.shrsi %0, %shift : vector<[4]xi8>
  %4 = "llvm.intr.experimental.vector.interleave2"(%2, %3) : (vector<[4]xi8>, vector<[4]xi8>) -> vector<[8]xi8>
  %5 = arith.extsi %4 : vector<[8]xi8> to vector<[8]xi32>
  return %5 : vector<[8]xi32>
}
```
->
```
test_sve_i4_extend:
    ptrue   p0.s
    ld1sb   { z0.s }, p0/z, [x1]
    lsl     z1.s, z0.s, #28
    asr     z0.s, z0.s, #4
    asr     z1.s, z1.s, #28
    zip2    z2.s, z1.s, z0.s
    zip1    z0.s, z1.s, z0.s
    movprfx z1, z2
    sxtb    z1.s, p0/m, z2.s
    sxtb    z0.s, p0/m, z0.s
    ret
```
I think in the vector dialect, `"llvm.intr.experimental.vector.interleave2"` could nicely become `vector.scalable.interleave` :slightly_smiling_face:
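For anyone following along, here is a rough scalar model in Python of what the shift-and-interleave sequence in the test above computes. It is only a sketch and not part of the PR. The helper names `to_signed8` and `unpack_i4_pairs` are made up for illustration. It assumes the bitcast places the first i4 element of each pair in the low nibble of the byte.

```python
def to_signed8(value):
    """Interpret the low 8 bits of value as a two's-complement i8."""
    value &= 0xFF
    return value - 256 if value >= 128 else value


def unpack_i4_pairs(packed):
    """Sign-extend packed i4 pairs to i8 values (hypothetical helper).

    Each input byte is assumed to hold two i4 elements, with the first
    element in the low nibble (an assumption about the bitcast layout).
    """
    lows, highs = [], []
    for byte in packed:
        # shli by 4, then shrsi by 4: sign-extends the low nibble.
        shifted_left = to_signed8(byte << 4)
        lows.append(shifted_left >> 4)
        # shrsi by 4 on the original byte: sign-extends the high nibble.
        highs.append(to_signed8(byte) >> 4)

    # interleave2: alternate elements of the two half-length vectors.
    result = []
    for low, high in zip(lows, highs):
        result.extend([low, high])
    return result


print(unpack_i4_pairs([0x21, 0xF8, 0x7F]))
```

Python's `>>` on a negative `int` is an arithmetic shift, which is why it can stand in for `arith.shrsi` here once the byte has been reinterpreted as signed.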
https://github.com/llvm/llvm-project/pull/79494