[PATCH] D86819: [PowerPC][Power10] Implementation of 128-bit Binary Vector Rotate builtins
Nemanja Ivanovic via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 10 07:48:24 PDT 2020
nemanjai added inline comments.
================
Comment at: clang/lib/Headers/altivec.h:7743
+ return __builtin_altivec_vrlqnm(__a, ((__c << ShiftMask) |
+ (__b << ShiftRotation)));
+}
----------------
While correct, this implementation will require two constant pool loads (for the two shift amounts), two `vrlq` instructions to shift the two vectors, and finally an `xxlor` to OR them together. We should be able to do this with a single constant pool load and a single `vperm`.
Presumably the implementation would be something like:
```
  // Merge __b and __c using an appropriate shuffle.
  vector unsigned char TmpB = (vector unsigned char)__b;
  vector unsigned char TmpC = (vector unsigned char)__c;
  vector unsigned char MaskAndShift =
#ifdef __LITTLE_ENDIAN__
      __builtin_shufflevector(TmpB, TmpC, -1, -1, -1, -1, -1, -1, -1, -1, 16, 1,
                              0, -1, -1, -1, -1, -1);
#else
      __builtin_shufflevector(TmpB, TmpC, -1, -1, -1, -1, -1, 30, 31, 15, -1,
                              -1, -1, -1, -1, -1, -1, -1);
#endif
  return __builtin_altivec_vrlqnm(__a, MaskAndShift);
```
(but of course, double-check that the numbers are correct).
================
Comment at: clang/test/CodeGen/builtins-ppc-p10vector.c:996
+ // CHECK-COMMON-LABEL: @test_vec_rlnm_s128(
+ // CHECK-COMMON: call <1 x i128> @llvm.ppc.altivec.vrlqnm(<1 x i128>
+ // CHECK-COMMON-NEXT: ret <1 x i128>
----------------
Please show the shift in the test case as well.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D86819/new/
https://reviews.llvm.org/D86819