[PATCH] D141469: [AArch64][SVE] Add more intrinsics in 'isZeroingInactiveLanes'.

dewen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 16 23:48:32 PST 2023


dewen marked an inline comment as done.
dewen added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:305-311
+    case Intrinsic::aarch64_sve_orr_z:
+    case Intrinsic::aarch64_sve_orn_z:
+    case Intrinsic::aarch64_sve_and_z:
+    case Intrinsic::aarch64_sve_bic_z:
+    case Intrinsic::aarch64_sve_eor_z:
+    case Intrinsic::aarch64_sve_nor_z:
+    case Intrinsic::aarch64_sve_nand_z:
----------------
dewen wrote:
> paulwalker-arm wrote:
> > I don't believe the logical instructions zero inactive lanes either. You can see this by looking at the tests you've added.  The instructions are always "byte" based and so if the input general predicate contains garbage then so will the result. In this instance this is not a problem because from the ACLE's point of view we only support these intrinsics for `<vscale x 16 x i1>`[1] types and when used with such types they'll be no reinterprets to introduce the unnecessary `and` and so I think these entries should also be removed.
> > 
> > [1] I know the code generation works for other predicate types, but that's just a fluke based on some stale isel patterns (i.e. they'll never be generated via the clang builtins) that should probably be removed.
> I don't understand this part, but I think these logic instructions are going to put Inactive elements in the destination predicate register are set to zero. @paulwalker-arm
> 
> for example:
> {F26194894}
> {F26194900}
> {F26194902}
> {F26194903}
> {F26194906}
> {F26194911}
Thank you @paulwalker-arm. I understand my mistake now. I'll update the patch.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141469/new/

https://reviews.llvm.org/D141469



More information about the llvm-commits mailing list