[PATCH] D120462: [AArch64InstrInfo.td] - Lowering fix for cttz intrinsic
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Feb 27 10:34:13 PST 2022
craig.topper added a comment.
In D120462#3347920 <https://reviews.llvm.org/D120462#3347920>, @gsocshubham wrote:
> In D120462#3343650 <https://reviews.llvm.org/D120462#3343650>, @dmgreen wrote:
>
>>> TODO - Add unit test
>>
>> Yeah, this needs some tests to explain what you think it is improving. It seems to just add an extra instructions where it isn't needed? Was it intended to improve some larger code pattern?
>
> It was intended to generate assembly as of GCC -
>
> f(unsigned int):
> rbit w0, w0
> clz w0, w0
> and w0, w0, 31
> ret
>
> but patch present here - https://reviews.llvm.org/D113291 is generating below assembly -
>
> rbit w8, w0
> cmp w0, #0
> clz w8, w8
> csel w0, wzr, w8, eq
> ret
>
> Hence, to map with GCC assembly, I had modified patch to remove clz and csel instruction and generate below assembly -
>
> rbit w8, w0
> clz w8, w8
> and w0, w8, #0x1f
> ret
How does the csel get removed? There's no code here that detects that. To me it looks like it will now emit
rbit w8, w0
cmp w0, #0
clz w8, w8
and w0, w8, #0x1f
csel w0, wzr, w8, eq
ret
It also looks like this patch would break the codegen for
define i32 @foo(i32 %x) {
%a = call i32 @llvm.cttz.i32(i32 %x, i1 0)
ret i32 %a
}
As you'll now add an AND that shouldn't be there.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D120462/new/
https://reviews.llvm.org/D120462
More information about the llvm-commits
mailing list