[PATCH] D158548: Limit bswap and bitreverse matching to legal integer width
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 23 13:12:14 PDT 2023
craig.topper added a comment.
In D158548#4611411 <https://reviews.llvm.org/D158548#4611411>, @vmustya wrote:
> In D158548#4611202 <https://reviews.llvm.org/D158548#4611202>, @nikic wrote:
>
>> Can the expansion produce better code than it does now, or does some important information get lost that prevents this?
>>
>> Is the issue related to weird types like i9, or would this also be a problem for (say) i8?
>
> I've only seen the issue with the weird types like i9. I've added the minimal reproducer as a lit test case:
>
> define i32 @illegal_width(i32 %x) {
> %b0 = and i32 %x, 1
> %b1 = and i32 %x, 2
> %shift1 = mul nuw nsw i32 %b1, 64
> %shift0 = shl nuw nsw i32 %b0, 8
> %reverse = add i32 %shift0, %shift1
> ret i32 %reverse
> }
>
> For x86_64 LLVM generates the following assembly:
>
> rol di, 8
> mov eax, edi
> shl eax, 6
> and eax, 16384
> shl edi, 5
> and edi, 16384
> lea eax, [rdi + 2*rax]
> shr eax, 7
> ret
>
> With the patch applied, the assembly looks much better:
>
> mov eax, edi
> and eax, 2
> shl eax, 6
> and edi, 1
> shl edi, 8
> or eax, edi
> ret
The backend is expanding to i16 reverse sequence followed by a shift. Only 2 bits of that are demanded, but is suspect multiple uses prevents SimplifyDemandedBit from optimizing it.
There only 2 bits involved in the reverse here. Perhaps we shouldn't form bit reverse from something so sparse?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158548/new/
https://reviews.llvm.org/D158548
More information about the llvm-commits
mailing list