[llvm-bugs] [Bug 42746] New: Should CorrelatedValuePropagation pass reduce width of shifts?
via llvm-bugs
llvm-bugs at lists.llvm.org
Wed Jul 24 11:27:10 PDT 2019
https://bugs.llvm.org/show_bug.cgi?id=42746
Bug ID: 42746
Summary: Should CorrelatedValuePropagation pass reduce width of shifts?
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: Scalar Optimizations
Assignee: unassignedbugs at nondot.org
Reporter: lebedev.ri at gmail.com
CC: llvm-bugs at lists.llvm.org
I'm currently looking into re-fixing
https://bugs.llvm.org/show_bug.cgi?id=42399,
and I'm stuck on a pattern like:
define i1 @test(i64 %storage, i32 %nbits) {
  %skipnbits = sub nsw i32 64, %nbits
  %skipnbitswide = zext i32 %skipnbits to i64
  %datawide = lshr i64 %storage, %skipnbitswide
  %data = trunc i64 %datawide to i32
  %nbitsminusone = add nsw i32 %nbits, -1
  %bitmask = shl i32 1, %nbitsminusone
  %bitmasked = and i32 %bitmask, %data
  %isbitunset = icmp eq i32 %bitmasked, 0
  ret i1 %isbitunset
}
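
(For reference, a hypothetical C++ source that lowers to roughly this IR; the function name and the [1, 32] restriction on %nbits are my own, not taken from PR42399:)

#include <cstdint>

// Hypothetical source-level equivalent of the IR above: test whether
// the highest bit of the nbits-wide field taken from the top of
// storage is clear. Only meaningful for nbits in [1, 32]; for larger
// nbits the i32 shl in the IR is poison anyway.
bool isBitUnset(uint64_t storage, int nbits) {
  uint32_t data = (uint32_t)(storage >> (64 - nbits)); // %data
  uint32_t bitmask = 1u << (nbits - 1);                // %bitmask
  return (bitmask & data) == 0;                        // %isbitunset
}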
The desired optimized result is:
define i1 @test(i64 %storage, i32 %nbits) {
  %tmp = icmp sgt i64 %storage, -1
  ret i1 %tmp
}
That transform is indeed correct: the tested bit, bit nbits-1 of %data, is bit
(64-nbits)+(nbits-1) = 63 of %storage, i.e. its sign bit:
Name: PR42399
%skipnbits = sub nsw i32 64, %nbits
%skipnbitswide = zext i32 %skipnbits to i64
%datawide = lshr i64 %storage, %skipnbitswide
%data = trunc i64 %datawide to i32
%nbitsminusone = add nsw i32 %nbits, -1
%bitmask = shl i32 1, %nbitsminusone
%bitmasked = and i32 %bitmask, %data
%isbitunset = icmp eq i32 %bitmasked, 0
=>
%isbitunset = icmp sgt i64 %storage, -1
https://rise4fun.com/Alive/hUu
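
(The Alive link above is the authoritative proof; as a quick sanity check, here is a small brute-force C++ harness — names are mine — comparing both forms for every nbits in [1, 32] over random inputs:)

#include <cstdint>
#include <cstdio>
#include <random>

// The original IR sequence, for nbits in [1, 32].
static bool original(uint64_t storage, uint32_t nbits) {
  uint32_t data = (uint32_t)(storage >> (64 - nbits));
  uint32_t bitmask = 1u << (nbits - 1);
  return (bitmask & data) == 0;
}

// The optimized form: icmp sgt i64 %storage, -1.
static bool optimized(uint64_t storage) {
  return (int64_t)storage > -1;
}

int main() {
  std::mt19937_64 rng(0);
  for (uint32_t nbits = 1; nbits <= 32; ++nbits)
    for (int i = 0; i != 1 << 16; ++i) {
      uint64_t storage = rng();
      if (original(storage, nbits) != optimized(storage))
        return std::printf("mismatch at nbits=%u\n", nbits), 1;
    }
  std::puts("no mismatches");
}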
The problem is the truncations around the shifts.
The current legality check I've come up with is:
https://rise4fun.com/Alive/M5vF
Name: one truncation 0 - the original widest input should be losslessly truncatable, or the other input should be '1'
Pre: C1+C2 u< 64 && ((countLeadingZeros(C11) u>= (64-32)) || (countLeadingZeros(C22) u>= (32-1)))
%C1_64 = zext i8 C1 to i64
%C2_32 = zext i8 C2 to i32
%old_shift_of_x = lshr i64 C11, %C1_64
%old_shift_of_y = shl i32 C22, %C2_32
%old_trunc_of_shift_of_x = trunc i64 %old_shift_of_x to i32
%old_masked = and i32 %old_trunc_of_shift_of_x, %old_shift_of_y
%r = icmp ne i32 %old_masked, 0
=>
%C1_64 = zext i8 C1 to i64
%C2_64 = zext i8 C2 to i64
%new_shamt = add i64 %C1_64, %C2_64
%new_y_wide = zext i32 C22 to i64
%new_shift = shl i64 %new_y_wide, %new_shamt
%new_masked = and i64 %new_shift, C11
%r = icmp ne i64 %new_masked, 0
I.e., the transform can be done if the truncation could have been threaded
over the shift in the first place.
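
(In C++ terms, the precondition amounts to something like the following sketch; legalToWiden is my own name, and std::countl_zero is C++20:)

#include <bit>
#include <cstdint>

// Sketch of the Alive precondition above: the combined shift amount
// must stay below the wide bit width, and either the wide input C11
// truncates to i32 losslessly (>= 32 leading zeros), or the narrow
// input C22 is 0 or 1, i.e. at most a single low bit (>= 31 leading
// zeros in i32).
static bool legalToWiden(uint64_t C11, uint32_t C22, uint8_t C1, uint8_t C2) {
  return unsigned(C1) + unsigned(C2) < 64 &&
         (std::countl_zero(C11) >= 64 - 32 ||
          std::countl_zero(C22) >= 32 - 1);
}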
But we don't seem to do that currently, and we don't do the opposite transform:
https://godbolt.org/z/qYlMJP
In CorrelatedValuePropagation.cpp I only see reduction of udiv/urem width.
Should it be taught to also reduce width of shifts?
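
(For comparison, a minimal C++ model of what such a shift narrowing would do; the preconditions in the comment are ranges the pass would have to prove, and the function name is mine:)

#include <cstdint>

// Model of the requested transform, analogous to the existing
// udiv/urem narrowing: given that CVP can prove x u< 2^32 and
// amt u< 32, an i64 lshr can be done as an i32 lshr plus zext.
static uint64_t lshrNarrowed(uint64_t x, uint64_t amt) {
  uint32_t narrow = (uint32_t)x >> (uint32_t)amt; // lshr i32
  return narrow;                                  // zext i32 to i64
}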