[llvm-dev] Extend TruncInstCombine class in Aggressive Instruction Combine pass to handle users in a different truncation DAG
Renlin Li via llvm-dev
llvm-dev at lists.llvm.org
Tue Aug 14 07:54:36 PDT 2018
Hello there,
Multiple users inside the same DAG are already handled by the new pass,
as implemented in the TruncInstCombine class.
But I wonder if it is possible to extend TruncInstCombine to handle cases
where the DAG ending nodes (ZExt/SExt) are also used in a different
truncate DAG.
For example, a simple case like this:
define void @test(i8* noalias nocapture readonly %src0_ptr,
                  i8* noalias nocapture readonly %src1_ptr,
                  i16* noalias nocapture %dst_ptr) {
  %1 = load i8, i8* %src1_ptr, align 1
  %2 = load i8, i8* %src0_ptr, align 1
  %3 = zext i8 %2 to i32
  %4 = zext i8 %1 to i32
  %5 = mul nuw nsw i32 %3, %4
  %6 = trunc i32 %5 to i16
  %7 = getelementptr inbounds i8, i8* %src0_ptr, i64 1
  %8 = load i8, i8* %7, align 1
  %9 = zext i8 %8 to i32
  %10 = mul nuw nsw i32 %9, %4
  %11 = trunc i32 %10 to i16
  store i16 %6, i16* %dst_ptr, align 2
  %12 = getelementptr inbounds i16, i16* %dst_ptr, i64 1
  store i16 %11, i16* %12, align 2
  ret void
}
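Here %4 is used by both multiplications, so it is shared between the
DAG rooted at %6 and the one rooted at %11.
There are more complicated cases where there is a chain of such dependencies.
I have an initial idea for doing this; it requires only minimal changes to
the current TruncInstCombine class.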
1, Record each TruncInst that could be shrunk but is blocked by
multiple-use values.
In the end, a list of such pairs (a TruncInst and the values blocking it)
is collected.
2, If the list is not empty, try to merge TruncInst DAGs to reduce the
number of values used outside of the DAG.
2,1 If all users of the value are inside a shrinkable DAG, merge the
DAGs to eliminate this dependency. This updates the dependencies of
the new DAG; keep doing it until no value has users outside of
the DAG. The new, bigger DAG, which contains multiple TruncInsts, can then
be shrunk as a whole. Apply the transformation, then go to step 3.
2,2 Otherwise, go to step 3.
3, Remove the related TruncInsts from the list, then go to step 2.
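To make the merging step more concrete, here is a minimal standalone C++ sketch. It does not use the LLVM API: values are plain integer ids, and TruncDag, UserMap, mergeTruncDags and runExample are hypothetical names made up for illustration. It assumes each candidate DAG has already been collected as a set of values, and uses a union-find to group DAGs that are connected through a shared value. A group is treated as shrinkable only if no value in it has a user outside every candidate DAG.

```cpp
#include <cassert>
#include <map>
#include <numeric>
#include <set>
#include <vector>

// Hypothetical stand-in for llvm::Value*: each IR value is an integer id.
using ValueId = int;

// One candidate truncate DAG: the trunc root plus every value that would be
// re-evaluated in the narrow type (ZExt/SExt ending nodes included).
struct TruncDag {
  ValueId Root;
  std::set<ValueId> Nodes;
};

// Users[V] lists the values that use V (a toy replacement for V->users()).
using UserMap = std::map<ValueId, std::vector<ValueId>>;

// Union-find lookup with path halving.
static int findGroup(std::vector<int> &Parent, int I) {
  assert(I >= 0 && I < static_cast<int>(Parent.size()));
  while (Parent[I] != I) {
    Parent[I] = Parent[Parent[I]];
    I = Parent[I];
  }
  return I;
}

// Returns, for every DAG, the id of the merged group it belongs to, or -1 if
// that group still has a value used outside of all candidate DAGs.
std::vector<int> mergeTruncDags(const std::vector<TruncDag> &Dags,
                                const UserMap &Users) {
  const int N = static_cast<int>(Dags.size());

  // Owners[V] = indices of the candidate DAGs that contain V.
  std::map<ValueId, std::vector<int>> Owners;
  for (int I = 0; I < N; ++I)
    for (ValueId V : Dags[I].Nodes)
      Owners[V].push_back(I);

  std::vector<int> Parent(N);
  std::iota(Parent.begin(), Parent.end(), 0);
  std::vector<bool> Escapes(N, false);

  for (int I = 0; I < N; ++I) {
    for (ValueId V : Dags[I].Nodes) {
      if (V == Dags[I].Root)
        continue; // the trunc result is already narrow; its users do not matter
      auto UIt = Users.find(V);
      if (UIt == Users.end())
        continue;
      for (ValueId U : UIt->second) {
        auto OIt = Owners.find(U);
        if (OIt == Owners.end()) {
          Escapes[I] = true; // user outside every candidate DAG
          continue;
        }
        // The user lives in some candidate DAG: merge that DAG with this one.
        for (int J : OIt->second)
          Parent[findGroup(Parent, J)] = findGroup(Parent, I);
      }
    }
  }

  // A merged group is shrinkable only if none of its member DAGs escapes.
  std::vector<bool> GroupEscapes(N, false);
  for (int I = 0; I < N; ++I)
    if (Escapes[I])
      GroupEscapes[findGroup(Parent, I)] = true;

  std::vector<int> Result(N);
  for (int I = 0; I < N; ++I) {
    int G = findGroup(Parent, I);
    Result[I] = GroupEscapes[G] ? -1 : G;
  }
  return Result;
}

// The two DAGs from the example, using the IR value numbers as ids.
std::vector<int> runExample() {
  std::vector<TruncDag> Dags = {
      {6, {3, 4, 5, 6}},    // %6  = trunc(mul(%3, %4))
      {11, {4, 9, 10, 11}}, // %11 = trunc(mul(%9, %4))
  };
  UserMap Users = {{3, {5}}, {4, {5, 10}}, {5, {6}}, {9, {10}}, {10, {11}}};
  return mergeTruncDags(Dags, Users);
}
```

For the example, both DAGs end up in the same group, because every user of %4 lies inside one of the two DAGs. If %4 had one more user outside both DAGs, the whole group would be reported as -1, which corresponds to the "Otherwise" branch of step 2.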
A cost function could probably be added to balance the number of
shrinkable truncate instructions against the number of copies that need
to be made.
By the way, an extension might be free on some architectures if it can be
combined with its users.
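As a rough illustration of the kind of cost check meant here, a sketch in standalone C++ follows. MergeCost and isMergeProfitable are hypothetical names, and the rule itself (shrink only when more truncates become shrinkable than non-free copies are added) is just an assumed starting point, not a tuned heuristic. How a target would report that an extension is free is left open.

```cpp
#include <cassert>

// Hypothetical summary of what merging a group of truncate DAGs would cost.
struct MergeCost {
  unsigned TruncsShrunk; // truncates that become shrinkable after the merge
  unsigned CopiesNeeded; // values that must be duplicated for remaining users
  unsigned FreeCopies;   // of those copies, extensions the target folds for free
};

// Toy rule: profitable when the truncates removed outnumber the copies that
// actually cost something on the target.
inline bool isMergeProfitable(const MergeCost &C) {
  assert(C.FreeCopies <= C.CopiesNeeded && "free copies are a subset of copies");
  return C.TruncsShrunk > C.CopiesNeeded - C.FreeCopies;
}
```

With this rule, a merge that shrinks two truncates at the price of one real copy would be accepted, while one truncate against one real copy would not.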
I don't feel very comfortable with my approach. Are there any
suggestions on how this could be done better?
Thanks,
Renlin