[llvm-dev] IPRA and conditionally reserved registers
Jonas Paulsson via llvm-dev
llvm-dev at lists.llvm.org
Tue May 1 06:15:23 PDT 2018
Hi Kit,
I see you have been working on IPRA (https://reviews.llvm.org/D45308),
and would therefore like to bring up an issue with it I am looking into
on SystemZ (see https://reviews.llvm.org/D46232).
I first realized that %r14, the return address register, must be saved
and restored with IPRA enabled, since otherwise the function can't
return. It is a callee-saved register, so without IPRA it always gets
saved, but if that save is omitted and the function has no calls itself,
a second check is needed to add it back under IPRA.
A related issue (the topic of this mail) is the frame pointer (%r11).
If the caller uses an FP, %r11 becomes reserved and is expected never to
be allocated. But if the callee does not have an FP, it is free to
allocate %r11. The Collector / Propagate passes then transform the
regmask on the call to express that %r11 is clobbered, but the register
allocator ignores %r11 in the caller, since it is reserved there. This
currently seems to be unhandled, and this is what I would like to ask
about.
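As a rough illustration of the situation (a toy Python model, not LLVM
code; the register subset and the helper names are made up for the
example, and a "regmask" is modelled as the set of registers a call
preserves):

```python
# Toy model, not LLVM code. A "regmask" here is the set of registers that a
# call is known to preserve (in LLVM a set bit in a regmask likewise means
# "preserved"). The register subset is illustrative only.
ALL_REGS = {"r6", "r7", "r11", "r12", "r13", "r14", "r15"}


def ipra_regmask(callee_clobbered):
    """Preserved set derived from what the callee actually clobbers."""
    return ALL_REGS - set(callee_clobbered)


def allocator_tracked_clobbers(regmask, caller_reserved):
    """Clobbers the caller's allocator acts on.

    Reserved registers are never allocated, so a clobber of a reserved
    register is simply not considered.
    """
    return (ALL_REGS - regmask) - set(caller_reserved)


# The callee has no frame pointer and happened to allocate r11.
mask = ipra_regmask({"r11"})

# The caller uses r11 as its frame pointer, so r11 is reserved there.
tracked = allocator_tracked_clobbers(mask, caller_reserved={"r11", "r15"})

# Clobbers expressed by the regmask that nobody in the caller handles.
unhandled = (ALL_REGS - mask) - tracked
print(sorted(unhandled))
```

In this model the regmask correctly says that r11 is clobbered, yet the
clobber falls through because r11 is reserved in the caller.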
My first idea was to let the callee always save/restore %r11, since it
may be reserved in some caller. As Uli pointed out, that is very
conservative, and it also seems to me to disagree with IPRA, where the
save/restore is generally done by the caller as much as possible.
So the question is: how should this be handled in the caller?
I would like to see RegUsageInfoPropagate compare the unmodified
regmask with the updated one. Any register that is reserved in the
current function and is clobbered by the call only as a result of IPRA
(i.e. per the updated regmask) should then be copied to and from a
virtual register around that call. This is not being done at the
moment. Am I missing something here?
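To make the proposed comparison concrete, here is a minimal sketch in
Python (again a toy model and not the RegUsageInfoPropagate code; the
bit numbering and the function names are hypothetical). As in LLVM, a
set bit in a regmask means the register is preserved by the call:

```python
# Toy model, not LLVM code. Register-to-bit numbering is hypothetical.
REG_BIT = {
    "r11": 1 << 11,
    "r12": 1 << 12,
    "r13": 1 << 13,
    "r14": 1 << 14,
    "r15": 1 << 15,
}


def mask_of(regs):
    """Build a bitmask with one bit set per named register."""
    m = 0
    for r in regs:
        m |= REG_BIT[r]
    return m


def reserved_regs_needing_save(abi_mask, ipra_mask, reserved_mask):
    """Reserved registers that the call clobbers only because of IPRA.

    abi_mask / ipra_mask: preserved-register masks before / after the update.
    reserved_mask: registers reserved in the calling function.
    """
    # Preserved under the original regmask, no longer preserved afterwards.
    newly_clobbered = abi_mask & ~ipra_mask
    hit = newly_clobbered & reserved_mask
    return sorted(name for name, bit in REG_BIT.items() if hit & bit)


abi = mask_of(["r11", "r12", "r13", "r14", "r15"])  # original regmask
ipra = mask_of(["r12", "r13", "r14", "r15"])        # callee now clobbers r11
reserved = mask_of(["r11", "r15"])                  # caller's FP and SP

print(reserved_regs_needing_save(abi, ipra, reserved))
```

The registers returned are the ones the caller would have to copy to
and from a virtual register around the call under this proposal.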
So, in short, should these registers be saved/restored in the caller,
and if so, how should this be done?
/Jonas
Attached is a test case where this happens on SystemZ.
bin/llc -mcpu=z13 -enable-ipra ./tc_ipra_fp.ll -o out.s
-------------- next part --------------
%0 = type { [3 x i64] }
; Function Attrs: norecurse nounwind
declare dso_local fastcc signext i32 @foo(i16*, i32 signext) unnamed_addr
; Function Attrs: norecurse nounwind
define internal fastcc void @fun1(i16*, i16* nocapture) unnamed_addr #0 {
%3 = load i16, i16* undef, align 2
%4 = shl i16 %3, 4
%5 = tail call fastcc signext i32 @foo(i16* nonnull %0, i32 signext 5)
%6 = or i16 0, %4
%7 = or i16 %6, 0
store i16 %7, i16* undef, align 2
%8 = getelementptr inbounds i16, i16* %0, i64 5
%9 = load i16, i16* %8, align 2
store i16 %9, i16* %1, align 2
ret void
}
; Function Attrs: nounwind
define fastcc void @fun0(i8* nocapture readonly, i16* nocapture, i32 signext) unnamed_addr {
%4 = alloca i8, i64 undef, align 8
call fastcc void @fun1(i16* nonnull undef, i16* %1)
ret void
}
attributes #0 = { norecurse nounwind "no-frame-pointer-elim"="false" }